Sparse Overcomplete Word Vector Representations

Chris Dyer; Dani Yogatama; Manaal Faruqui; Noah Smith; Yulia Tsvetkov

arxiv: 1506.02004 · v1 · pith:2H2TU4BLnew · submitted 2015-06-05 · 💻 cs.CL

Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, Noah Smith

classification 💻 cs.CL

keywords vectors · representations · sparse · they · word · automatically · because · benchmark

Current distributed representations of words show little resemblance to theories of lexical semantics. The former are dense and uninterpretable, the latter largely based on familiar, discrete classes (e.g., supersenses) and relations (e.g., synonymy and hypernymy). We propose methods that transform word vectors into sparse (and optionally binary) vectors. The resulting representations are more similar to the interpretable features typically used in NLP, though they are discovered automatically from raw corpora. Because the vectors are highly sparse, they are computationally easy to work with. Most importantly, we find that they outperform the original vectors on benchmark tasks.

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
cs.AI · 2026-05 · unverdicted · novelty 6.0

Sparse autoencoders scaled to 34 million features on Claude 3 Sonnet yield interpretable, steerable representations of concrete and abstract concepts that generalize across languages and modalities.