Sparse autoencoders applied to frozen dense retrievers extract Zipfian latent vocabularies that support BM25 scoring and match or exceed the base model's performance on some tasks.
Ben He and Iadh Ounis
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.IR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Latent Terms: Dense Retrievers Contain Trivially Extractable BM25-ready Zipfian Vocabularies