SparTerm: Learning Term-based Sparse Representation for Fast Text Retrieval

Chaoliang Zhang; Fangshan Wang; Gang Wang; Jun Xu; Lifeng Shang; Qun Liu; Xiaoguang Li; Yang Bai; Zhaowei Wang

arXiv: 2010.00768 · v1 · pith:GWX4O7GZnew · submitted 2020-10-02 · cs.IR

Yang Bai, Xiaoguang Li, Gang Wang, Chaoliang Zhang, Lifeng Shang, Jun Xu, Zhaowei Wang, Fangshan Wang, Qun Liu

keywords: sparse, SparTerm, text, representation, representations, term, term-based, framework

Term-based sparse representations dominate first-stage text retrieval in industrial applications, owing to their advantages in efficiency, interpretability, and exact term matching. In this paper, we study the problem of transferring the deep knowledge of a pre-trained language model (PLM) to term-based sparse representations, aiming to improve the representation capacity of bag-of-words (BoW) methods for semantic-level matching while keeping their advantages. Specifically, we propose a novel framework, SparTerm, that directly learns sparse text representations in the full vocabulary space. SparTerm comprises an importance predictor, which predicts the importance of each term in the vocabulary, and a gating controller, which controls term activation. Together, these two modules ensure the sparsity and flexibility of the final text representation and unify term weighting and expansion in a single framework. Evaluated on the MS MARCO dataset, SparTerm significantly outperforms traditional sparse methods and achieves state-of-the-art ranking performance among all PLM-based sparse models.
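
The abstract does not give formulas, so the following is only a rough sketch of how an importance predictor and a gating controller could be combined into one sparse vector. It assumes that the gate is a per-term probability thresholded at 0.5 and that the final representation keeps the importance of each activated term; the function names, the threshold, the toy vocabulary, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Toy vocabulary; the index of each word is its term id (illustrative only).
VOCAB = ["cheap", "flight", "airfare", "paris", "banana"]


def sparse_representation(
    importance: List[float],
    gate_prob: List[float],
    threshold: float = 0.5,  # assumed cut-off, not from the paper
) -> Dict[int, float]:
    """Keep the importance of every term whose gate is open.

    importance: one non-negative score per vocabulary term
                (stand-in for the importance predictor's output).
    gate_prob:  one activation probability per vocabulary term
                (stand-in for the gating controller's output).
    Returns a {term_id: weight} mapping holding only activated terms.
    """
    if len(importance) != len(gate_prob):
        raise ValueError("importance and gate_prob must have the same length")
    sparse: Dict[int, float] = {}
    for term_id, (weight, prob) in enumerate(zip(importance, gate_prob)):
        if prob > threshold and weight > 0.0:
            sparse[term_id] = weight
    return sparse


def dot_score(query: Dict[int, float], passage: Dict[int, float]) -> float:
    """Inner product of two sparse vectors, iterating over the smaller one."""
    if len(query) > len(passage):
        query, passage = passage, query
    return sum(w * passage[t] for t, w in query.items() if t in passage)


# Made-up scores for a passage about flights to Paris. "airfare" stands for a
# term activated by expansion, "banana" for a term the gate switches off.
passage_importance = [0.0, 1.2, 0.8, 1.5, 0.1]
passage_gate = [0.2, 0.9, 0.7, 0.95, 0.05]
passage_vec = sparse_representation(passage_importance, passage_gate)

# A query containing "airfare" and "paris", each with unit weight.
query_vec = {VOCAB.index("airfare"): 1.0, VOCAB.index("paris"): 1.0}
score = dot_score(query_vec, passage_vec)
```

In this sketch only three of the five vocabulary entries survive the gate, so the passage can be stored in an inverted index like any bag-of-words vector, while the expanded term still lets the query match.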

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Understanding Wacky Weights: A Dissection of SPLADE's Learned Term Importance
cs.IR 2026-05 conditional novelty 6.0

SPLADE models produce wacky expansion terms whose prevalence rises with larger vocabularies and falls with stricter sparsity; these terms primarily aid in-domain retrieval rather than out-of-domain generalization.

The Role of Vocabularies in Learning Sparse Representations for Ranking
cs.IR 2025-09 unverdicted novelty 5.0

Larger 100K vocabularies in SPLADE models, especially those initialized with ESPLADE pretraining, improve retrieval effectiveness after pruning compared to 32K baselines while keeping similar efficiency.