STORM trains lexical query rewriters via reward-guided beam search that converts retrieval metrics into stepwise token signals, enabling 0.6B-8B models to rival dense retrievers on TREC, BEIR and MIRACL without index changes.
Bruce Croft
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2representative citing papers
citing papers explorer
-
STORM: Stepwise Token Optimization with Reward-Guided Beam Search
STORM trains lexical query rewriters via reward-guided beam search that converts retrieval metrics into stepwise token signals, enabling 0.6B-8B models to rival dense retrievers on TREC, BEIR and MIRACL without index changes.
- Test-Time Compute for Frozen Embedding Models through Agentic Program Search