Learning to Reformulate the Queries on the WEB
Pith reviewed 2026-05-25 11:02 UTC · model grok-4.3
The pith
An unsupervised neural encoder-decoder model trained on anchor phrases generates query reformulations that improve retrieval performance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that an end-to-end unsupervised model consisting of a character-level convolutional neural network encoder with max-pooling and an attention-based recurrent neural network decoder, trained on anchor phrases, can produce effective reformulations of user queries that improve retrieval performance on test collections.
What carries the argument
The end-to-end encoder-decoder architecture with a character-level CNN encoder and attention-based RNN decoder, trained directly on anchor phrases as the unsupervised signal.
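To make the encoder side concrete, here is a minimal pure-Python sketch of a character-level convolutional encoder with max-pooling. It is an illustration only, not the authors' implementation. The alphabet, embedding size, filter count, filter width, pooling stride, and random weights are all assumptions. So is the choice of strided (rather than global) max-pooling, which is used here so that a later attention step has several positions to attend over.

```python
import random

def char_ids(query, vocab):
    """Map each character to an integer id; unknown characters share id 0."""
    return [vocab.get(ch, 0) for ch in query]

def conv1d_relu(embedded, filters, width):
    """Valid 1D convolution over time followed by ReLU.

    embedded: list of T vectors, each of length d
    filters:  list of F filters, each a flat list of width * d weights
    returns:  list of (T - width + 1) vectors of length F
              (empty if the query is shorter than the filter width)
    """
    out = []
    for t in range(len(embedded) - width + 1):
        window = [x for vec in embedded[t:t + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window)))
                    for f in filters])
    return out

def max_pool(features, stride):
    """Non-overlapping max-pooling over time; the last chunk may be shorter."""
    pooled = []
    for start in range(0, len(features), stride):
        chunk = features[start:start + stride]
        pooled.append([max(col) for col in zip(*chunk)])
    return pooled

def encode(query, vocab, emb, filters, width=3, stride=2):
    """Characters -> embeddings -> convolution -> pooled segment vectors."""
    embedded = [emb[i] for i in char_ids(query, vocab)]
    return max_pool(conv1d_relu(embedded, filters, width), stride)

# Hypothetical toy configuration; none of these values come from the paper.
rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz "
vocab = {ch: i + 1 for i, ch in enumerate(alphabet)}
EMB_DIM, N_FILTERS, WIDTH = 4, 5, 3
emb = [[rng.uniform(-1, 1) for _ in range(EMB_DIM)]
       for _ in range(len(vocab) + 1)]
filters = [[rng.uniform(-1, 1) for _ in range(WIDTH * EMB_DIM)]
           for _ in range(N_FILTERS)]

segments = encode("cheap flights", vocab, emb, filters)
print(len(segments), len(segments[0]))
```

The 13-character query yields 11 convolution positions, which a stride of 2 pools into 6 segment vectors of 5 features each. No word-level tokenization is involved at any point.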
If this is right
- Reformulated queries improve retrieval performance on test collections.
- The model trains successfully without labeled data or extra supervision.
- Anchor phrases provide an effective signal for learning general query reformulations via sequence generation.
- Character-level processing in the encoder supports varied query inputs without word-level preprocessing.
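The architecture named above pairs the character-level encoder with an attention-based decoder that generates the reformulation. The sketch below shows what a single attention step might compute. The summary does not say which scoring function the paper uses, so dot-product scoring is swapped in here as an assumption for brevity. The vectors are toy values and `attend` is a hypothetical helper name.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """One attention step: weights over encoder positions and a context vector.

    Scoring is a plain dot product (an assumption, not the paper's stated choice).
    """
    scores = [sum(h * s for h, s in zip(enc, decoder_state))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * enc[d] for w, enc in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context

# Toy encoder outputs (three positions) and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

The first and third encoder positions score equally against this decoder state and so receive equal weight. The second position receives less. The context vector is the weighted average that the decoder would consume before emitting its next character.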
Where Pith is reading between the lines
- The same training approach could be tested on other large text collections that contain anchor-like links.
- Generated reformulations might transfer to related tasks such as query suggestion or expansion in retrieval systems.
- Performance gains could compound if the model is fine-tuned on domain-specific anchor data.
Load-bearing premise
Anchor phrases from a large web corpus constitute a suitable unsupervised training signal for learning effective query reformulations without additional supervision or validation of signal quality.
What would settle it
If the generated reformulations produce no improvement in retrieval metrics such as mean average precision on held-out test collections compared with the original queries, the central claim would be falsified.
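The comparison described above can be sketched as a mean average precision (MAP) computation over the same relevance judgments for two runs, one from the original queries and one from the reformulated queries. The document ids, rankings, and judgments below are made-up toy data, not results from the paper. The function names are illustrative.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average precision for one query; assumes no duplicate ids in the ranking."""
    if not relevant_ids:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant_ids)

def mean_average_precision(run, qrels):
    """Mean of per-query AP; queries missing from the run score 0."""
    aps = [average_precision(run.get(q, []), rel) for q, rel in qrels.items()]
    return sum(aps) / len(aps)

# Toy judgments and rankings (hypothetical).
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
original_run = {"q1": ["d2", "d1", "d3"], "q2": ["d1", "d3", "d2"]}
reformulated_run = {"q1": ["d1", "d3", "d2"], "q2": ["d2", "d1", "d3"]}

map_original = mean_average_precision(original_run, qrels)
map_reformulated = mean_average_precision(reformulated_run, qrels)
print(map_original, map_reformulated)
```

On this toy data the original run scores 11/24 and the reformulated run scores 1.0. If the second number were not higher than the first on a real held-out collection, that would count against the central claim.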
Original abstract
Inability of the naive users to formulate appropriate queries is a fundamental problem in web search engines. Therefore, assisting users to issue more effective queries is an important way to improve users' happiness. One effective approach is query reformulation, which generates new effective queries according to the current query issued by users. Previous researches typically generate words and phrases related to the original query. Since the definition of query reformulation is quite general, it is completely difficult to develop a uniform term-based approach for this problem. This paper uses readily available data, particularly over one billion anchor phrases in Clueweb09 corpus, in order to learn an end-to-end encoder-decoder model to automatically generate effective queries. Following successful researches in the field of sequence to sequence models, we employ a character-level convolutional neural network with max-pooling at encoder and an attention-based recurrent neural network at decoder. The whole model learned in an unsupervised end-to-end manner. Experiments on TREC collections show that the reformulated queries automatically generated by the proposed solution can significantly improve the retrieval performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an unsupervised end-to-end sequence-to-sequence model for query reformulation that uses a character-level CNN encoder with max-pooling and an attention-based RNN decoder. The model is trained on over one billion anchor phrases extracted from the Clueweb09 corpus and is claimed to generate reformulations that significantly improve retrieval performance when evaluated on TREC collections.
Significance. If the results hold after proper validation, the work would provide a scalable unsupervised method for query reformulation that leverages readily available web anchor data rather than curated supervision, addressing a core IR challenge with potential for broad applicability.
major comments (2)
- [Abstract / Training section] Abstract and training data description: The model is trained directly on anchor phrases from Clueweb09 as targets with no validation, analysis, or auxiliary experiments demonstrating that these phrases constitute a suitable proxy for effective query reformulations (e.g., no measurement of correlation with user intent, noise levels, or downstream retrieval gains). This is load-bearing for the central claim of significant TREC improvements.
- [Experiments] Experiments section: The abstract asserts that reformulated queries 'significantly improve the retrieval performance' on TREC collections, yet no details are supplied on metrics (MAP, NDCG, etc.), baselines (original queries or prior reformulation methods), query counts, statistical significance tests, or error analysis. This prevents verification of whether the data support the claim.
minor comments (1)
- [Abstract] Abstract: The sentence 'The whole model learned in an unsupervised end-to-end manner.' is grammatically incomplete.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.
- Referee: [Abstract / Training section] Abstract and training data description: The model is trained directly on anchor phrases from Clueweb09 as targets with no validation, analysis, or auxiliary experiments demonstrating that these phrases constitute a suitable proxy for effective query reformulations (e.g., no measurement of correlation with user intent, noise levels, or downstream retrieval gains). This is load-bearing for the central claim of significant TREC improvements.
  Authors: We agree that additional validation of anchor phrases as proxies would strengthen the central claim. The manuscript relies on end-to-end TREC gains as implicit evidence, but we will add a new paragraph in the training data section with auxiliary analysis (e.g., overlap statistics with known reformulation patterns from prior IR literature and basic noise characterization of the Clueweb09 anchors) to directly address suitability.
  Revision: yes
- Referee: [Experiments] Experiments section: The abstract asserts that reformulated queries 'significantly improve the retrieval performance' on TREC collections, yet no details are supplied on metrics (MAP, NDCG, etc.), baselines (original queries or prior reformulation methods), query counts, statistical significance tests, or error analysis. This prevents verification of whether the data support the claim.
  Authors: We acknowledge the abstract is underspecified. In revision we will expand the abstract to explicitly state the evaluation metric (MAP), the baselines (original queries plus prior reformulation methods), and the query counts from the TREC collections, and to refer to the statistical significance testing. We will also add a dedicated error analysis subsection to the experiments section.
  Revision: yes
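As an illustration of the kind of significance testing discussed in the exchange above, the sketch below runs an exact paired sign-flip randomization test on per-query average precision. Neither the report nor the rebuttal names a specific test, so this choice is an assumption. The per-query scores are made-up toy values, not results from the paper.

```python
from itertools import product

def paired_sign_flip_test(baseline, treatment):
    """Exact two-sided paired randomization test on per-query score differences.

    Enumerates all 2^n sign assignments, so it is only practical for small n;
    with many queries one would sample sign assignments at random instead.
    """
    if len(baseline) != len(treatment):
        raise ValueError("score lists must be paired per query")
    diffs = [t - b for b, t in zip(baseline, treatment)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total

# Hypothetical per-query AP for six queries.
baseline_ap = [0.20, 0.35, 0.10, 0.50, 0.25, 0.40]
reformulated_ap = [0.30, 0.45, 0.20, 0.55, 0.35, 0.50]

p_value = paired_sign_flip_test(baseline_ap, reformulated_ap)
print(p_value)
```

Every toy query improves, so only the all-positive and all-negative sign assignments are as extreme as the observed difference, giving p = 2/64. Reporting a per-query paired test like this, together with the query count, is what would let a reader verify a claim of "significant" improvement.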
Circularity Check
No significant circularity; training and evaluation use disjoint external corpora
The paper extracts >1B anchor phrases from Clueweb09 as unsupervised training targets for a char-CNN encoder + attention RNN decoder, then evaluates generated reformulations on separate TREC collections. No equations, self-citations, or fitted parameters reduce the reported retrieval gains to the training signal by construction. The model is learned end-to-end on external data and tested against independent benchmarks, satisfying the self-contained criterion.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Anchor phrases from Clueweb09 provide a suitable unsupervised signal for learning effective query reformulations.