arxiv: 1907.01300 · v1 · pith:U3VEFKRUnew · submitted 2019-07-02 · 💻 cs.IR

Learning to Reformulate the Queries on the WEB

Pith reviewed 2026-05-25 11:02 UTC · model grok-4.3

classification 💻 cs.IR
keywords query reformulation · encoder-decoder model · unsupervised learning · anchor phrases · information retrieval · neural networks · web search · sequence generation

The pith

An unsupervised neural encoder-decoder model trained on anchor phrases generates query reformulations that improve retrieval performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to help users who issue ineffective queries to search engines by automatically producing better versions of those queries. It trains a character-level convolutional neural network encoder with max-pooling and an attention-based recurrent decoder end-to-end on over one billion anchor phrases from the Clueweb09 corpus, using no labeled supervision. The model treats query reformulation as a sequence generation task and learns directly from this readily available web data. Experiments indicate that the generated queries yield better retrieval results on TREC collections. A sympathetic reader would care because the method offers a scalable way to improve search without manual annotation effort.
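
To make the encoder side concrete, here is a minimal sketch of a character-level convolution followed by max-pooling over time, in plain Python. The alphabet, filter count, filter width, pooling stride, and the random untrained weights are all assumptions chosen for illustration; nothing on this page states the paper's actual hyperparameters.

```python
import random

# Assumed character set; characters outside it map to an all-zero vector.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "


def one_hot(text):
    """Turn a query string into a list of one-hot character vectors."""
    vectors = []
    for ch in text.lower():
        vec = [0.0] * len(ALPHABET)
        idx = ALPHABET.find(ch)
        if idx >= 0:
            vec[idx] = 1.0
        vectors.append(vec)
    return vectors


def conv1d(seq, filters, width):
    """Valid 1-D convolution with ReLU: one row per window, one column per filter."""
    out = []
    for start in range(len(seq) - width + 1):
        window = [x for vec in seq[start:start + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return out


def max_pool(seq, stride):
    """Non-overlapping max-pooling over time; shortens the sequence by about `stride`."""
    pooled = []
    for start in range(0, len(seq), stride):
        chunk = seq[start:start + stride]
        pooled.append([max(col) for col in zip(*chunk)])
    return pooled


def encode(text, filters, width=3, stride=2):
    """Character one-hots -> convolution -> max-pooling, giving segment vectors."""
    return max_pool(conv1d(one_hot(text), filters, width), stride)


# Hypothetical, untrained filters: 4 filters spanning 3 characters each.
rng = random.Random(0)
WIDTH, N_FILTERS = 3, 4
FILTERS = [[rng.uniform(-1, 1) for _ in range(WIDTH * len(ALPHABET))]
           for _ in range(N_FILTERS)]

segments = encode("cheap flights", FILTERS, WIDTH)
```

Pooling shortens the sequence the decoder has to attend over, which is the usual motivation for placing it after a character-level convolution; here a 13-character query is reduced to 6 segment vectors.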

Core claim

The central claim is that an end-to-end unsupervised model consisting of a character-level convolutional neural network encoder with max-pooling and an attention-based recurrent neural network decoder, trained on anchor phrases, can produce effective reformulations of user queries that improve retrieval performance on test collections.

What carries the argument

The end-to-end encoder-decoder architecture with a character-level CNN encoder and attention-based RNN decoder, trained directly on anchor phrases as the unsupervised signal.
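
A single attention step of the decoder can be sketched as below. This assumes the additive scoring form from Bahdanau et al., which appears in the paper's reference list; the page does not say which scoring function the paper actually uses, and the weight matrices here are hypothetical placeholders rather than learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def additive_attention(dec_state, enc_states, w_dec, w_enc, v):
    """Assumed scoring: score_i = v . tanh(w_dec @ s + w_enc @ h_i).

    Returns the attention weights and the context vector sum_i alpha_i * h_i.
    """
    proj_dec = matvec(w_dec, dec_state)
    scores = []
    for h in enc_states:
        proj_enc = matvec(w_enc, h)
        scores.append(sum(vi * math.tanh(a + b)
                          for vi, a, b in zip(v, proj_dec, proj_enc)))
    weights = softmax(scores)
    dim = len(enc_states[0])
    context = [sum(w * h[j] for w, h in zip(weights, enc_states))
               for j in range(dim)]
    return weights, context


# Toy example with identity projections (illustrative values only).
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, context = additive_attention(
    dec_state=[1.0, 0.0],
    enc_states=[[1.0, 0.0], [0.0, 1.0]],
    w_dec=identity,
    w_enc=identity,
    v=[1.0, 1.0],
)
```

At each output character the decoder would recompute these weights from its current hidden state, so different parts of the encoded query can dominate different parts of the generated reformulation.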

If this is right

  • Reformulated queries improve retrieval performance on test collections.
  • The model trains successfully without labeled data or extra supervision.
  • Anchor phrases provide an effective signal for learning general query reformulations via sequence generation.
  • Character-level processing in the encoder supports varied query inputs without word-level preprocessing.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same training approach could be tested on other large text collections that contain anchor-like links.
  • Generated reformulations might transfer to related tasks such as query suggestion or expansion in retrieval systems.
  • Performance gains could compound if the model is fine-tuned on domain-specific anchor data.

Load-bearing premise

Anchor phrases from a large web corpus constitute a suitable unsupervised training signal for learning effective query reformulations without additional supervision or validation of signal quality.
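
One inexpensive way to probe this premise is to measure how much of the anchor data is generic navigational text. The sketch below is an illustration only: the stoplist `GENERIC_ANCHORS` and the sample anchors are hypothetical, and the paper is not reported to perform this check.

```python
from collections import Counter

# Hypothetical stoplist of navigational anchors that carry no query-like content.
GENERIC_ANCHORS = {"click here", "here", "read more", "home", "next", "link"}


def normalize(anchor):
    """Lowercase and collapse whitespace so trivially different anchors compare equal."""
    return " ".join(anchor.lower().split())


def noise_report(anchors):
    """Count total and distinct anchors and the fraction that are generic."""
    counts = Counter(normalize(a) for a in anchors)
    total = sum(counts.values())
    generic = sum(c for a, c in counts.items() if a in GENERIC_ANCHORS)
    return {
        "total": total,
        "distinct": len(counts),
        "generic_fraction": generic / total if total else 0.0,
    }


# Made-up sample, not drawn from Clueweb09.
sample = ["Click here", "cheap flights to Paris", "click  here", "Home",
          "paris flight deals"]
report = noise_report(sample)
```

A high generic fraction would suggest filtering before training; a low one would lend some support to using the anchors as they are.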

What would settle it

If the generated reformulations produce no improvement in retrieval metrics such as mean average precision on held-out test collections compared with the original queries, the central claim would be falsified.
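
The comparison described above can be sketched as a mean average precision (MAP) computation over two runs. The query ids, document ids, and relevance judgments below are made up for illustration; a real check would read TREC run and qrels files instead.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant document appears."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean_average_precision(runs, qrels):
    """Mean of per-query average precision; queries missing from `runs` score 0."""
    aps = [average_precision(runs.get(qid, []), rel) for qid, rel in qrels.items()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy judgments and rankings (hypothetical).
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
original_run = {"q1": ["d2", "d1", "d3"], "q2": ["d1", "d3", "d2"]}
reformulated_run = {"q1": ["d1", "d3", "d2"], "q2": ["d2", "d1", "d3"]}

map_original = mean_average_precision(original_run, qrels)
map_reformulated = mean_average_precision(reformulated_run, qrels)
```

In this toy case the reformulated run ranks every relevant document first, so its MAP is 1.0, while the original run scores 11/24. The claim would be undermined if, on held-out queries, `map_reformulated` were not higher than `map_original`.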

Figures

Figures reproduced from arXiv: 1907.01300 by Amir H. Jadidinejad.

Figure 1: A session has been defined in the Clueweb09 ...
Figure 2: Validation loss vs. iterations during the learn...
Figure 3: Performance comparison among the original queries in TREC 2010-2012 query sets [8–10] and their top-...
Original abstract

Inability of the naive users to formulate appropriate queries is a fundamental problem in web search engines. Therefore, assisting users to issue more effective queries is an important way to improve users' happiness. One effective approach is query reformulation, which generates new effective queries according to the current query issued by users. Previous researches typically generate words and phrases related to the original query. Since the definition of query reformulation is quite general, it is completely difficult to develop a uniform term-based approach for this problem. This paper uses readily available data, particularly over one billion anchor phrases in Clueweb09 corpus, in order to learn an end-to-end encoder-decoder model to automatically generate effective queries. Following successful researches in the field of sequence to sequence models, we employ a character-level convolutional neural network with max-pooling at encoder and an attention-based recurrent neural network at decoder. The whole model learned in an unsupervised end-to-end manner. Experiments on TREC collections show that the reformulated queries automatically generated by the proposed solution can significantly improve the retrieval performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes an unsupervised end-to-end sequence-to-sequence model for query reformulation that uses a character-level CNN encoder with max-pooling and an attention-based RNN decoder. The model is trained on over one billion anchor phrases extracted from the Clueweb09 corpus and is claimed to generate reformulations that significantly improve retrieval performance when evaluated on TREC collections.

Significance. If the results hold after proper validation, the work would provide a scalable unsupervised method for query reformulation that leverages readily available web anchor data rather than curated supervision, addressing a core IR challenge with potential for broad applicability.

major comments (2)
  1. [Abstract / Training section] Abstract and training data description: The model is trained directly on anchor phrases from Clueweb09 as targets with no validation, analysis, or auxiliary experiments demonstrating that these phrases constitute a suitable proxy for effective query reformulations (e.g., no measurement of correlation with user intent, noise levels, or downstream retrieval gains). This is load-bearing for the central claim of significant TREC improvements.
  2. [Experiments] Experiments section: The abstract asserts that reformulated queries 'significantly improve the retrieval performance' on TREC collections, yet no details are supplied on metrics (MAP, NDCG, etc.), baselines (original queries or prior reformulation methods), query counts, statistical significance tests, or error analysis. This prevents verification of whether the data support the claim.
minor comments (1)
  1. [Abstract] Abstract: The sentence 'The whole model learned in an unsupervised end-to-end manner.' is grammatically incomplete.
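
The significance evidence requested in the second major comment could take the form of a paired randomization (sign-flip) test over per-query scores. The sketch below is an illustration under stated assumptions, not the paper's procedure: the per-query scores are invented, and the trial count and seed are arbitrary choices.

```python
import random


def paired_randomization_test(baseline, treatment, trials=10000, seed=0):
    """Two-sided sign-flip test on per-query score differences.

    Under the null hypothesis the sign of each per-query difference is
    arbitrary, so signs are flipped at random and we count how often the
    absolute mean difference is at least as large as the observed one.
    """
    diffs = [t - b for b, t in zip(baseline, treatment)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(trials):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed:
            extreme += 1
    # Add-one smoothing keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (trials + 1)


# Invented per-query average precision for 12 queries.
baseline_ap = [0.2] * 12
reformulated_ap = [0.3] * 12

p_improved = paired_randomization_test(baseline_ap, reformulated_ap)
p_identical = paired_randomization_test(baseline_ap, baseline_ap)
```

Reporting a p-value of this kind alongside the metric, the baselines, and the number of queries would let a reader check the word "significantly" in the abstract.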

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Abstract / Training section] Abstract and training data description: The model is trained directly on anchor phrases from Clueweb09 as targets with no validation, analysis, or auxiliary experiments demonstrating that these phrases constitute a suitable proxy for effective query reformulations (e.g., no measurement of correlation with user intent, noise levels, or downstream retrieval gains). This is load-bearing for the central claim of significant TREC improvements.

    Authors: We agree that additional validation of anchor phrases as proxies would strengthen the central claim. The manuscript relies on end-to-end TREC gains as implicit evidence, but we will add a new paragraph in the training data section with auxiliary analysis (e.g., overlap statistics with known reformulation patterns from prior IR literature and basic noise characterization of the Clueweb09 anchors) to directly address suitability. revision: yes

  2. Referee: [Experiments] Experiments section: The abstract asserts that reformulated queries 'significantly improve the retrieval performance' on TREC collections, yet no details are supplied on metrics (MAP, NDCG, etc.), baselines (original queries or prior reformulation methods), query counts, statistical significance tests, or error analysis. This prevents verification of whether the data support the claim.

    Authors: We acknowledge the abstract is underspecified. In revision we will expand the abstract to explicitly state the evaluation metrics (MAP), baselines (original queries plus prior reformulation methods), query counts from the TREC collections, and reference to statistical significance testing. We will also add a dedicated error analysis subsection to the experiments section. revision: yes

Circularity Check

0 steps flagged

No significant circularity; training and evaluation use disjoint external corpora

Full rationale

The paper extracts >1B anchor phrases from Clueweb09 as unsupervised training targets for a char-CNN encoder + attention RNN decoder, then evaluates generated reformulations on separate TREC collections. No equations, self-citations, or fitted parameters reduce the reported retrieval gains to the training signal by construction. The model is learned end-to-end on external data and tested against independent benchmarks, satisfying the self-contained criterion.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the domain assumption that anchor phrases serve as effective training examples for query reformulation and on standard neural sequence modeling assumptions; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • domain assumption: Anchor phrases from Clueweb09 provide a suitable unsupervised signal for learning effective query reformulations.
    The paper states it uses these phrases to train the end-to-end model without further labeled supervision.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · 14 internal anchors

  1. [1]

    Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In International Conference on Learning Representations. CoRR abs/1409.0473 (2015). http://arxiv.org/abs/1409.0473

  2. [2]

    Lidong Bing, Wai Lam, Tak-Lam Wong, and Shoaib Jameel. 2015. Web Query Reformulation via Joint Modeling of Latent Topic Dependency and Term Context. ACM Trans. Inf. Syst. 33, 2, Article 6 (Feb. 2015), 38 pages. DOI: http://dx.doi.org/10.1145/2699666

  3. [3]

    Olivier Chapelle, Donald Metzler, Ya Zhang, and Pierre Grinspan. 2009. Expected Reciprocal Rank for Graded Relevance. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM ’09). ACM, New York, NY, USA, 621–630. DOI: http://dx.doi.org/10.1145/1645953.1646033

  4. [4]

    Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches. In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Association for Computational Linguistics, Doha, Qatar, 103–111. http://www.aclweb.org/ a...

  5. [5]

    Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computation...

  6. [6]

    Junyoung Chung, Kyunghyun Cho, and Yoshua Bengio. 2016. A Character-level Decoder without Explicit Segmentation for Neural Machine Translation. CoRR abs/1603.06147 (2016). http://arxiv.org/abs/1603.06147

  7. [7]

    Charles L.A. Clarke, Nick Craswell, and Ian Soboroff. 2009. Overview of the TREC 2009 Web Track. Technical Report. NIST. http://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf

  8. [8]

    Charles L. A. Clarke, Nick Craswell, Ian Soboroff, and Gordon V. Cormack

  9. [9]

    Overview of the TREC 2010 Web Track. Technical Report. NIST

  10. [10]

    Charles L. A. Clarke, Nick Craswell, Ian Soboroff, and Ellen M. Voorhees

  11. [11]

    Overview of the TREC 2011 Web Track. Technical Report. NIST

  12. [12]

    Charles L. A. Clarke, Nick Craswell, and Ellen M. Voorhees. 2012. Overview of the TREC 2012 Web Track. Technical Report. NIST

  13. [13]

    Gordon V. Cormack, Mark D. Smucker, and Charles L.A. Clarke. 2011. Efficient and effective spam filtering and re-ranking for large web datasets. Information Retrieval 14, 5 (2011), 441–465. DOI: http://dx.doi.org/10.1007/s10791-011-9162-z

  14. [14]

    Nick Craswell, Bodo Billerbeck, Dennis Fetterly, and Marc Najork. 2013. Robust Query Rewriting Using Anchor Data. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining (WSDM ’13). ACM, New York, NY, USA, 335–344. DOI: http://dx.doi.org/10.1145/2433396.2433440

  15. [15]

    Nick Craswell, W. Bruce Croft, Jiafeng Guo, Bhaskar Mitra, and Maarten de Rijke. 2016. Neu-IR: The SIGIR 2016 Workshop on Neural Information Retrieval. In SIGIR 2016: 39th international ACM SIGIR conference on Research and development in information retrieval. ACM, 1245–1246

  16. [16]

    Search Engines: Information Retrieval in Practice (1st ed.)

    Bruce Croft, Donald Metzler, and Trevor Strohman. 2009. Search Engines: Information Retrieval in Practice (1st ed.). Addison-Wesley Publishing Company, USA

  17. [17]

    Van Dang and Bruce W. Croft. 2010. Query Reformulation Using Anchor Text. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM ’10). ACM, New York, NY, USA, 41–50. DOI: http://dx.doi.org/10.1145/1718487.1718493

  18. [18]

    Nadav Eiron and Kevin S. McCurley. 2003. Analysis of Anchor Text for Web Search. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’03). ACM, New York, NY, USA, 459–460. DOI: http://dx.doi.org/10.1145/860435.860550

  19. [19]

    Manish Gupta and Michael Bendersky. 2015. Information Retrieval with Verbose Queries. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’15). ACM, New York, NY, USA, 1121–1124. DOI: http://dx.doi.org/10.1145/2766462.2767877

  20. [20]

    Jeff Huang and Efthimis N. Efthimiadis. 2009. Analyzing and Evaluating Query Reformulation Strategies in Web Search Logs. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM ’09). ACM, New York, NY, USA, 77–86. DOI: http://dx.doi.org/10.1145/1645953.1645966

  21. [21]

    Neural Machine Transliteration: Preliminary Results

    Amir H. Jadidinejad. 2016. Neural Machine Transliteration: Preliminary Results. CoRR abs/1609.04253 (2016). http://arxiv.org/abs/1609.04253

  22. [22]

    Bernard J. Jansen, Amanda Spink, and Tefko Saracevic. 2000. Real life, real users, and real needs: a study and analysis of user queries on the web. Information Processing & Management 36, 2 (2000), 207–227. DOI: http://dx.doi.org/10.1016/S0306-4573(99)00056-4

  23. [23]

    Rosie Jones, Benjamin Rey, Omid Madani, and Wiley Greiner. 2006. Generating Query Substitutions. In Proceedings of the 15th International Conference on World Wide Web (WWW ’06). ACM, New York, NY, USA, 387–396. DOI: http://dx.doi.org/10.1145/1135777.1135835

  24. [24]

    Neural Machine Translation in Linear Time

    N. Kalchbrenner, L. Espeholt, K. Simonyan, A. van den Oord, A. Graves, and K. Kavukcuoglu. 2016. Neural Machine Translation in Linear Time. ArXiv e-prints (Oct. 2016). arXiv:cs.CL/1610.10099

  25. [25]

    Makoto P. Kato, Tetsuya Sakai, and Katsumi Tanaka. 2013. When do people use query suggestion? A query suggestion log analysis. Information Retrieval 16, 6 (2013), 725–746. DOI: http://dx.doi.org/10.1007/s10791-012-9216-x

  26. [26]

    Yoon Kim, Yacine Jernite, David Sontag, and Alexander M. Rush. 2015. Character-Aware Neural Language Models. CoRR abs/1508.06615 (2015). http://arxiv.org/abs/1508.06615

  27. [27]

    Adam: A Method for Stochastic Optimization

    Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. CoRR abs/1412.6980 (2014). http://arxiv.org/abs/1412.6980

  28. [28]

    Skip-Thought Vectors

    Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, and Sanja Fidler. 2015. Skip-Thought Vectors. CoRR abs/1506.06726 (2015). http://arxiv.org/abs/1506.06726

  29. [29]

    Reiner Kraft and Jason Zien. 2004. Mining Anchor Text for Query Refinement. In Proceedings of the 13th International Conference on World Wide Web (WWW ’04). ACM, New York, NY, USA, 666–674. DOI: http://dx.doi.org/10.1145/988672.988763

  30. [30]

    J. Lee, K. Cho, and T. Hofmann. 2016. Fully Character-Level Neural Machine Translation without Explicit Segmentation. ArXiv e-prints (Oct. 2016). arXiv:cs.CL/1610.03017

  31. [31]

    Hang Li, Gu Xu, W. Bruce Croft, Michael Bendersky, Ziqi Wang, and Evelyne Viegas. 2012. QRU-1: A Public Dataset for Promoting Query Representation and Understanding Research. In Workshop on Web Search Click Data (WSCD’12)

  32. [32]

    Thang Luong, Ilya Sutskever, Quoc Le, Oriol Vinyals, and Wojciech Zaremba. 2015. Addressing the Rare Word Problem in Neural Machine Translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association...

  33. [33]

    Donald Metzler and W. Bruce Croft. 2004. Combining the language model and inference network approaches to retrieval. Information Processing & Management 40, 5 (2004), 735–750. DOI: http://dx.doi.org/10.1016/j.ipm.2004.05.001 Bayesian Networks and Information Retrieval

  34. [34]

    Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, and Rabab Ward. 2016. Deep Sentence Embedding Using Long Short-term Memory Networks: Analysis and Application to Information Retrieval. IEEE/ACM Trans. Audio, Speech and Lang. Proc. 24, 4 (April 2016), 694–707. http://dl.acm.org/citation.cfm?id=2992449.2992457

  35. [35]

    Daniel Sheldon, Milad Shokouhi, Martin Szummer, and Nick Craswell

  36. [36]

    LambdaMerge: Merging the Results of Query Reformulations. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining (WSDM ’11). ACM, New York, NY, USA, 795–804. DOI: http://dx.doi.org/10.1145/1935826.1935930

  37. [37]

    T. Strohman, D. Metzler, H. Turtle, and W.B. Croft. 2005. Indri: A language model-based search engine for complex queries. In Proceedings of the International Conference on Intelligent Analysis

  38. [38]

    Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. CoRR abs/1409.3215 (2014). http://arxiv.org/abs/1409.3215

  39. [39]

    Theano Development Team. 2016. Theano: A Python framework for fast computation of mathematical expressions. arXiv e-prints abs/1605.02688 (May 2016). http://arxiv.org/abs/1605.02688

  40. [40]

    Oriol Vinyals and Quoc V. Le. 2015. A Neural Conversational Model. CoRR abs/1506.05869 (2015). http://arxiv.org/abs/1506.05869

  41. [41]

    Xuanhui Wang and ChengXiang Zhai. 2008. Mining Term Association Patterns from Search Logs for Effective Query Reformulation. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM ’08). ACM, New York, NY, USA, 479–488. DOI: http://dx.doi.org/10.1145/1458082.1458147

  42. [42]

    Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, J. Klingner, A. Shah, M. Johnson, X. Liu, L. Kaiser, S. Gouws, Y. Kato, T. Kudo, H. Kazawa, K. Stevens, G. Kurian, N. Patil, W. Wang, C. Young, J. Smith, J. Riesa, A. Rudnick, O. Vinyals, G. Corrado, M. Hughes, and J. Dean. 2016. Google’s Neural Machine...

  43. [43]

    Xiaobing Xue and W. Bruce Croft. 2013. Modeling Reformulation Using Query Distributions. ACM Trans. Inf. Syst. 31, 2, Article 6 (May 2013), 34 pages. DOI: http://dx.doi.org/10.1145/2457465.2457466

  44. [44]

    Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level Convolutional Networks for Text Classification. CoRR abs/1509.01626 (2015). http://arxiv.org/abs/1509.01626

  45. [45]

    Neural Information Retrieval: A Literature Review

    Ye Zhang, Md Mustafizur Rahman, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek Khetan, Tyler McDonnell, An Thanh Nguyen, Dan Xu, Byron C. Wallace, and Matthew Lease. 2016. Neural Information Retrieval: A Literature Review. CoRR abs/1611.06792 (2016). http://arxiv.org/abs/1611.06792