Hypencoder Revisited: Reproducibility and Analysis of Non-Linear Scoring for First-Stage Retrieval
Pith reviewed 2026-05-07 11:12 UTC · model grok-4.3
The pith
Reproducing the Hypencoder confirms that its non-linear q-net scorer outperforms a comparably trained bi-encoder on retrieval benchmarks, and that its efficient search algorithm cuts latency with little accuracy loss.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Hypencoder, which uses a hypernetwork to generate the weights of a query-specific neural scoring network, reproduces successfully: it outperforms a similarly trained bi-encoder baseline on in-domain and out-of-domain benchmarks, and its proposed efficient search algorithm reduces query latency with only minimal performance degradation. On hard tasks the advantage holds for DL-Hard and FollowIR but not TREC TOT, where checkpoint incompatibility and fine-tuning sensitivity prevent full verification. Performance gains when swapping pre-trained encoders depend on the encoder and fine-tuning choices; standard Faiss-based bi-encoder retrieval remains faster in both exhaustive and approximate settings; and the non-linear q-net scoring shows no consistent robustness disadvantage relative to inner-product scoring under adversarial evaluation.
What carries the argument
The q-net: a query-specific neural network for relevance scoring whose weights are produced by a hypernetwork from contextualized query embeddings. This enables expressive non-linear scoring while keeping query and document encodings independent.
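The weight-generation step can be sketched in a few lines of numpy. This is a toy illustration, not the paper's architecture: the single hidden layer, the layer shapes, and the tanh/ReLU choices are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64   # embedding dimension (illustrative)
H = 32   # q-net hidden width (illustrative)

def hypernetwork(q_emb, W_gen, b_gen):
    """Toy hypernetwork: maps a query embedding to the weights of a
    small one-hidden-layer q-net. Shapes and nonlinearity are assumptions."""
    flat = np.tanh(W_gen @ q_emb + b_gen)   # generated parameter vector
    W1 = flat[:D * H].reshape(H, D)         # q-net first layer
    w2 = flat[D * H:D * H + H]              # q-net output layer
    return W1, w2

def qnet_score(doc_emb, W1, w2):
    """Non-linear relevance score: w2 . relu(W1 d)."""
    return float(w2 @ np.maximum(W1 @ doc_emb, 0.0))

# Random stand-ins for trained hypernetwork parameters.
n_params = D * H + H
W_gen = rng.normal(scale=0.05, size=(n_params, D))
b_gen = rng.normal(scale=0.05, size=n_params)

q = rng.normal(size=D)          # contextualized query embedding
docs = rng.normal(size=(5, D))  # independently encoded documents

W1, w2 = hypernetwork(q, W_gen, b_gen)
scores = [qnet_score(d, W1, w2) for d in docs]
```

The key property this preserves is the one the review highlights: documents are encoded once, independently of any query, and only the cheap scoring network is query-specific.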
If this is right
- Hypencoder performance gains when integrating alternative pre-trained encoders depend on the specific encoder and the fine-tuning strategy used.
- Standard bi-encoder retrieval with Faiss indexing remains faster than the Hypencoder under both exhaustive and efficient search conditions.
- The q-net's non-linear scoring does not produce a consistent robustness disadvantage relative to inner-product scoring under adversarial evaluation.
- Partial support on hard tasks indicates that checkpoint compatibility and fine-tuning sensitivity affect whether the Hypencoder advantage appears on every difficult benchmark.
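The latency asymmetry in the Faiss bullet above can be illustrated with a toy benchmark: exhaustive inner-product scoring is a single matrix-vector product, while q-net-style scoring must push every document embedding through a small network. The corpus size, dimensions, and random weights below are arbitrary, and real Faiss adds SIMD and index-level optimizations on top of this gap.

```python
import time
import numpy as np

rng = np.random.default_rng(1)
N, D, H = 20_000, 64, 32
docs = rng.normal(size=(N, D)).astype(np.float32)
q = rng.normal(size=D).astype(np.float32)

# Bi-encoder: one matrix-vector product scores the whole corpus.
t0 = time.perf_counter()
ip_scores = docs @ q
t_ip = time.perf_counter() - t0

# q-net-style scoring: each document passes through a small query-specific
# network (batched here; a per-document loop would be slower still).
W1 = rng.normal(scale=0.1, size=(D, H)).astype(np.float32)
w2 = rng.normal(scale=0.1, size=H).astype(np.float32)
t0 = time.perf_counter()
qnet_scores = np.maximum(docs @ W1, 0.0) @ w2
t_qnet = time.perf_counter() - t0

top_ip = int(np.argmax(ip_scores))
top_qnet = int(np.argmax(qnet_scores))
```

Even this batched toy version does strictly more work per query than the inner product, which is one way to see why exhaustive bi-encoder retrieval stays ahead.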
Where Pith is reading between the lines
- The observed sensitivity to checkpoints suggests that future neural retrieval papers should release exact training scripts and final model weights to enable tighter reproductions.
- If further latency optimizations close the gap with Faiss-based bi-encoders, the q-net approach could become practical for production first-stage retrieval where accuracy matters more than raw speed.
- The lack of a consistent adversarial robustness penalty opens the possibility that non-linear scoring can be added to other retrieval architectures without introducing new attack surfaces.
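A minimal robustness probe in the spirit of the adversarial bullet above, assuming only random L2-bounded perturbations rather than the gradient-based attacks the paper actually evaluates; all dimensions and the perturbation budget are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
D, H, EPS = 64, 32, 0.05  # dims and L2 budget are arbitrary

q = rng.normal(size=D)
doc = rng.normal(size=D)
W1 = rng.normal(scale=0.1, size=(H, D))
w2 = rng.normal(scale=0.1, size=H)

def ip_score(d):
    """Standard bi-encoder scoring: inner product with the query."""
    return float(q @ d)

def qnet_score(d):
    """Toy non-linear q-net scoring: w2 . relu(W1 d)."""
    return float(w2 @ np.maximum(W1 @ d, 0.0))

def worst_case_shift(score_fn, d, eps, trials=200):
    """Empirical worst-case score change under random L2-bounded
    perturbations (a crude stand-in for gradient-based attacks)."""
    base = score_fn(d)
    shifts = []
    for _ in range(trials):
        delta = rng.normal(size=d.shape)
        delta *= eps / np.linalg.norm(delta)
        shifts.append(abs(score_fn(d + delta) - base))
    return max(shifts)

shift_ip = worst_case_shift(ip_score, doc, EPS)
shift_qnet = worst_case_shift(qnet_score, doc, EPS)
```

Comparing the two shift magnitudes across many documents is the shape of the question the robustness extension asks; the paper's actual evaluation uses established attack methods rather than this random-sampling sketch.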
Load-bearing premise
The reproduction setup, including model checkpoints, training data order, and fine-tuning hyperparameters, matches the original Hypencoder implementation closely enough for direct performance comparison.
What would settle it
Re-training the Hypencoder from the same starting checkpoints and evaluating it on the same in-domain and out-of-domain benchmarks where it fails to exceed the bi-encoder baseline by a clear margin.
Original abstract
The Hypencoder, proposed by Killingback et al., is a retrieval framework that replaces the fixed inner-product scoring function used in standard bi-encoders with a query-specific neural network (the $q$-net), whose weights are generated by a hypernetwork from the contextualized query embeddings. This design enables more expressive relevance estimation while preserving independent query and document encoding. In this work, we conduct a reproducibility study of the Hypencoder and extend the original analysis in three directions. Our reproduction confirms that the Hypencoder outperforms a similarly trained bi-encoder baseline on in-domain and out-of-domain benchmarks, and that the proposed efficient search algorithm substantially reduces query latency with minimal performance loss. On hard retrieval tasks, we find partial support: the Hypencoder outperforms the baseline on DL-Hard and FollowIR, but not on TREC TOT, where checkpoint incompatibility and fine-tuning sensitivity complicate full verification. Beyond reproduction, we investigate three extensions: (i)~integrating alternative pre-trained encoders into the Hypencoder framework, where we find that performance gains depend on the encoder and fine-tuning strategy; (ii)~comparing query latency against a Faiss-based bi-encoder pipeline, revealing that standard bi-encoder retrieval remains faster under both exhaustive and efficient search settings; and (iii)~evaluating adversarial robustness, where we find that the $q$-net's non-linear scoring does not provide a consistent robustness disadvantage over inner-product scoring. Our code is publicly available at https://github.com/arneeichholtz/Hypencoder-reprod.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports a reproducibility study of the Hypencoder framework, which replaces inner-product scoring in bi-encoders with a query-specific neural network (q-net) whose weights are generated by a hypernetwork. The authors confirm that the reproduced Hypencoder outperforms a similarly trained bi-encoder baseline on in-domain and out-of-domain benchmarks (with partial support on hard tasks) and that the proposed efficient search algorithm reduces query latency with minimal performance loss. They extend the work by testing alternative encoders, comparing latency against a Faiss-based pipeline (where standard bi-encoders remain faster), and evaluating adversarial robustness (finding no consistent disadvantage). Public code is released at the provided GitHub link.
Significance. This work is significant for providing independent verification and extensions to the original Hypencoder claims. The public code, benchmark results, and direct empirical comparisons (with no circular derivations) are strengths that support community reuse. If the out-of-domain generalization holds under matched conditions, the findings indicate that non-linear q-net scoring can yield measurable gains over standard bi-encoders in first-stage retrieval.
major comments (2)
- [Hard retrieval tasks / out-of-domain benchmarks] Hard tasks results (DL-Hard, FollowIR, TREC TOT): checkpoint incompatibility on TREC TOT prevents matched comparison, which is load-bearing for the out-of-domain generalization claim in the abstract and results section. The paper should detail the exact mismatches in training data order, initialization, or hyperparameters and, if possible, provide an aligned run or sensitivity analysis to strengthen attribution of gains to the q-net architecture rather than setup differences.
- [Latency analysis / extension (ii)] Latency comparison to Faiss-based bi-encoder: the finding that standard bi-encoder retrieval remains faster under both exhaustive and efficient search settings qualifies the efficiency claims for the proposed Hypencoder search algorithm. This should be more explicitly framed in the discussion of latency reductions to avoid overstating practical advantages.
minor comments (2)
- Tables reporting benchmark results should explicitly distinguish reproduced numbers from original paper values and note any fine-tuning differences.
- Clarify the exact pre-trained encoder variants and fine-tuning strategies tested in extension (i) to make the dependence on encoder choice easier to interpret.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and constructive feedback. We address the major comments point by point below.
Point-by-point responses
- Referee: [Hard retrieval tasks / out-of-domain benchmarks] Hard tasks results (DL-Hard, FollowIR, TREC TOT): checkpoint incompatibility on TREC TOT prevents matched comparison, which is load-bearing for the out-of-domain generalization claim in the abstract and results section. The paper should detail the exact mismatches in training data order, initialization, or hyperparameters and, if possible, provide an aligned run or sensitivity analysis to strengthen attribution of gains to the q-net architecture rather than setup differences.
Authors: We thank the referee for this observation. The manuscript already qualifies the TREC TOT results due to checkpoint incompatibility in the abstract and results section. In the revision, we have added a new paragraph in Section 4.3 explicitly detailing the mismatches in training data order, initialization seeds, and hyperparameter settings between our reproduction and the original Hypencoder checkpoints. We have also included a sensitivity analysis on the compatible DL-Hard and FollowIR runs to isolate the contribution of the q-net. However, the fundamental incompatibility of the TREC TOT checkpoints prevents an aligned run, so we have further emphasized the partial nature of the hard-task support and clarified that the primary out-of-domain claims rest on the matched benchmarks. revision: partial
- Referee: [Latency analysis / extension (ii)] Latency comparison to Faiss-based bi-encoder: the finding that standard bi-encoder retrieval remains faster under both exhaustive and efficient search settings qualifies the efficiency claims for the proposed Hypencoder search algorithm. This should be more explicitly framed in the discussion of latency reductions to avoid overstating practical advantages.
Authors: We agree that the comparison should be framed more explicitly. In the revised discussion (Section 5.2), we now state upfront that although the proposed efficient search algorithm reduces Hypencoder query latency with only minimal performance loss, standard bi-encoder retrieval with Faiss remains faster under both exhaustive and approximate settings. This qualification is presented as a direct limitation on the practical efficiency gains of the Hypencoder approach. revision: yes
- Not addressed: providing a fully aligned run on TREC TOT, due to checkpoint incompatibility.
Circularity Check
No circularity: empirical reproducibility study with no derivations or fitted predictions
Full rationale
The paper conducts a reproducibility study of the existing Hypencoder model, performing direct empirical comparisons against baselines on public benchmarks (in-domain and out-of-domain). It reports performance metrics, latency measurements, and robustness evaluations without any mathematical derivations, first-principles predictions, or parameter-fitting steps that could reduce to self-definition or self-citation. Claims rest on experimental results and code release; the noted checkpoint incompatibility on TREC TOT is a transparency issue about reproduction fidelity, not a circular reduction in any derivation chain. No load-bearing steps match the enumerated circularity patterns.
Reference graph
Works this paper leans on
- [1] Jaime Arguello, Samarth Bhargav, Fernando Diaz, Evangelos Kanoulas, and Bhaskar Mitra. 2023. Overview of the TREC 2023 Tip-of-the-Tongue Track. In The Thirty-Second Text REtrieval Conference Proceedings (TREC 2023), Gaithersburg, MD, USA, November 14–17.
- [2] Samarth Bhargav, Georgios Sidiropoulos, and Evangelos Kanoulas. 2022. ’It’s on the tip of my tongue’: A new Dataset for Known-Item Retrieval. In WSDM ’22: The Fifteenth ACM International Conference on Web Search and Data Mining, Virtual Event / Tempe, AZ, USA, February 21 - 25, 2022, K. Selcuk Candan, Huan Liu, Leman Akoglu, Xin Luna Dong, and Jiliang Tang...
- [3] Alexander Bondarenko, Maik Fröbe, Meriem Beloucif, Lukas Gienapp, Yamen Ajjour, Alexander Panchenko, Chris Biemann, Benno Stein, Henning Wachsmuth, Martin Potthast, et al. 2020. Overview of Touché 2020: argument retrieval. In International Conference of the Cross-Language Evaluation Forum for European Languages. Springer, 384–395.
- [4] Vera Boteva, Demian Gholipour Ghalandari, Artem Sokolov, and Stefan Riezler. 2016. A Full-Text Learning to Rank Dataset for Medical Information Retrieval. In Advances in Information Retrieval - 38th European Conference on IR Research, ECIR 2016, Padua, Italy, March 20-23, 2016. Proceedings (Lecture Notes in Computer Science, Vol. 9626). Springer, 716–722. doi:10.1007/978-3-319-30671-1_58
- [6] Andrew Brock, Theodore Lim, James M. Ritchie, and Nick Weston. 2018. SMASH: One-Shot Model Architecture Search through HyperNetworks. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net. https://openreview.net/forum?id=rydeCEhs-
- [7] Vinod Kumar Chauhan, Jiandong Zhou, Ping Lu, Soheila Molaei, and David A. Clifton. 2024. A brief review of hypernetworks in deep learning. Artif. Intell. Rev. 57, 9 (2024), 250. doi:10.1007/S10462-024-10862-8
- [8] Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, and Daniel S. Weld. 2020. SPECTER: Document-level Representation Learning using Citation-informed Transformers. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel R. Tetreault (Eds.). Association for Computational Linguistics, 2270–2282. doi:10.18653/V...
- [10]
- [11]
- [12] Thomas Diggelmann, Jordan L. Boyd-Graber, Jannis Bulian, Massimiliano Ciaramita, and Markus Leippold. 2020. CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims. CoRR abs/2012.00614 (2020). arXiv:2012.00614 https://arxiv.org/abs/2012.00614
- [13] Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, and Hervé Jégou. 2024. The Faiss library. (2024). arXiv:2401.08281 [cs.LG]
- [14] Thibault Formal, Benjamin Piwowarski, and Stéphane Clinchant. 2021. SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking. In SIGIR ’21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021, Fernando Diaz, Chirag Shah, Torsten Suel, Pablo Castells, Rosie Jones... doi:10.1145/3404835.3463098
- [16] Jiafeng Guo, Yixing Fan, Liang Pang, Liu Yang, Qingyao Ai, Hamed Zamani, Chen Wu, W Bruce Croft, and Xueqi Cheng. 2020. A deep look into neural ranking models for information retrieval. Information Processing & Management 57, 6 (2020), 102067.
- [17] David Ha, Andrew M. Dai, and Quoc V. Le. 2017. HyperNetworks. In International Conference on Learning Representations. https://openreview.net/forum?id=rkpACe1lx
- [18] Tim Hagen, Harrisen Scells, and Martin Potthast. 2024. Revisiting Query Variation Robustness of Transformer Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen (Eds.). Association for Computational Linguistics, Miami, Florida, USA, 4283–4296. doi:10.18653/v1/2024.findings-emnlp.248
- [19] Faegheh Hasibi, Fedor Nikolaev, Chenyan Xiong, Krisztian Balog, Svein Erik Bratsberg, Alexander Kotov, and Jamie Callan. 2017. DBpedia-Entity v2: A Test Collection for Entity Search. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, August 7-11, 2017. ACM, 1265–1268. ...
- [20] Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, and Allan Hanbury. 2021. Efficiently teaching an effective dense retriever with balanced topic aware sampling. In Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval. 113–122.
- [21] Shankar Iyer, Nikhil Dandekar, and Kornél Csernai. 2017. First Quora Dataset Release: Question Pairs. https://quoradata.quora.com/First-Quora-Dataset-Release-Question-Pairs
- [22] Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, and Edouard Grave. 2022. Unsupervised Dense Information Retrieval with Contrastive Learning. Trans. Mach. Learn. Res. 2022 (2022). https://openreview.net/forum?id=jKN1pXi7b0
- [23] Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2019. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data 7, 3 (2019), 535–547.
- [24] Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense Passage Retrieval for Open-Domain Question Answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, Bonnie Webber, Trevor Cohn, Yulan He, and Ya...
- [25] Omar Khattab and Matei Zaharia. 2020. ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25-30, 2020, Jimmy X. Huang, Yi Chang, Xueqi Cheng, Jaap Kamps, Vaness...
- [26] Julian Killingback, Hansi Zeng, and Hamed Zamani. 2025. Hypencoder: Hypernetworks for Information Retrieval. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2025, Padua, Italy, July 13-18, 2025, Nicola Ferro, Maria Maistro, Gabriella Pasi, Omar Alonso, Andrew Trotman, and Suzan Ver...
- [27] Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur P. Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, Kenton Lee, Kristina Toutanova, Llion Jones, Matthew Kelcey, Ming-Wei Chang, Andrew M. Dai, Jakob Uszkoreit, Quoc Le, and Slav Petrov. 2019. Natural Questions: a Benchmark for Question Answering Resear... https://aclanthology.org/Q19-1026/
- [28] Yongkang Li. 2026. Understanding and Enhancing Robustness in Dense Information Retrieval. In Advances in Information Retrieval - 48th European Conference on Information Retrieval, ECIR 2026, Delft, The Netherlands, March 29 - April 2, 2026, Proceedings, Part III (Lecture Notes in Computer Science). Springer, 599–607. doi:10.1007/978-3-032-21324-2_51
- [29] Yongkang Li, Panagiotis Eustratiadis, and Evangelos Kanoulas. 2025. Reproducing HotFlip for Corpus Poisoning Attacks in Dense Retrieval. In Advances in Information Retrieval - 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6-10, 2025, Proceedings, Part IV (Lecture Notes in Computer Science, Vol. 15575). Springer, 95–111...
- [30] Yongkang Li, Panagiotis Eustratiadis, Simon Lupart, and Evangelos Kanoulas. 2025. Unsupervised Corpus Poisoning Attacks in Continuous Space for Dense Retrieval. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2025, Padua, Italy, July 13-18, 2025, Nicola Ferro, Maria Maistro, Gabriella Pasi, Omar Alonso, Andrew Trotman, and Suzan Verberne (Eds.). ACM, 2452–2462. ...
- [32]
- [33] Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, and Xilun Chen. 2023. How to Train Your Dragon: Diverse Augmentation Towards Generalizable Dense Retrieval. In Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.)....
- [34] Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. 2021. In-Batch Negatives for Knowledge Distillation with Tightly-Coupled Teachers for Dense Retrieval. In Proceedings of the 6th Workshop on Representation Learning for NLP, RepL4NLP@ACL-IJCNLP 2021, Online, August 6, 2021, Anna Rogers, Iacer Calixto, Ivan Vulic, Naomi Saphra, Nora Kassner, Oana-Maria Ca...
- [35] Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, and Jimmy Lin. 2024. Fine-Tuning LLaMA for Multi-Stage Text Retrieval. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024, Washington DC, USA, July 14-18, 2024, Grace Hui Yang, Hongning Wang, Sam Han, Claudia Hauff, Guido Zuccon, and ...
- [36] Iain Mackie, Jeffrey Dalton, and Andrew Yates. 2021. How Deep is your Learning: the DL-HARD Annotated Deep Learning Dataset. In SIGIR ’21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021, Fernando Diaz, Chirag Shah, Torsten Suel, Pablo Castells, Rosie Jones, and T...
- [37] Macedo Maia, Siegfried Handschuh, André Freitas, Brian Davis, Ross McDermott, Manel Zarrouk, and Alexandra Balahur. 2018. WWW’18 open challenge: financial opinion mining and question answering. In Companion proceedings of the the web conference 2018. 1941–1942.
- [38] John X. Morris, Volodymyr Kuleshov, Vitaly Shmatikov, and Alexander M. Rush. 2023. Text Embeddings Reveal (Almost) As Much As Text. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association for Computational Linguistics, 12448–12460. doi:10.18653/V1/2023.EMNLP-MAIN.765
- [40] Aviv Navon, Aviv Shamsian, Ethan Fetaya, and Gal Chechik. 2021. Learning the Pareto Front with Hypernetworks. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net. https://openreview.net/forum?id=NjF772F4ZZR
- [41] Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. 2016. MS MARCO: A human-generated machine reading comprehension dataset. (2016).
- [42] Rodrigo Nogueira and Kyunghyun Cho. 2020. Passage Re-ranking with BERT. arXiv:1901.04085 [cs.IR] https://arxiv.org/abs/1901.04085
- [43] Rodrigo Nogueira, Zhiying Jiang, Ronak Pradeep, and Jimmy Lin. 2020. Document Ranking with a Pretrained Sequence-to-Sequence Model. In Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16-20 November 2020 (Findings of ACL, Vol. EMNLP 2020), Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational Lingu...
- [44]
- [45] Gustavo Penha, Arthur Câmara, and Claudia Hauff. 2022. Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators. In Advances in Information Retrieval - 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10-14, 2022, Proceedings, Part I (Lecture Notes in Computer Science, Vol. 13185), Matthias Hagen, Suzan...
- [46] Kirk Roberts, Tasmeer Alam, Steven Bedrick, Dina Demner-Fushman, Kyle Lo, Ian Soboroff, Ellen Voorhees, Lucy Lu Wang, and William R Hersh. 2021. Searching for scientific evidence in a pandemic: An overview of TREC-COVID. Journal of Biomedical Informatics 121 (2021), 103865.
- [47] Stephen E. Robertson, Steve Walker, Susan Jones, Micheline Hancock-Beaulieu, and Mike Gatford. 1994. Okapi at TREC-3. In Proceedings of The Third Text REtrieval Conference, TREC 1994, Gaithersburg, Maryland, USA, November 2-4, 1994 (NIST Special Publication, Vol. 500-225), Donna K. Harman (Ed.). National Institute of Standards and Technology (NIST), 109–12...
- [48] Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia. 2022. ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022, Seattle, WA, United States...
- [49] Dominykas Seputis, Yongkang Li, Karsten Langerak, and Serghei Mihailov. 2025. Rethinking the Privacy of Text Embeddings: A Reproducibility Study of "Text Embeddings Reveal (Almost) As Much As Text". In Proceedings of the Nineteenth ACM Conference on Recommender Systems, RecSys 2025, Prague, Czech Republic, September 22-26, 2025, Mária Bieliková, Pavel Kord...
- [50] Aviv Shamsian, Aviv Navon, Ethan Fetaya, and Gal Chechik. 2021. Personalized Federated Learning using Hypernetworks. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 9489–9502. http://proceeding...
- [51] Jinyan Su, Preslav Nakov, and Claire Cardie. 2025. Corpus Poisoning via Approximate Greedy Gradient Descent. In Findings of the Association for Computational Linguistics, ACL 2025, Vienna, Austria, July 27 - August 1, 2025, Wanxiang Che, Joyce Nabende, Ekaterina Shutova, and Mohammad Taher Pilehvar (Eds.). Association for Computational Linguistics, 427...
- [52] Weiwei Sun, Lingyong Yan, Xinyu Ma, Shuaiqiang Wang, Pengjie Ren, Zhumin Chen, Dawei Yin, and Zhaochun Ren. 2023. Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023, Houda Bouamor, Juan Pino...
- [53] Panuthep Tasawong, Wuttikorn Ponwitayarat, Peerat Limkonchotiwat, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, and Sarana Nutanong. 2023. Typo-Robust Representation Learning for Dense Retrieval. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Anna Rogers, Jordan Boyd-Graber, and Naoaki ...
- [54] Nandan Thakur, Nils Reimers, Andreas Rücklé, Abhishek Srivastava, and Iryna Gurevych. 2021. BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models. arXiv preprint arXiv:2104.08663 (2021).
- [55] James Thorne, Andreas Vlachos, Christos Christodoulopoulos, and Arpit Mittal. 2018. FEVER: a Large-scale Dataset for Fact Extraction and VERification. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 1 (Long Papers), Marilyn A. Walker, Heng Ji, and Amanda Stent (Eds.). A... doi:10.18653/v1/n18-1074
- [57] Johannes von Oswald, Christian Henning, João Sacramento, and Benjamin F. Grewe. 2020. Continual learning with hypernetworks. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. https://openreview.net/forum?id=SJgwNerKvB
- [59] Henning Wachsmuth, Shahbaz Syed, and Benno Stein. 2018. Retrieval of the Best Counterargument without Prior Topic Knowledge. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, Iryna Gurevych and Yusuke Miyao (Eds.). Association for Computationa...
- [60] David Wadden, Shanchuan Lin, Kyle Lo, Lucy Lu Wang, Madeleine van Zuylen, Arman Cohan, and Hannaneh Hajishirzi. 2020. Fact or Fiction: Verifying Scientific Claims. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (Eds.). As...
- [61] Lidan Wang, Jimmy Lin, and Donald Metzler. 2011. A cascade ranking model for efficient ranked retrieval. In Proceeding of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011, Beijing, China, July 25-29, 2011, Wei-Ying Ma, Jian-Yun Nie, Ricardo Baeza-Yates, Tat-Seng Chua, and W. Bruce Croft (Eds.). AC...
- [62] Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, and Furu Wei. 2023. SimLM: Pre-training with Representation Bottleneck for Dense Passage Retrieval. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, Ann...
- [63] Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, and Furu Wei. 2024. Improving Text Embeddings with Large Language Models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024, Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds...
- [64]
- [65] Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn J. Lawrie, and Luca Soldaini. 2025. FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Langu...
- [66] Orion Weller, Benjamin Van Durme, Dawn J. Lawrie, Ashwin Paranjape, Yuhao Zhang, and Jack Hessel. 2025. Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models. In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net. https://openreview.net/forum?id=odvSjn416y
- [67]
- [68] Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, and Arnold Overwijk. 2021. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net. https://openreview.net/foru...
- [69] Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, and Christopher D. Manning. 2018. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, Ellen Riloff, D...
- [70] Chris Zhang, Mengye Ren, and Raquel Urtasun. 2019. Graph HyperNetworks for Neural Architecture Search. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net. https://openreview.net/forum?id=rkgW0oA9FX
- [71] Zexuan Zhong, Ziqing Huang, Alexander Wettig, and Danqi Chen. 2023. Poisoning Retrieval Corpora by Injecting Adversarial Passages. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association for Computational Linguistic...
- [72] Shengyao Zhuang and Guido Zuccon. 2021. Dealing with Typos for BERT-based Passage Retrieval and Ranking. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.)...
- [73] Shengyao Zhuang and Guido Zuccon. 2022. CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos (SIGIR ’22). Association for Computing Machinery, New York, NY, USA, 1444–1454. doi:10.1145/3477495.3531951