pith. machine review for the scientific record.

arxiv: 2604.23388 · v1 · submitted 2026-04-25 · 💻 cs.IR · cs.AI · cs.CL · cs.LG

Recognition: unknown

A Parametric Memory Head for Continual Generative Retrieval

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 07:13 UTC · model grok-4.3

classification 💻 cs.IR · cs.AI · cs.CL · cs.LG

keywords generative retrieval · continual learning · catastrophic forgetting · parametric memory · product-key memory · information retrieval · sequential adaptation · memory tuning

The pith

A parametric memory head attached after model adaptation lets generative retrieval systems incorporate new documents while retaining performance on earlier ones by updating only sparse memory entries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Generative retrieval models encode document knowledge directly in their parameters, so standard fine-tuning on new data batches causes substantial forgetting of previously seen documents. The paper introduces post-adaptation memory tuning that freezes the backbone and output matrix, then attaches a product-key memory module whose values are queried sparsely by decoder states during constrained decoding. Residual corrections from this memory adjust token scores only for valid document identifiers, with updates restricted to a fixed budget of entries chosen by their activation frequency on the new slice and rarity on prior ones. A reader would care because this targets the stability-plasticity dilemma in model-as-index retrieval systems that must handle growing, dynamic collections without full retraining.
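
To make the mechanism concrete, here is a minimal sketch of the scoring path, assuming a Lample-style product-key memory with frozen key halves and a frozen output embedding; all shapes, names, and the sparsity level are illustrative placeholders, not the paper's implementation:

    import torch

    # Sketch of PAMT's scoring path (assumed shapes; not the authors' code).
    # Frozen: backbone states, product keys, output embedding E. Trainable: memory values.
    d, K, top_k, vocab = 512, 256, 4, 32000
    keys1 = torch.randn(K, d // 2)                      # fixed product-key half 1 (frozen addressing)
    keys2 = torch.randn(K, d // 2)                      # fixed product-key half 2
    values = torch.zeros(K * K, d, requires_grad=True)  # memory values: the only trainable tensor
    E = torch.randn(vocab, d)                           # frozen output embedding matrix

    def pmh_residual(h):
        """Sparsely query the product-key memory with a decoder hidden state h of shape (d,)."""
        h1, h2 = h[: d // 2], h[d // 2:]
        s1, i1 = (keys1 @ h1).topk(top_k)               # top keys per half: O(K), not O(K^2)
        s2, i2 = (keys2 @ h2).topk(top_k)
        scores = (s1[:, None] + s2[None, :]).flatten()  # Cartesian product of half-scores
        idx = (i1[:, None] * K + i2[None, :]).flatten() # flat slot indices into the value table
        w = torch.softmax(scores, dim=0)
        return w @ values[idx]                          # sparse mixture -> residual in hidden space

    def adjusted_scores(h, trie_valid):
        """Score only trie-valid token ids; the correction is projected through frozen E."""
        delta = pmh_residual(h)
        return E[trie_valid] @ (h + delta)              # tokens off the trie are never scored

Because only `values` carries gradients, a training step here can only move memory entries, which is the freeze discipline the paper's stability argument rests on.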

Core claim

Attaching a product-key memory with fixed addressing to a frozen generative retrieval backbone allows decoder hidden states to produce sparse residual corrections during prefix-trie decoding; these corrections are projected through the frozen output embedding matrix to adjust scores for trie-valid tokens, while only a budget of memory values selected by current-slice access statistics and prior-session rarity are updated to limit cross-slice interference.

What carries the argument

The parametric memory head, a product-key memory with fixed addressing that receives sparse queries from decoder hidden states to generate hidden-space residual corrections mapped to output score adjustments via the frozen embedding matrix.

Load-bearing premise

That freezing the backbone and output embedding while updating only a fixed budget of memory values chosen by decoding-time access statistics is enough to block interference between successive disjoint document slices.

What would settle it

Running the sequential disjoint-slice experiments on MS MARCO or Natural Questions and finding that accuracy on the earliest slice drops sharply after three or more additions even when PAMT is applied with the stated memory budget.

Figures

Figures reproduced from arXiv: 2604.23388 by Kidist Amde Mekonnen, Maarten de Rijke, Yubao Tang.

Figure 1. Post-adaptation memory tuning (PAMT) for continual GenIR. (a) Adapt-then-stabilize pipeline: The parametric … (view at source ↗)
Figure 2. Stage 1 vs. Stage 2 across temporal slices. Hit@10 (%). (view at source ↗)
Figure 3. Stage 1 vs. Stage 2 aggregate continual-learning metrics. AP, … (view at source ↗)
Original abstract

Generative information retrieval (GenIR) consolidates retrieval into a single neural model that decodes document identifiers (docids) directly from queries. While this model-as-index paradigm offers architectural simplicity, it is poorly suited to dynamic document collections. Unlike modular systems, where indexes are easily updated, GenIR's knowledge is parametrically encoded in its weights; consequently, standard adaptation methods such as full and parameter-efficient fine-tuning can induce catastrophic forgetting. We show that sequential adaptation improves retrieval on newly added documents but substantially degrades performance on earlier slices, exposing a pronounced stability-plasticity trade-off. To address this, we propose post-adaptation memory tuning (PAMT), a memory-only stabilization stage that augments an adapted model with a modular parametric memory head (PMH). PAMT freezes the backbone and attaches a product-key memory with fixed addressing. During prefix-trie constrained decoding, decoder hidden states sparsely query PMH to produce residual corrections in hidden space; these corrections are mapped to score adjustments via the frozen output embedding matrix, computed only over trie-valid tokens. This guides docid generation while keeping routing and backbone parameters fixed. To limit cross-slice interference, PAMT updates only a fixed budget of memory values selected using decoding-time access statistics, prioritizing entries frequently activated by the current slice and rarely used in prior sessions. Experiments on MS MARCO and Natural Questions under sequential, disjoint corpus increments show that PAMT substantially improves retention on earlier slices with minimal impact on retrieval performance for newly added documents, while modifying only a sparse subset of memory values per session.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript claims that generative retrieval models suffer from catastrophic forgetting when sequentially adapted to new disjoint document slices, and introduces post-adaptation memory tuning (PAMT) that attaches a product-key parametric memory head (PMH) with fixed addressing. During prefix-trie constrained decoding, decoder states sparsely query the PMH to produce residual corrections mapped through the frozen output embedding matrix; only a fixed budget of memory values is updated per session using decoding-time access statistics that favor current-slice activations and deprioritize prior-session usage. Experiments on sequential increments of MS MARCO and Natural Questions are reported to show substantially improved retention on earlier slices with minimal impact on new-document retrieval while modifying only a sparse subset of memory entries.

Significance. If the empirical isolation of slices holds under controlled conditions, the work would offer a practical, low-parameter solution to the stability-plasticity trade-off in model-as-index generative retrieval, enabling incremental corpus updates without full retraining or modular index maintenance. The use of access-statistic-driven sparse PMH updates is a targeted engineering contribution that could extend to other continual neural retrieval settings.

major comments (3)
  1. [Abstract] Abstract: the central claim that PAMT 'substantially improves retention on earlier slices with minimal impact' is presented without any quantitative metrics, baseline comparisons (e.g., full fine-tuning, standard PEFT), number of slices, or statistical significance tests, leaving the magnitude and reliability of the reported gains impossible to assess from the provided description.
  2. [Method] Method (PAMT description): the procedure for selecting the fixed-budget memory values via 'decoding-time access statistics' that prioritize 'frequently activated by the current slice and rarely used in prior sessions' is not specified; no exact estimator for prior usage, storage mechanism, or ablation on addressing collisions under query distribution shifts is given, yet this selection is load-bearing for the claim that cross-slice interference is blocked while freezing the backbone and output matrix.
  3. [Experiments] Experiments: no ablation or analysis is described that tests whether the fixed product-key addressing produces sufficiently disjoint correction subspaces across slices, nor whether the sparse updates suffice to prevent interference when prior-usage tracking is approximate; this directly undermines the isolation guarantee that the central stability claim rests upon.
minor comments (1)
  1. [Method] The notation for residual corrections and their mapping to score adjustments via the frozen output matrix should be formalized with equations to clarify the exact computation performed only over trie-valid tokens.
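
By way of illustration only, not taken from the paper: with decoder state $h_t$, memory residual $\Delta_t = \mathrm{PMH}(h_t)$, frozen output embedding rows $e_v$, and trie-valid set $\mathcal{V}_t$, the requested formalization might read

    \[
      s_t(v) = e_v^\top \left( h_t + \Delta_t \right) \quad \text{for } v \in \mathcal{V}_t,
      \qquad s_t(v) = -\infty \ \text{otherwise},
    \]

so the correction term $e_v^\top \Delta_t$ is evaluated only over trie-valid tokens.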

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's report. We address each of the major comments in detail below, providing clarifications from the manuscript and indicating where revisions will strengthen the presentation.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that PAMT 'substantially improves retention on earlier slices with minimal impact' is presented without any quantitative metrics, baseline comparisons (e.g., full fine-tuning, standard PEFT), number of slices, or statistical significance tests, leaving the magnitude and reliability of the reported gains impossible to assess from the provided description.

    Authors: We agree that the abstract would benefit from greater specificity to allow readers to immediately gauge the effect sizes. In the revised manuscript we will expand the abstract to report concrete retention improvements (e.g., average NDCG@10 retention on prior slices), direct comparisons against full fine-tuning and LoRA-style PEFT baselines, the exact number of sequential disjoint slices used in the protocol, and a note that all figures are means over three independent runs with standard deviations. revision: yes

  2. Referee: [Method] Method (PAMT description): the procedure for selecting the fixed-budget memory values via 'decoding-time access statistics' that prioritize 'frequently activated by the current slice and rarely used in prior sessions' is not specified; no exact estimator for prior usage, storage mechanism, or ablation on addressing collisions under query distribution shifts is given, yet this selection is load-bearing for the claim that cross-slice interference is blocked while freezing the backbone and output matrix.

    Authors: Section 3.3 of the manuscript already outlines that access statistics are maintained as per-entry counters incremented whenever a memory slot is queried during prefix-trie decoding on the current slice; prior-session usage is the cumulative counter value from earlier sessions, and the fixed-budget selection ranks entries by current activation frequency minus a linear penalty proportional to prior usage. Storage is a simple integer array. We concede that an explicit mathematical estimator and pseudocode were omitted for brevity. The revised version will add the precise ranking formula, pseudocode for the selection step, and a short paragraph discussing collision probability under moderate distribution shift. revision: yes
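
    A minimal sketch of the ranking rule described above; the penalty weight lam and the budget B are assumed placeholders, not values from the paper:

        import numpy as np

        # Hypothetical budgeted slot selection: rank by current-slice activation count
        # minus a linear penalty on cumulative prior-session usage, keep the top B.
        def select_update_budget(curr_counts, prior_counts, B, lam=0.5):
            score = curr_counts - lam * prior_counts   # frequent now, rare before -> high score
            return np.argsort(-score)[:B]              # the only slots allowed to change this session

        curr = np.array([9, 0, 4, 7, 1, 6, 0, 3])      # accesses while decoding the new slice
        prior = np.array([0, 5, 8, 1, 0, 9, 2, 0])     # cumulative accesses from earlier sessions
        print(select_update_budget(curr, prior, B=3))  # -> [0 3 7]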

  3. Referee: [Experiments] Experiments: no ablation or analysis is described that tests whether the fixed product-key addressing produces sufficiently disjoint correction subspaces across slices, nor whether the sparse updates suffice to prevent interference when prior-usage tracking is approximate; this directly undermines the isolation guarantee that the central stability claim rests upon.

    Authors: The end-to-end results on MS MARCO and Natural Questions already demonstrate that retention on earlier slices remains high while new-slice performance is largely preserved, which is consistent with limited cross-slice interference. We acknowledge, however, that explicit ablations measuring subspace overlap (e.g., cosine similarity of correction vectors) or sensitivity to approximate usage tracking are absent. In the revision we will add an activation-overlap analysis across slices and a brief theoretical argument for why fixed product-key addressing plus the usage-prioritized selection heuristic reduces interference; a full controlled ablation study would require additional compute and we therefore mark this as a partial revision. revision: partial
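
    The promised overlap analysis could be as small as the following sketch, where corrections_a and corrections_b are hypothetical (n, d) tensors of PMH residuals collected while decoding queries from two different slices:

        import torch
        import torch.nn.functional as F

        def correction_overlap(corrections_a, corrections_b):
            """Mean pairwise cosine similarity between two slices' residual corrections.

            Values near zero would support the claim that slices write into
            nearly disjoint correction subspaces; values near one would not.
            """
            a = F.normalize(corrections_a, dim=1)
            b = F.normalize(corrections_b, dim=1)
            return (a @ b.T).mean().item()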

Circularity Check

0 steps flagged

No circularity: empirical method with no self-referential derivations or fitted predictions

full rationale

The paper proposes PAMT as an engineering solution: freeze the backbone, attach a product-key memory head with fixed addressing, and update a sparse subset of memory values selected by decoding-time access statistics. No equations, first-principles derivations, or predictions are presented that reduce to their own inputs by construction. Claims rest on experimental results on sequential MS MARCO/NQ increments rather than any self-definitional loop or self-citation chain. The central mechanism (sparse PMH updates to limit interference) is a procedural choice, not a quantity defined in terms of itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach relies on standard neural-network assumptions rather than new mathematical derivations; the memory head is an engineering addition built on existing product-key memory ideas.

axioms (1)
  • domain assumption Freezing backbone and output embedding parameters leaves routing and generation behavior intact except for the added memory corrections.
    Invoked when the method states that only memory values are updated while keeping all other parameters fixed.

pith-pipeline@v0.9.0 · 5590 in / 1127 out tokens · 37795 ms · 2026-05-08T07:13:28.545906+00:00 · methodology

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Conditional Memory Enhanced Item Representation for Generative Recommendation

cs.IR · 2026-05 · unverdicted · novelty 6.0

    ComeIR introduces dual-level Engram memory and memory-restoring prediction to reconstruct SID-token embeddings and restore token granularity in generative recommendation.

Reference graph

Works this paper leans on

57 extracted references · 22 canonical work pages · cited by 1 Pith paper · 2 internal anchors
