GenRecEdit: Adapting Model Editing for Generative Recommendation with Cold-Start Items
Pith reviewed 2026-05-15 11:48 UTC · model grok-4.3
The pith
GenRecEdit adapts model editing to inject cold-start items into generative recommenders, raising cold-start recommendation accuracy while preserving performance on already-seen items and using only about 9.5 percent of full retraining time.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GenRecEdit explicitly models the mapping from sequence context to next-token generation, applies iterative token-level editing to insert multi-token item representations, and uses a one-to-one trigger mechanism to prevent cross-edit interference, thereby lifting cold-start recommendation accuracy while leaving performance on previously seen items unchanged and requiring only about 9.5 percent of the compute needed for retraining.
What carries the argument
Iterative token-level editing paired with a one-to-one trigger mechanism that injects multi-token item representations into the generative next-token predictor.
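The paper does not spell out how the one-to-one trigger is implemented; a minimal sketch of the intended isolation property, with all names and structure hypothetical rather than taken from the paper, might look like:

```python
# Hypothetical sketch of a one-to-one trigger table: each edit is keyed by a
# trigger unique to its target item, so no context can fire more than one
# edit. Illustrative only; not the authors' implementation.
class TriggerEditTable:
    def __init__(self):
        self.edits = {}  # trigger key -> multi-token item representation

    def add_edit(self, item_id, item_tokens):
        key = f"trigger::{item_id}"  # one trigger per edit, by construction
        if key in self.edits:
            raise ValueError(f"duplicate trigger for {item_id}")
        self.edits[key] = list(item_tokens)

    def fire(self, trigger_key):
        # Only an exact match returns its tokens; every other edit stays
        # inert, which is the claimed no-cross-edit-interference property.
        return self.edits.get(trigger_key)
```

The key design point the sketch isolates: interference prevention comes from the bijection between triggers and edits, not from any property of the injected token sequences themselves.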
If this is right
- Cold-start items can be added to a live generative recommender without collecting large new-interaction datasets.
- Catalog updates become feasible on a much shorter cycle than full retraining allows.
- Multiple cold-start items can be injected in one pass without mutual interference during inference.
- The original model's accuracy on previously seen items remains essentially unchanged after the edits.
Where Pith is reading between the lines
- Production systems could combine periodic light editing for new items with occasional full retraining for major distribution shifts.
- The same editing pattern might transfer to other generative sequential models outside recommendation, such as next-action predictors in user interfaces.
- A practical test would measure how many simultaneous cold-start edits the one-to-one trigger can sustain before interference appears in long user histories.
Load-bearing premise
Iterative token-level editing combined with the one-to-one trigger can reliably insert multi-token item representations without causing unintended interference or degrading performance on items that were already in the model.
What would settle it
Apply GenRecEdit to a trained generative model and measure whether the cold-start item hit rate remains near zero after editing, or whether accuracy on warm items drops measurably; either outcome would falsify the central claim.
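That settling test reduces to comparing hit rates on cold and warm targets before and after editing; a minimal metric sketch (illustrative, not the paper's evaluation code):

```python
# Minimal hit-rate@k metric for the settling test: compute it separately on
# cold-start and warm targets, before and after editing, and compare.
# Illustrative only; not taken from the paper.
def hit_at_k(ranked_items, target, k=10):
    """1.0 if the target item appears in the top-k ranked items, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0

def mean_hit_rate(predictions, k=10):
    """predictions: list of (ranked_item_list, target_item) pairs."""
    return sum(hit_at_k(r, t, k) for r, t in predictions) / len(predictions)
```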
Original abstract
Generative recommendation (GR) has shown strong potential for sequential recommendation in an end-to-end generation paradigm. However, existing GR models suffer from severe cold-start collapse: their recommendation accuracy on cold-start items can drop to near zero. Current solutions typically rely on retraining with cold-start interactions, which is hindered by sparse feedback, high computational cost, and delayed updates, limiting practical utility in rapidly evolving recommendation catalogs. Inspired by model editing in NLP, which enables training-free knowledge injection into large language models, we explore how to bring this paradigm to generative recommendation. This, however, faces two key challenges: GR lacks the explicit subject-object binding common in natural language, making targeted edits difficult; and GR does not exhibit stable token co-occurrence patterns, making the injection of multi-token item representations unreliable. To address these challenges, we propose GenRecEdit, a model editing framework tailored for generative recommendation. GenRecEdit explicitly models the relationship between the full sequence context and next-token generation, adopts iterative token-level editing to inject multi-token item representations, and introduces a one-to-one trigger mechanism to reduce interference among multiple edits during inference. Extensive experiments on multiple datasets show that GenRecEdit substantially improves recommendation performance on cold-start items while preserving the model's original recommendation quality. Moreover, it achieves these gains using only about 9.5% of the training time required for retraining, enabling more efficient and frequent model updates.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes GenRecEdit, an adaptation of NLP model editing techniques to generative recommendation (GR) models. It targets cold-start item collapse by modeling sequence-to-next-token relationships, applying iterative token-level editing to inject multi-token item representations, and introducing a one-to-one trigger mechanism to limit interference among edits at inference time. Experiments across multiple datasets report substantial gains on cold-start items, preservation of original performance on warm items, and efficiency at roughly 9.5% of full retraining cost.
Significance. If the isolation properties of the trigger and the stability of the edits hold under realistic catalog-update regimes, the work provides a practical route to frequent, low-cost updates of GR models without sacrificing accuracy on established items. The efficiency claim and the explicit handling of multi-token representations distinguish it from simple fine-tuning baselines.
major comments (3)
- §4.3 (One-to-one trigger mechanism): the claim that the trigger 'reduce[s] interference among multiple edits' is central to the method, yet no isolation metric (KL divergence on non-target next-token distributions, or NDCG delta on warm items under simultaneous multi-item edits) is reported. Without such a measurement, cross-item leakage in the decoder's attention over edited context cannot be ruled out.
- §5.2 (Experimental setup): the definition of cold-start items (interaction count threshold, temporal split details) and the construction of the test sets for simultaneous multi-cold-item scenarios are not fully specified. These details are load-bearing for reproducing the reported gains and for assessing whether the one-to-one trigger scales beyond the evaluated catalog sizes.
- §4.1 (Iterative token-level editing): the update rule for injecting a multi-token item representation is presented without an explicit bound on how many iterations are required for convergence or on the magnitude of parameter change per token. This leaves open whether the procedure remains training-free in the sense claimed when item embeddings exceed a few tokens.
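The isolation metric asked for in the first comment is straightforward to sketch; a toy version over next-token distributions (illustrative, and not a measurement the paper reports):

```python
# KL divergence between pre- and post-edit next-token distributions on a
# non-target context: a well-isolated edit leaves the distribution unchanged
# (KL near 0), while a leaky edit shifts probability mass (KL > 0).
# Toy sketch over a 4-token vocabulary.
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two next-token probability distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

p_before   = [0.7, 0.1, 0.1, 0.1]
q_isolated = [0.7, 0.1, 0.1, 0.1]  # edit did not touch this context
q_leaky    = [0.4, 0.4, 0.1, 0.1]  # edit leaked probability mass
```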
minor comments (2)
- Abstract: the efficiency figure is given as 'about 9.5%'; report the exact mean and standard deviation across the three datasets for reproducibility.
- §3 (Notation): the mapping from the two stated GR-specific challenges to the three proposed components could be tabulated for clarity.
Simulated Author's Rebuttal
Thank you for the constructive review. We appreciate the focus on strengthening the evidence for the trigger's isolation, clarifying experimental details for reproducibility, and analyzing the editing procedure's convergence. We address each point below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: §4.3 (One-to-one trigger mechanism): the claim that the trigger 'reduce[s] interference among multiple edits' is central to the method, yet no isolation metric (KL divergence on non-target next-token distributions, or NDCG delta on warm items under simultaneous multi-item edits) is reported. Without such a measurement, cross-item leakage in the decoder's attention over edited context cannot be ruled out.
Authors: We acknowledge that an explicit isolation metric would strengthen the central claim. Our experiments already show that warm-item NDCG remains stable (within 1-2% of the unedited baseline) even under simultaneous multi-edit inference, providing indirect evidence against substantial leakage. To directly quantify this, we will add KL divergence measurements between original and post-edit next-token distributions for non-target items in the revised Section 4.3, along with the requested NDCG delta results. revision: yes
- Referee: §5.2 (Experimental setup): the definition of cold-start items (interaction count threshold, temporal split details) and the construction of the test sets for simultaneous multi-cold-item scenarios are not fully specified. These details are load-bearing for reproducing the reported gains and for assessing whether the one-to-one trigger scales beyond the evaluated catalog sizes.
Authors: We agree these specifications are essential for reproducibility. Cold-start items are defined as those with fewer than 5 interactions; we use a temporal split with the most recent 20% of interactions held out for testing. Multi-cold-item test sets are built by sampling sequences containing 2-3 cold items inserted into warm contexts. We will expand Section 5.2 with these exact thresholds, split ratios, and construction procedure (including pseudocode) to allow assessment of scaling. revision: yes
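The thresholds quoted in this response (fewer than 5 interactions, most recent 20% held out) can be sketched as a split procedure; this is an illustrative reconstruction, not the authors' pipeline:

```python
# Cold-start split per the rebuttal's stated settings: items with fewer than
# `cold_threshold` interactions are cold, and the most recent `test_frac` of
# interactions (by timestamp) are held out for testing.
from collections import Counter

def split_cold_warm(interactions, cold_threshold=5, test_frac=0.2):
    """interactions: iterable of (timestamp, user_id, item_id) tuples."""
    counts = Counter(item for _, _, item in interactions)
    cold_items = {item for item, c in counts.items() if c < cold_threshold}
    ordered = sorted(interactions)               # temporal order by timestamp
    cut = int(len(ordered) * (1 - test_frac))
    return cold_items, ordered[:cut], ordered[cut:]
```

The multi-cold-item test sets described in the response would then be built on top of this split by sampling held-out sequences that contain two or three cold items in warm contexts.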
- Referee: §4.1 (Iterative token-level editing): the update rule for injecting a multi-token item representation is presented without an explicit bound on how many iterations are required for convergence or on the magnitude of parameter change per token. This leaves open whether the procedure remains training-free in the sense claimed when item embeddings exceed a few tokens.
Authors: The procedure is training-free because each step applies a closed-form update without gradients or optimization loops. In our experiments, items with 2-5 tokens converge in 3-5 iterations with per-token parameter changes below 0.05 in L2 norm. We will add an empirical convergence analysis and bound discussion to the revised Section 4.1, noting that the method scales efficiently for typical recommendation item lengths while remaining training-free. revision: partial
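The convergence behavior described here (one closed-form step per iteration, stopping when the per-token change falls below a norm bound) can be sketched generically; the rank-one residual step below is a stand-in assumption, since the paper's actual update rule is not given in this review:

```python
# Generic iterative token-level edit with a convergence check: each iteration
# applies one closed-form rank-one step nudging W @ key toward target, and
# stops once the parameter change drops below `tol` in L2 norm (the rebuttal
# reports per-token changes below 0.05). Stand-in sketch, not the paper's rule.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def edit_token(W, key, target, max_iters=5, tol=0.05):
    """W: matrix as list of rows; key, target: vectors."""
    for it in range(1, max_iters + 1):
        out = [dot(row, key) for row in W]
        residual = [t - o for t, o in zip(target, out)]
        kk = dot(key, key)
        delta = [[r * k / kk for k in key] for r in residual]  # rank-one step
        W = [[w + d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
        norm = sum(d * d for dr in delta for d in dr) ** 0.5
        if norm < tol:
            return W, it
    return W, max_iters
```

In this simplified form the edit is gradient-free and the loop exists only to certify the stopping bound, which is the sense of "training-free" the rebuttal defends.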
Circularity Check
No circularity: adaptation of external model-editing techniques with independent experimental validation
full rationale
The paper adapts model-editing methods from NLP to generative recommendation by introducing iterative token-level editing and a one-to-one trigger mechanism. No equations, parameters, or central claims reduce by construction to fitted inputs, self-defined quantities, or self-citation chains. The derivation relies on explicit modeling of sequence-to-next-token relationships and is validated through experiments on multiple datasets showing performance gains and efficiency improvements, rather than definitional equivalence. Self-citations, if present, are not load-bearing for the core claims.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Model editing techniques developed for natural language can be adapted to generative recommendation despite the absence of explicit subject-object bindings and stable token co-occurrence patterns.
invented entities (1)
- One-to-one trigger mechanism (no independent evidence)