arxiv: 2605.18771 · v1 · pith:JDZWNIMTnew · submitted 2026-04-16 · 💻 cs.IR

LWGR: Lagrangian-Constrained Personalized World Knowledge for Generative Recommendation

Pith reviewed 2026-05-21 01:26 UTC · model grok-4.3

classification 💻 cs.IR
keywords generative recommendation · large language models · personalized knowledge · Lagrangian optimization · knowledge fusion · recommendation systems · primal-dual method

The pith

LWGR uses Lagrangian constraints to selectively fuse personalized LLM world knowledge into generative recommenders while bounding performance loss.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets two problems in LLM-based generative recommendation: fixed instructions fail to reflect the many dimensions of a user's interests, and uncontrolled knowledge fusion can conflict with actual user behavior data. LWGR addresses the first by generating personalized soft instructions that extract only behavior-relevant world knowledge from the LLM, and the second by treating knowledge fusion as a formal optimization task. The optimization explicitly bounds how much recommendation quality may degrade, and a Lagrangian primal-dual solver decides which pieces of knowledge to keep. In the reported experiments, which cover multiple public datasets and one industrial dataset, LWGR outperforms eight prior methods by up to 11.23 percent and lifts revenue by 1.35 percent on a large-scale advertising platform.

Core claim

LWGR shows that framing knowledge fusion as an optimization problem subject to an explicit bound on performance degradation, solved by a Lagrangian primal-dual method, lets generative recommenders incorporate only the beneficial parts of personalized LLM world knowledge without harming the underlying behavioral signals.

What carries the argument

The Lagrangian primal-dual solver, which enforces an explicit upper limit on recommendation-performance degradation while optimizing which pieces of personalized LLM knowledge to retain.
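
For orientation, a bounded-degradation problem of this kind is commonly written as below. This is a generic textbook sketch, not the paper's own formulation: the symbols $\theta$ (model parameters), $\mathcal{L}_{\mathrm{fuse}}$ (knowledge-fusion objective), $\mathcal{L}_{\mathrm{rec}}$ (recommendation loss on behavioral data), $\theta_0$ (a behavior-only reference model), $\epsilon$ (allowed degradation), $\mu$ (dual variable), and the step sizes $\eta_\theta, \eta_\mu$ are all assumptions introduced here for illustration.

```latex
\begin{aligned}
\min_{\theta}\quad & \mathcal{L}_{\mathrm{fuse}}(\theta) \\
\text{s.t.}\quad & \mathcal{L}_{\mathrm{rec}}(\theta) - \mathcal{L}_{\mathrm{rec}}(\theta_0) \le \epsilon, \\[4pt]
\mathcal{L}(\theta,\mu) &= \mathcal{L}_{\mathrm{fuse}}(\theta)
  + \mu\,\bigl(\mathcal{L}_{\mathrm{rec}}(\theta) - \mathcal{L}_{\mathrm{rec}}(\theta_0) - \epsilon\bigr),
  \qquad \mu \ge 0, \\[4pt]
\theta_{t+1} &= \theta_t - \eta_\theta\,\nabla_\theta \mathcal{L}(\theta_t,\mu_t), \\
\mu_{t+1} &= \max\bigl(0,\; \mu_t + \eta_\mu\,(\mathcal{L}_{\mathrm{rec}}(\theta_{t+1}) - \mathcal{L}_{\mathrm{rec}}(\theta_0) - \epsilon)\bigr).
\end{aligned}
```

In this generic form the dual variable grows while the degradation constraint is violated and shrinks toward zero once it is satisfied, which is one way a solver can down-weight knowledge that hurts behavioral accuracy.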

If this is right

  • Personalized soft instructions capture multidimensional user interests more effectively than fixed manual prompts.
  • Bounded optimization prevents irrelevant or conflicting LLM knowledge from overriding behavioral signals.
  • Separate training strategies allow the same framework to work with both small and large LLMs.
  • Nearline precomputation plus lightweight online serving makes the method practical for industrial scale.
  • Improved metrics translate into measurable revenue gains on real advertising platforms.
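
The nearline-plus-online split mentioned in the list above can be pictured with a small sketch. Everything here is hypothetical and only illustrates the general pattern of precomputing expensive per-user knowledge offline and doing a cheap lookup at request time: `KnowledgeCache`, `nearline_refresh`, `online_features`, and the zero-vector fallback are assumed names and choices, not details taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional

Vector = List[float]


@dataclass
class KnowledgeCache:
    """In-memory stand-in for a key-value store filled by a nearline job."""

    store: Dict[str, Vector] = field(default_factory=dict)

    def put(self, user_id: str, embedding: Vector) -> None:
        self.store[user_id] = embedding

    def get(self, user_id: str) -> Optional[Vector]:
        return self.store.get(user_id)


def nearline_refresh(
    cache: KnowledgeCache,
    user_ids: Iterable[str],
    expensive_encoder: Callable[[str], Vector],
) -> None:
    """Run the costly (e.g. LLM-based) encoder away from the request path."""
    for uid in user_ids:
        cache.put(uid, expensive_encoder(uid))


def online_features(
    cache: KnowledgeCache, user_id: str, behavior_vec: Vector
) -> Vector:
    """Cheap request-time path: one cache lookup plus a concatenation."""
    knowledge = cache.get(user_id)
    if knowledge is None:
        # Assumed fallback: users without cached knowledge get a zero vector,
        # so the downstream model sees behavior features only.
        knowledge = [0.0] * len(behavior_vec)
    return behavior_vec + knowledge


if __name__ == "__main__":
    cache = KnowledgeCache()
    # Hypothetical encoder; a real system would call an LLM pipeline here.
    nearline_refresh(cache, ["u1"], lambda uid: [1.0, 2.0])
    print(online_features(cache, "u1", [0.5, 0.5]))
    print(online_features(cache, "u2", [0.5, 0.5]))
```

The design point is that the request path never calls the expensive encoder, so serving latency depends only on the lookup.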

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bounded-optimization idea could be reused in any LLM-augmented system where external knowledge must not override core task signals.
  • Tightening or relaxing the degradation bound offers a tunable knob for trading knowledge richness against stability that future systems could expose to practitioners.
  • If the method works across LLM sizes, it reduces the need for hand-crafted prompts in recommendation pipelines.

Load-bearing premise

The Lagrangian solver can reliably pick only helpful knowledge without creating optimization instability or needing hyperparameter choices that themselves create the reported gains.
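
To make the moving parts of this premise concrete (step sizes, dual-variable initialization, and the bound itself), here is a minimal primal-dual loop on a one-dimensional toy problem. The objective, constraint, and default hyperparameters are assumptions chosen for illustration and are unrelated to the paper's actual losses; the function name `primal_dual` is likewise hypothetical.

```python
from typing import List, Tuple


def primal_dual(
    eps: float,
    steps: int = 5000,
    lr_x: float = 0.05,
    lr_mu: float = 0.05,
    mu0: float = 0.0,
) -> Tuple[float, float, List[float]]:
    """Gradient descent on x, projected gradient ascent on mu.

    Toy problem (an assumption for illustration, not the paper's objective):
        minimize   f(x) = (x - 3)^2        # "fusion" objective, wants x = 3
        subject to g(x) = x^2 - eps <= 0   # "degradation" must stay within eps
    """
    x, mu = 0.0, mu0
    violations: List[float] = []
    for _ in range(steps):
        # Primal step: gradient of L(x, mu) = f(x) + mu * g(x) w.r.t. x.
        grad_x = 2.0 * (x - 3.0) + mu * 2.0 * x
        x -= lr_x * grad_x
        # Dual step: raise mu while the constraint is violated, lower it
        # otherwise, then project back onto mu >= 0.
        g = x * x - eps
        mu = max(0.0, mu + lr_mu * g)
        violations.append(g)
    return x, mu, violations


if __name__ == "__main__":
    for eps in (1.0, 4.0):
        x, mu, _ = primal_dual(eps)
        print(f"eps={eps}: x={x:.4f}, mu={mu:.4f}")
```

For `eps = 1.0` the constrained optimum of this toy problem is `x = 1` with multiplier `mu = 2` (from the stationarity condition `2(x - 3) + 2*mu*x = 0` at `x = 1`). The returned `violations` list gives the kind of per-iteration trace one would inspect when checking for instability or sensitivity to the step sizes.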

What would settle it

An ablation that removes the Lagrangian constraints or sets the allowed degradation bound to zero and measures whether the reported gains over baselines disappear on the same datasets.

Figures

Figures reproduced from arXiv: 2605.18771 by Haibo Xing, Hao Deng, Jinxin Hu, Kaican Lin, Lingyu Mu, Xiaoyi Zeng, Yu Zhang, Zheng Lin, Zhengxiao Liu, Zhitong Zhu.

Figure 1: Existing two-stage knowledge fusion GR based ...
Figure 2: (a) General and demographic-guided instruction ...
Figure 3: Overview of LWGR: a framework that integrates user personalized world knowledge into GR under Lagrangian ...
Figure 4: The online and nearline deployment of LWGR.
Figure 5: R/N@5 change under different codebook sizes

Original abstract

Recent progress in large language model (LLM) based generative recommendation (GR) shows that leveraging LLM world knowledge can substantially improve performance. However, existing methods rely on fixed, manually designed instructions to generate semantic knowledge and directly incorporate it into GR, which has two limitations. First, fixed instructions cannot capture the multidimensional heterogeneity of user interests. Second, uncontrollable knowledge fusion may conflict with behavioral signals and harm recommendations. To address these limitations, we propose LWGR, a framework that leverages Lagrangian constraints to transfer users' personalized world knowledge from LLMs into generative recommendation. LWGR enhances GR along two axes: knowledge extraction and fusion. It builds personalized soft instructions to extract behavior-relevant LLM world knowledge, and formulates knowledge fusion as an optimization problem with explicitly bounded performance degradation, which is solved by a Lagrangian primal-dual method to selectively incorporate beneficial knowledge. We further design two training strategies for different LLM scales and a deployment scheme that combines nearline precomputation with lightweight online serving. Experiments on multiple public datasets and one industrial dataset show that LWGR outperforms eight state-of-the-art baselines by up to 11.23% and brings a 1.35% revenue lift on a large-scale advertising platform, demonstrating its effectiveness and practicality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes LWGR, a framework for generative recommendation that extracts personalized world knowledge from LLMs via soft instructions and fuses it through a Lagrangian-constrained optimization problem solved by a primal-dual method to bound performance degradation. It claims this selectively incorporates beneficial knowledge while avoiding conflicts with behavioral signals. Experiments on public datasets and one industrial advertising dataset report outperformance over eight SOTA baselines by up to 11.23% and a 1.35% revenue lift.

Significance. If the central optimization claim holds with proper controls, the work offers a principled mechanism for safe LLM knowledge integration in recommendation, with clear industrial applicability shown via the revenue metric. The combination of personalized extraction and bounded fusion addresses a practical limitation in current GR methods.

major comments (3)
  1. §3.2 (Lagrangian formulation): the performance degradation bound is presented as explicitly controlled, but no analysis of primal-dual convergence, dual-variable initialization, or sensitivity to the bound hyperparameter is provided; this is load-bearing because the abstract attributes gains specifically to selective incorporation via the constraint.
  2. Experiments section, Table 2/3: reported lifts (e.g., 11.23%) lack error bars, statistical significance tests, or details on whether the Lagrangian bound was tuned post-hoc on the evaluation sets; without these, robustness of the outperformance claim cannot be assessed.
  3. Ablation studies: no experiment isolates the primal-dual solver from the personalized soft instructions and training strategies; if tuning of the solver itself drives the metrics, the 1.35% revenue lift cannot be attributed to the Lagrangian mechanism.
minor comments (2)
  1. Abstract: the phrase 'up to 11.23%' should specify the exact dataset and metric (e.g., HR@10 on which public dataset).
  2. Notation in §3: the definition of the soft instruction embedding and its interaction with the Lagrangian multiplier could be clarified with an explicit equation reference.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate the revisions we will make to strengthen the presentation of the Lagrangian mechanism, experimental robustness, and component contributions.

Point-by-point responses
  1. Referee: §3.2 (Lagrangian formulation): the performance degradation bound is presented as explicitly controlled, but no analysis of primal-dual convergence, dual-variable initialization, or sensitivity to the bound hyperparameter is provided; this is load-bearing because the abstract attributes gains specifically to selective incorporation via the constraint.

    Authors: We agree that additional analysis would improve rigor. In the revised manuscript we will add a new paragraph in §3.2 that (i) reports empirical convergence curves of the primal-dual iterates on the public datasets, (ii) states that dual variables are initialized to zero (standard for this class of problems) and shows the effect of alternative initializations, and (iii) presents a sensitivity plot of recommendation metrics versus the bound hyperparameter λ over a range that includes the value used in the main experiments. These additions will directly support the claim that gains arise from selective, bounded incorporation rather than unconstrained fusion.
    Revision: yes

  2. Referee: Experiments section, Table 2/3: reported lifts (e.g., 11.23%) lack error bars, statistical significance tests, or details on whether the Lagrangian bound was tuned post-hoc on the evaluation sets; without these, robustness of the outperformance claim cannot be assessed.

    Authors: We acknowledge the need for statistical reporting. We will update Tables 2 and 3 to include standard deviations computed over five independent runs with different random seeds. We will also add the results of paired statistical tests (t-test or Wilcoxon signed-rank) between LWGR and each baseline, reporting p-values. Finally, we will clarify in the experimental setup that the bound hyperparameter λ was selected on a held-out validation split that is disjoint from the test sets used for final reporting, and we will list the candidate values examined during tuning. These changes will allow readers to assess the robustness of the reported lifts.
    Revision: yes

  3. Referee: Ablation studies: no experiment isolates the primal-dual solver from the personalized soft instructions and training strategies; if tuning of the solver itself drives the metrics, the 1.35% revenue lift cannot be attributed to the Lagrangian mechanism.

    Authors: We agree that isolating the contribution of the primal-dual solver is important. In the revised ablation section we will add a controlled experiment that keeps the personalized soft-instruction extraction and training strategies fixed while replacing the Lagrangian primal-dual solver with an unconstrained baseline (direct knowledge fusion without the degradation bound). We will report the resulting performance drop on both public and industrial datasets. This will allow attribution of the observed gains, including the 1.35% revenue lift, to the constrained optimization rather than to other components or hyperparameter tuning.
    Revision: yes
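
The seed-level reporting proposed in the second response (per-seed scores plus a paired test) can be sketched with the standard library alone. The scores below are made-up placeholders used only to exercise the code; they are not results from the paper, and the helper name `paired_t_statistic` is an assumption introduced here.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Paired t statistic for matched runs (same seed, two models)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 runs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the per-seed differences
    if sd == 0.0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


if __name__ == "__main__":
    # Placeholder metric values for five seeds (NOT taken from the paper).
    model_a_scores = [0.105, 0.103, 0.106, 0.104, 0.107]
    model_b_scores = [0.100, 0.099, 0.101, 0.100, 0.102]

    t = paired_t_statistic(model_a_scores, model_b_scores)
    # Two-sided 5% critical value of Student's t with 4 degrees of freedom.
    T_CRIT_DF4 = 2.776
    print(f"mean A = {mean(model_a_scores):.4f} +/- {stdev(model_a_scores):.4f}")
    print(f"mean B = {mean(model_b_scores):.4f} +/- {stdev(model_b_scores):.4f}")
    print(f"t = {t:.2f}, significant at 5%: {abs(t) > T_CRIT_DF4}")
```

With only five seeds the test has few degrees of freedom, so a Wilcoxon signed-rank test, as the response also mentions, would be a reasonable non-parametric cross-check.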

Circularity Check

0 steps flagged

No significant circularity; derivation uses standard Lagrangian optimization independent of fitted inputs

The paper formulates knowledge fusion as a constrained optimization problem solved by a Lagrangian primal-dual method, a standard technique from optimization theory that does not reduce to the paper's own data or self-citations by construction. Performance claims rest on empirical comparisons against eight baselines across public and industrial datasets rather than any prediction that is statistically forced by the same fitted parameters used for evaluation. No self-definitional elements, uniqueness theorems imported from prior author work, or ansatz smuggling via citation are present in the abstract or described framework. The central claim remains externally falsifiable through the reported experiments and revenue lift metric.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on standard convex optimization assumptions for the Lagrangian solver and the premise that LLM world knowledge can be selectively beneficial when constrained; no new entities are postulated.

free parameters (1)
  • performance degradation bound
    Explicit upper limit on recommendation quality drop used to constrain knowledge fusion; value chosen to balance incorporation versus stability.
axioms (1)
  • (standard math) The Lagrangian primal-dual method converges to a feasible solution that respects the performance bound
    Invoked when formulating knowledge fusion as a constrained optimization problem.

pith-pipeline@v0.9.0 · 5776 in / 1252 out tokens · 22824 ms · 2026-05-21T01:26:36.124345+00:00 · methodology

