EvoRec: Self Evolving Agentic Recommender Systems

Haibo Xing; Hao Deng; Jinxin Hu; Lingyu Mu; Xiaoyi Zeng; Yu Zhang

arxiv: 2606.28368 · v1 · pith:UUMNXP77new · submitted 2026-06-15 · 💻 cs.IR

Lingyu Mu, Hao Deng, Haibo Xing, Jinxin Hu, Yu Zhang, Xiaoyi Zeng

Pith reviewed 2026-06-30 10:45 UTC · model grok-4.3

classification 💻 cs.IR

keywords recommender systems · multi-agent systems · LLM agents · self-evolving optimization · recommendation methodology · industrial A/B testing

The pith

EvoRec uses a Skill Evolver to co-evolve both recommender models and the optimization methodology that drives them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Modern recommender systems depend on slow manual iteration by engineers. LLM agents can translate code but typically accumulate no lasting methodology and stay inside a fixed optimization range. EvoRec runs a dual-track loop in which Research and Code Agents update the model each round while a Skill Evolver periodically extracts reusable optimization methods from a persistent Memory of earlier trials. The result is an expanding set of structural improvements rather than repeated search inside the same bounds.

Core claim

EvoRec shows that a multi-agent system can co-evolve the recommendation model and the optimization methodology by letting the Skill Evolver distill reusable methodology from a persistent Memory of past experiments, thereby generating ideas outside any predefined range.

What carries the argument

The Skill Evolver, which periodically distills reusable methodology from the persistent Memory of past experiments to expand the space of future model updates.
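
The dual-track loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names the Research Agent, the Code Agent, the Skill Evolver, and the Memory, but gives no interfaces, so every function name, the `Trial` and `Memory` structures, the skill strings, and the `evolve_every` schedule are hypothetical. LLM calls and model training are replaced by random stand-ins.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Trial:
    round_id: int
    idea: str
    score: float


@dataclass
class Memory:
    """Persistent log of past experiments (hypothetical structure)."""
    trials: list = field(default_factory=list)

    def add(self, trial: Trial) -> None:
        self.trials.append(trial)


def research_agent(skills: list, rng: random.Random) -> str:
    # Stand-in for an LLM call: pick one idea from the current skill set.
    return rng.choice(skills)


def code_agent(idea: str, rng: random.Random) -> float:
    # Stand-in for implement + train + evaluate; returns a toy offline score.
    return rng.random()


def skill_evolver(memory: Memory, skills: list, top_k: int = 2) -> list:
    # Stand-in for distillation: derive new skills from the best past trials,
    # so the skill set can grow beyond the initial range.
    best = sorted(memory.trials, key=lambda t: t.score, reverse=True)[:top_k]
    evolved = list(skills)
    for trial in best:
        candidate = f"refine({trial.idea})"
        if candidate not in evolved:
            evolved.append(candidate)
    return evolved


def run_loop(rounds: int = 6, evolve_every: int = 3, seed: int = 0):
    rng = random.Random(seed)
    memory = Memory()
    skills = ["tune_lr", "add_feature_cross"]  # hypothetical initial range
    for r in range(1, rounds + 1):
        idea = research_agent(skills, rng)   # model track: runs every round
        score = code_agent(idea, rng)
        memory.add(Trial(r, idea, score))
        if r % evolve_every == 0:            # methodology track: periodic
            skills = skill_evolver(memory, skills)
    return memory, skills
```

Calling `run_loop()` returns the full trial log and the evolved skill list. The point of the sketch is the control flow: the model track writes to Memory every round, while the methodology track reads from it only periodically and widens what the Research Agent can propose next.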

If this is right

Offline metrics rise by up to 5.54 percent over the strongest baseline on two public benchmarks and one industrial dataset.
An online A/B test records a 1.85 percent revenue increase and a 1.02 percent CTR gain.
The optimization process moves from repeated search inside a preset space to the generation of structurally new approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same distillation loop could be tested on other automated design tasks such as neural architecture search or hyperparameter tuning pipelines.
The accumulated Memory and distilled skills might serve as a transferable asset when the same system is applied to a different recommendation domain.
One could measure whether the rate of new idea generation slows after many iterations or continues to grow with larger Memory stores.
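
The last extension, tracking whether new-idea generation slows over time, could be measured with something like the sketch below. It assumes each round's proposal can be reduced to a comparable string label, which the paper's abstract does not confirm; the function name, the `window` parameter, and the labels in the usage note are hypothetical.

```python
def new_idea_rate(ideas, initial_range, window):
    """Share of never-before-seen ideas in each consecutive window of rounds.

    An idea counts as new if it is neither in the initial range nor
    proposed in any earlier round.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seen = set(initial_range)
    rates = []
    for start in range(0, len(ideas), window):
        chunk = ideas[start:start + window]
        fresh = 0
        for idea in chunk:
            if idea not in seen:
                fresh += 1
                seen.add(idea)
        rates.append(fresh / len(chunk))
    return rates
```

For example, `new_idea_rate(["a", "b", "c", "c", "a", "d"], ["a", "b"], 2)` gives one rate per pair of rounds. A sequence that decays toward zero would suggest the loop is saturating; a flat or rising one would suggest Memory growth keeps producing new ideas.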

Load-bearing premise

The Skill Evolver can reliably turn records of past experiments into reusable optimization ideas that lie outside the initial search range.

What would settle it

Running the full EvoRec loop on a held-out dataset produces no optimization ideas outside the starting range and yields no measurable lift over a fixed-range agent baseline.

Figures

Figures reproduced from arXiv: 2606.28368 by Haibo Xing, Hao Deng, Jinxin Hu, Lingyu Mu, Xiaoyi Zeng, Yu Zhang.

**Figure 1.** The overview of EvoRec. Four collaborating agents drive dual-track self-evolution: the Research Agent and Code …

Original abstract

Optimizing modern recommender systems still relies heavily on engineers iterating by hand, which is slow and bounded by individual expertise. LLM-based agents open a path toward automating this loop, yet two issues remain. First, the agent is used only as a code translator and accumulates no methodology across iterations. Second, the optimization space is confined to a predefined range and rarely introduces structurally new ideas. To address these problems, we propose EvoRec, a multi-agent framework that co-evolves the recommendation model and the optimization methodology driving it. Four collaborating agents carry out a dual-track loop: the Research Agent and Code Agent iterate the model each round, while the Skill Evolver periodically distills reusable methodology from a persistent Memory of past experiments. Experiments on two public benchmarks and one large-scale industrial dataset show that EvoRec improves offline metrics by up to 5.54% over the strongest baseline, and an online A/B test delivers a 1.85% revenue lift and a 1.02% CTR gain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

EvoRec adds a Skill Evolver to let agents distill reusable methodology from memory, but the reported gains lack ablations that tie them to that component rather than extra iteration.

The paper's core move is a four-agent loop where Research and Code agents update the model each round while a separate Skill Evolver pulls reusable tactics out of a persistent Memory store and feeds them back. That dual-track setup directly targets the two limits stated in the abstract: agents that only translate code and an optimization space stuck inside a preset range.

What stands out is the attempt to make the methodology itself evolve instead of treating it as fixed. The online A/B numbers (1.85% revenue, 1.02% CTR) and the 5.54% offline lift on public plus industrial data are concrete enough to notice.

The soft spot is exactly what the stress test flagged: nothing shown isolates whether the Skill Evolver produces ideas outside the predefined range or whether the gains come from longer runs, better prompting, or simply having more agents. No concrete examples of distilled skills appear in the abstract, and no ablation keeps the Research/Code agents while removing the evolver. Without those, the headline claim that co-evolution drives the lift rests on the architecture description rather than direct evidence.

The work is aimed at researchers already building agentic pipelines for recommendation or automated ML. A reader who wants to see whether the Memory-to-skill step actually escapes the usual iteration trap will find the framing useful, but will still need the missing controls.

I would send it to review. The idea is clear enough and the problem is real, but the referee will have to press on the causal link between the Skill Evolver and the measured gains.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes EvoRec, a multi-agent framework for self-evolving recommender systems. Four agents execute a dual-track co-evolution loop: the Research Agent and Code Agent iterate on the recommendation model each round, while the Skill Evolver periodically distills reusable methodology from a persistent Memory of past experiments. The central claim is that this addresses limitations of prior agentic systems (code translation only, predefined optimization ranges) and yields up to 5.54% offline metric gains on two public benchmarks plus one industrial dataset, plus 1.85% revenue lift and 1.02% CTR gain in an online A/B test.

Significance. If the experimental results hold and the Skill Evolver component is shown to produce structurally novel optimization ideas (rather than longer iteration or better prompting), the work could meaningfully advance automated optimization of recommender systems by enabling methodology accumulation across experiments.

major comments (2)

[Abstract] Abstract: the headline performance claims (up to 5.54% offline improvement, 1.85% revenue lift) are presented with no information on baselines, statistical tests, data splits, controls, or variance, which is load-bearing for evaluating whether the dual-track co-evolution is responsible for the gains.
[Abstract] Abstract (and implied § on Skill Evolver): the central attribution of gains to the Skill Evolver distilling reusable methodology from Memory lacks any concrete example of a distilled skill that is structurally new, any ablation removing the Skill Evolver while retaining Research/Code agents, or quantitative evidence isolating the Memory-to-skill pathway as the causal driver; without this the claim that EvoRec exceeds prior agentic systems' predefined-range limitation cannot be assessed.
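
The ablation requested in the second comment amounts to a three-arm comparison, which can be sketched as below. This is only an illustration of the bookkeeping: the arm names are hypothetical, the numbers in the usage note are placeholders rather than results from the paper, and the sketch assumes all arms were run under the same iteration budget, which the experiment harness would have to enforce.

```python
def relative_lift(treatment: float, control: float) -> float:
    """Relative lift of treatment over control, in percent."""
    if control == 0:
        raise ValueError("control metric must be non-zero")
    return (treatment - control) / control * 100.0


def ablation_report(metrics: dict, baseline: str = "baseline") -> dict:
    """Percent lift of every non-baseline arm over the baseline arm."""
    base = metrics[baseline]
    return {
        name: round(relative_lift(value, base), 2)
        for name, value in metrics.items()
        if name != baseline
    }


def evolver_gap(report: dict) -> float:
    """Lift (percentage points) that remains only when the evolver is on."""
    return round(report["full"] - report["no_skill_evolver"], 2)
```

With placeholder metrics such as `{"baseline": 100.0, "no_skill_evolver": 103.0, "full": 105.0}`, `ablation_report` separates the lift that survives without the Skill Evolver from the lift of the full system, and `evolver_gap` reports the difference. A gap near zero would support the referee's concern that extra iteration, not methodology distillation, explains the gains.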

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the abstract and the attribution of gains to the Skill Evolver. We address each point below and indicate where revisions will strengthen the manuscript.

Referee: [Abstract] Abstract: the headline performance claims (up to 5.54% offline improvement, 1.85% revenue lift) are presented with no information on baselines, statistical tests, data splits, controls, or variance, which is load-bearing for evaluating whether the dual-track co-evolution is responsible for the gains.

Authors: The abstract is a high-level summary constrained by length. Full details on baselines (strongest prior agentic and non-agentic recommenders), statistical tests (paired t-tests with p<0.05), data splits (standard temporal splits on public benchmarks plus industrial logs), controls, and variance (reported across 5 seeds) appear in Section 4 and the online A/B test subsection. To make the claims more self-contained, we will revise the abstract to briefly reference the strongest baseline and note statistical significance of the reported gains.

revision: yes

Referee: [Abstract] Abstract (and implied § on Skill Evolver): the central attribution of gains to the Skill Evolver distilling reusable methodology from Memory lacks any concrete example of a distilled skill that is structurally new, any ablation removing the Skill Evolver while retaining Research/Code agents, or quantitative evidence isolating the Memory-to-skill pathway as the causal driver; without this the claim that EvoRec exceeds prior agentic systems' predefined-range limitation cannot be assessed.

Authors: The manuscript describes the Skill Evolver and Memory in Section 3.3 and provides qualitative examples of distilled skills in the appendix. However, the referee is correct that an explicit ablation isolating the Skill Evolver (while keeping Research/Code agents) and quantitative evidence specifically tracing gains to the Memory-to-skill pathway are not present. We will add both in revision: (1) an ablation table removing the Skill Evolver, and (2) concrete examples of structurally novel optimization ideas generated via the Memory pathway, with before/after performance deltas.

revision: yes
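
The simulated rebuttal mentions paired t-tests over 5 seeds. A minimal sketch of that check, using only the standard library, is given below. The per-seed scores in the usage note are made-up placeholders, and the function name is hypothetical; the only external fact used is the two-sided Student's t critical value for 4 degrees of freedom at the 0.05 level, 2.776.

```python
import math
from statistics import mean, stdev

# Two-sided critical value of Student's t, df = 4 (5 seeds), alpha = 0.05.
T_CRIT_DF4_TWO_SIDED_05 = 2.776


def paired_t_statistic(a, b):
    """t statistic for paired samples a[i], b[i] (same seed, two systems)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


# Placeholder per-seed scores for illustration only (not from the paper).
system_scores = [0.52, 0.53, 0.51, 0.54, 0.52]
baseline_scores = [0.50, 0.50, 0.50, 0.51, 0.50]

t_stat = paired_t_statistic(system_scores, baseline_scores)
significant = abs(t_stat) > T_CRIT_DF4_TWO_SIDED_05
```

Pairing by seed matters here: it removes seed-to-seed variance from the comparison, so a small but consistent gain can be significant even when the two systems' score ranges overlap.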

Circularity Check

0 steps flagged

No significant circularity in empirical claims

The paper describes an agentic framework and reports measured performance lifts from experiments on two public benchmarks plus one industrial dataset, with no equations, derivations, or first-principles predictions that reduce to fitted parameters or self-definitions by construction. All load-bearing claims are presented as direct experimental outcomes rather than quantities forced by internal definitions or self-citation chains. The Skill Evolver component is described procedurally but its contribution is evaluated via overall system results, not via any circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities can be extracted because only the abstract is available.

