Mirroring Users: Towards Building Preference-aligned User Simulator with User Feedback in Recommendation
Pith reviewed 2026-05-18 21:24 UTC · model grok-4.3
The pith
A two-phase framework generates rationales from user feedback and distills informative samples to fine-tune LLMs as preference-aligned simulators for recommender systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework constructs high-quality simulation data in two phases: LLMs generate decision-making processes as explanatory rationales on simulation samples to reduce ambiguity, after which data distillation based on uncertainty estimation and behavior sampling filters the most informative and denoised samples. Fine-tuning lightweight LLMs on this dataset together with the corresponding decision-making processes significantly boosts alignment with human preferences and the in-domain reasoning capabilities of the simulators, yielding more insightful and interpretable signals for recommender system interaction.
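One way to picture the two phases is as a small pipeline. The Python sketch below is illustrative only: `Sample`, `build_dataset`, `generate_rationale`, and `uncertainty` are hypothetical names, and ranking by raw uncertainty is an assumed stand-in for the paper's distillation criterion, which this page does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sample:
    """One simulation sample: user context, candidate items, observed choice."""
    context: str
    candidates: List[str]
    chosen: str
    rationale: Optional[str] = None


def build_dataset(
    samples: List[Sample],
    generate_rationale: Callable[[Sample], str],  # hypothetical stand-in for an LLM call
    uncertainty: Callable[[Sample], float],       # hypothetical uncertainty estimator
    keep: int,
) -> List[Sample]:
    # Phase 1: attach an explanatory rationale to every sample.
    for s in samples:
        s.rationale = generate_rationale(s)
    # Phase 2: rank by uncertainty and keep the `keep` highest-scoring samples.
    # This ranking rule is an assumption for illustration, not the paper's criterion.
    ranked = sorted(samples, key=uncertainty, reverse=True)
    return ranked[:keep]
```

Passing the two callables in keeps the sketch independent of any particular LLM client or scoring method.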
What carries the argument
The data construction framework that uses LLM-generated explanatory rationales followed by uncertainty-based distillation to turn raw user feedback into high-quality training data for user simulators.
If this is right
- Fine-tuned simulators exhibit significantly improved alignment with human preferences.
- The simulators gain stronger in-domain reasoning capabilities.
- They deliver more insightful and interpretable signals for recommender system interactions.
- The framework efficiently manages ambiguity, noise, and volume in user feedback data.
Where Pith is reading between the lines
- The produced rationales could be surfaced directly to end users to explain why certain items are recommended.
- Similar rationale-plus-filtering pipelines might improve user simulation in other interactive systems such as conversational agents.
- Live deployment experiments could measure whether the better-aligned simulators lead to higher user satisfaction in actual recommender platforms.
Load-bearing premise
LLM-generated rationales and uncertainty-based filtering can reduce ambiguity and noise in user feedback without introducing new biases or losing key preference information.
What would settle it
Evaluate the fine-tuned simulators on a held-out test set of real user interactions, asking them to predict both choices and rationales. Higher agreement with actual human selections and rationales than baselines trained without the rationale or filtering steps would support the claim.
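The choice-prediction half of that test reduces to a plain agreement rate. The following is a minimal sketch, assuming each simulator variant emits one predicted item per held-out interaction; the function names and variant labels are hypothetical, and rationale agreement would need a separate, likely human-judged, measure.

```python
from typing import Dict, List


def agreement(predicted: List[str], actual: List[str]) -> float:
    """Fraction of held-out interactions where the simulator picks the user's item."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one interaction")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def compare(variants: Dict[str, List[str]], actual: List[str]) -> Dict[str, float]:
    """Score every simulator variant against the same human choices."""
    return {name: agreement(preds, actual) for name, preds in variants.items()}
```

Scoring all variants against the same held-out choices makes the ablations (with and without rationales, with and without filtering) directly comparable.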
Original abstract
User simulation is increasingly vital to develop and evaluate recommender systems (RSs). While Large Language Models (LLMs) offer promising avenues to simulate user behavior, they often struggle with the absence of specific task alignment required for RSs and the efficiency demands of large-scale simulation. A vast yet underutilized resource for enhancing this alignment is the extensive user feedback inherent in RSs, but leveraging it is challenging due to its ambiguity, noise and massive volume, which hinders efficient preference alignment. To overcome these hurdles, we introduce a novel data construction framework that leverages user feedback in RSs with advanced LLM capabilities to generate high-quality simulation data. Our framework unfolds in two key phases: (1) using LLMs to generate decision-making processes as explanatory rationales on simulation samples, thereby reducing ambiguity; and (2) data distillation based on uncertainty estimation and behavior sampling to efficiently filter the most informative, denoised samples. Accordingly, we fine-tune lightweight LLMs, as user simulators, using such high-quality dataset with corresponding decision-making processes. Extensive experiments confirm that our framework significantly boosts the alignment with human preferences and the in-domain reasoning capabilities of the fine-tuned LLMs, providing more insightful and interpretable signals for RS interaction. We believe our work, together with publicly available developed framework, high-quality mixed-domain dataset, and fine-tuned LLM checkpoints, will advance the RS community and offer valuable insights for broader human-centric AI research. Our code is available at https://github.com/Joinn99/UserMirrorer.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a two-phase data construction framework for building preference-aligned user simulators in recommender systems. Phase 1 prompts LLMs to generate explanatory rationales for simulation samples drawn from user feedback to reduce ambiguity and noise. Phase 2 applies uncertainty estimation and behavior sampling to distill informative samples. Lightweight LLMs are then fine-tuned on the resulting dataset (with rationales) to serve as simulators. The abstract states that extensive experiments confirm significant improvements in human preference alignment and in-domain reasoning capabilities, with public release of code, dataset, and checkpoints.
Significance. If the generated rationales faithfully recover latent user preferences rather than LLM priors, the framework could offer a practical method for leveraging large-scale, noisy RS feedback to create more aligned and interpretable simulators. The public artifacts strengthen potential impact for the RS and human-centric AI communities.
major comments (2)
- [Abstract and §3] (framework description): The central claim that LLM-generated rationales reduce ambiguity and improve alignment rests on the unverified assumption that these rationales surface actual user decision factors. No human validation, inter-annotator agreement, or rationale-only vs. feedback-only ablation is described, leaving open the risk that rationales inject model priors instead of recovering user preferences.
- [§4] (experiments): The assertion of 'significant boosts' in alignment and reasoning is presented without reported quantitative results, specific metrics, baseline comparisons, or statistical tests in the provided abstract and summary. This makes independent verification of the load-bearing experimental support impossible from the manuscript details given.
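For the statistical-test point, one common option when two simulators are scored on the same test items is a paired bootstrap. The sketch below is an assumption about what such a test could look like, not the procedure the paper uses; `paired_bootstrap_ci` and its parameters are hypothetical.

```python
import random
from typing import List, Tuple


def paired_bootstrap_ci(
    correct_a: List[int],
    correct_b: List[int],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for mean(correct_a) - mean(correct_b) on paired items.

    Each list holds 1 (correct) or 0 (incorrect) for the same test items.
    """
    if len(correct_a) != len(correct_b) or not correct_a:
        raise ValueError("inputs must be non-empty and of equal length")
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_resamples):
        # Resample test items with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(correct_a[i] - correct_b[i] for i in idx) / n)
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_resamples)]
    hi = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

An interval that excludes zero would indicate that the accuracy gap is unlikely to be a resampling artifact.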
minor comments (2)
- [§3.1] Clarify the exact prompting strategy and temperature settings used for rationale generation in Phase 1 to allow reproducibility.
- [§3.2] The uncertainty estimation method in Phase 2 should specify the exact formulation (e.g., entropy over what distribution) and any thresholds applied.
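To make the second minor comment concrete, here is one possible formulation: Shannon entropy over a simulator's predicted distribution across candidate items, with a band filter. Both the choice of distribution and the `low`/`high` thresholds are assumptions for illustration; the paper's actual definition is exactly what the referee asks the authors to state.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats of a probability distribution over candidate items."""
    if abs(sum(probs) - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_by_entropy(dists: List[Sequence[float]], low: float, high: float) -> List[int]:
    """Indices of samples whose entropy lies in [low, high].

    The band is hypothetical: the lower bound drops trivially easy samples,
    the upper bound drops samples that look like noise.
    """
    return [i for i, d in enumerate(dists) if low <= entropy(d) <= high]
```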
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our work. We address each major point below and describe the changes we will make in the revised manuscript.
- Referee: [Abstract and §3] (framework description): the central claim that LLM-generated rationales reduce ambiguity and improve alignment rests on the unverified assumption that these rationales surface actual user decision factors. No human validation, inter-annotator agreement, or rationale-only vs. feedback-only ablation is described, leaving open the risk that rationales inject model priors instead of recovering user preferences.
  Authors: We acknowledge the validity of this concern. The framework description in §3 motivates rationale generation as a means to reduce ambiguity in user feedback, but we agree that the claim would be strengthened by direct evidence that the rationales recover user preferences rather than LLM priors. In the revision we will add a human evaluation study in which multiple annotators rate the fidelity of generated rationales to the original feedback, report inter-annotator agreement, and include an ablation that compares simulator performance when trained on rationale-augmented data versus raw feedback only.
  Revision: yes
- Referee: [§4] (experiments): the assertion of 'significant boosts' in alignment and reasoning is presented without reported quantitative results, specific metrics, baseline comparisons, or statistical tests in the provided abstract and summary. This makes independent verification of the load-bearing experimental support impossible from the manuscript details given.
  Authors: We apologize that the excerpt supplied to the referee did not surface the quantitative details already present in §4. The full experimental section reports concrete metrics for preference alignment and reasoning quality, direct comparisons against several baselines, and statistical significance testing. To improve accessibility we will revise the abstract to include the key numerical results and add explicit pointers from the abstract to the corresponding tables and statistical analyses in §4.
  Revision: yes
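The inter-annotator agreement promised in the first response could be reported with Cohen's kappa when two annotators rate the same rationales. This is a minimal sketch under that two-rater assumption (for more than two raters, Fleiss' kappa is a common alternative); the rebuttal does not say which statistic the authors intend to use.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters pick the same label independently.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one label (for example "faithful") dominates.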
Circularity Check
No significant circularity in derivation chain
The paper describes a two-phase empirical framework that ingests external user feedback from RSs, prompts LLMs to produce explanatory rationales, applies uncertainty-based filtering, and fine-tunes lightweight LLMs as simulators. All load-bearing steps rely on observable external data and standard LLM capabilities rather than self-definitional loops, fitted parameters renamed as predictions, or self-citation chains that substitute for independent justification. Claims of improved human-preference alignment are presented as experimental outcomes, not as mathematical identities derived from the inputs themselves. The approach is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: "Our framework unfolds in two key phases: (1) using LLMs to generate decision-making processes as explanatory rationales on simulation samples... (2) data distillation based on uncertainty estimation and behavior sampling"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Through Their Eyes: Fixation-aligned Tuning for Personalized User Emulation
  Personalized soft prompts steer VLM attention to match user-specific gaze patterns, yielding better attention alignment and click prediction in recommendation simulations.
Unspecified settings follow the defaults of the torchtune and verl frameworks. We consider two training setups:
- Single-Stage SFT: For models without the decision-making process, we apply supervised fine-tuning only, treating the task as single-token classification.
- Two-Stage Fine-Tuning: For models incorporating decision-making, we perform a standard ...
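"Single-token classification" can be read as follows: each candidate item is mapped to one option token, and the model is trained with cross-entropy over the logits of those tokens only. The sketch below illustrates that reading in plain Python; the option labels and function names are hypothetical, and the excerpt does not confirm this exact formulation.

```python
import math
from typing import Dict


def choice_distribution(token_logits: Dict[str, float]) -> Dict[str, float]:
    """Softmax restricted to the logits of the candidate option tokens."""
    m = max(token_logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in token_logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}


def cross_entropy(token_logits: Dict[str, float], target: str) -> float:
    """Negative log-likelihood of the option the user actually chose."""
    return -math.log(choice_distribution(token_logits)[target])
```

Restricting the softmax to option tokens turns next-token prediction into a classification over the candidate list, which is cheap to evaluate at simulation scale.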