LLM-Based User Personas for Recommendations at Scale

Haoting Wang, Haokai Lu, Zheyun Feng, Jenny Huang, Yifat Amir, Gregory Hinkson, Ben Most, Zelong Zhao, Yixin Kelly Cui, Rein Zhang, Fabio Soldo, Yu Xia, Nihar Bhupalam, Minmin Chen, Konstantina Christakopoulou, Lichan Hong, Ed H. Chi

arxiv: 2606.12198 · v1 · pith:TS6D5DAJnew · submitted 2026-06-10 · 💻 cs.IR

Pith reviewed 2026-06-27 08:05 UTC · model grok-4.3

classification 💻 cs.IR

keywords large language models · user personas · recommendation systems · real-time inference · exploitation-exploration tradeoff · video recommendations · knowledge distillation · A/B testing

The pith

A framework generates real-time LLM-based natural-language user personas to improve video recommendations at scale by balancing exploitation and exploration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows how to create natural-language descriptions of user interests using LLMs in real time for large-scale video recommendations. The personas summarize what users already like while adding new topics to encourage exploration. To handle the scale, the system uses knowledge distillation, asynchronous inference, and optimized inputs. Tests including live A/B experiments indicate better viewer value. Readers should care because it makes advanced language model capabilities practical for everyday recommendation systems without heavy offline computation.
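The summary does not say which distillation objective the authors use. As a generic illustration only, the sketch below shows the classic temperature-softened distillation loss, in which a small student model is trained to match a larger teacher's output distribution. The function names and the default temperature are assumptions made for this example, not details from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures. This is a textbook objective, shown
    here as an assumption about what "distillation" could look like.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what lets a cheaper model stand in for the expensive one at serving time.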

Core claim

The paper establishes that real-time generation of LLM-based user interest personas, which combine summaries of existing interests with novel topics, can be achieved at billion-user scale through a cost-efficient architecture leveraging knowledge distillation, asynchronous inference, and semantically clustered video representations, resulting in significant improvements in viewer value as measured by offline evaluations, user studies, and live A/B tests.
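One way to picture asynchronous inference is a serving path that never waits on the LLM: it returns whatever persona is cached and, if that entry is missing or stale, schedules a refresh in the background. The sketch below is a minimal illustration under that assumption. `AsyncPersonaStore`, `generate_persona`, and the TTL value are hypothetical names and settings, and the in-memory dictionary stands in for whatever storage the production system actually uses.

```python
import asyncio
import time

DEFAULT_PERSONA = ""  # fallback when no persona has been generated yet


async def generate_persona(user_id: str) -> str:
    """Stand-in for the slow LLM call; purely illustrative."""
    await asyncio.sleep(0.01)
    return f"persona for {user_id}"


class AsyncPersonaStore:
    def __init__(self, ttl_seconds: float = 3600.0, generate=generate_persona):
        self.ttl_seconds = ttl_seconds
        self._generate = generate
        self._cache = {}     # user_id -> (persona text, time written)
        self._inflight = {}  # user_id -> running refresh task

    def lookup(self, user_id: str) -> str:
        """Serving path: returns immediately. Must be called inside an event loop."""
        entry = self._cache.get(user_id)
        stale = entry is None or time.monotonic() - entry[1] > self.ttl_seconds
        if stale and user_id not in self._inflight:
            # Schedule at most one background refresh per user.
            self._inflight[user_id] = asyncio.create_task(self._refresh(user_id))
        return entry[0] if entry else DEFAULT_PERSONA

    async def _refresh(self, user_id: str) -> None:
        try:
            persona = await self._generate(user_id)
            self._cache[user_id] = (persona, time.monotonic())
        finally:
            self._inflight.pop(user_id, None)

    async def drain(self) -> None:
        """Wait for outstanding refreshes (useful in tests and at shutdown)."""
        if self._inflight:
            await asyncio.gather(*list(self._inflight.values()))


async def demo():
    store = AsyncPersonaStore()
    first = store.lookup("u1")   # cache miss: fallback returned, refresh scheduled
    await store.drain()
    second = store.lookup("u1")  # now served from the cache
    return first, second
```

The trade-off in this design is freshness for latency: a request may see a slightly old persona, but request latency is decoupled from LLM latency.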

What carries the argument

The real-time persona generation framework that addresses the exploitation-exploration trade-off directly during serving using LLMs.

If this is right

User profiles become more semantically rich and interpretable without relying on structured IDs.
Recommendations can adapt dynamically to new interests at serving time rather than offline.
The balance between known and novel content is handled within the persona itself.
Computational costs of LLM inference are mitigated for production environments serving billions of users.
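To make the idea of handling the known/novel balance inside the persona concrete, here is a small sketch of a persona as a data structure with two parts that is rendered to plain text. The `Persona` class, its field names, the wording of the rendered sentence, and the `max_novel` knob are all assumptions for illustration; the summary does not describe the actual persona format.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    existing_interests: list                           # exploitation: what the user already watches
    novel_topics: list = field(default_factory=list)   # exploration: adjacent new topics

    def render(self, max_novel: int = 2) -> str:
        """Render to natural language; max_novel caps how much exploration is mixed in."""
        parts = ["Enjoys " + ", ".join(self.existing_interests) + "."]
        novel = self.novel_topics[:max_novel]
        if novel:
            parts.append("May also like " + ", ".join(novel) + ".")
        return " ".join(parts)
```

For example, `Persona(["home cooking", "trail running"], ["fermentation"]).render()` yields a single sentence pair that a downstream model or a human reviewer can read directly, which is the interpretability benefit the list above points to.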

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may generalize to other platforms by replacing video clusters with domain-specific item representations.
Natural language personas could enable user-facing explanations or controls over their recommendation profiles.
Combining this with traditional ranking models might reduce reliance on separate diversity mechanisms.
Long-term effects on user engagement could be measured in extended A/B tests beyond immediate viewer value.

Load-bearing premise

That integrating these generated natural-language personas into the recommendation model produces measurable gains in viewer value at production scale.

What would settle it

An A/B test at scale where the LLM persona method shows no improvement or a decrease in key viewer value metrics compared to the existing system.
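The kind of check described here can be sketched as a two-sided test on the difference in a per-user metric between control and treatment arms. The example below uses a large-sample z-test with unpooled variances. The function name, the `alpha` threshold, and the choice of test are assumptions for illustration; the summary does not state how the paper's A/B results were analysed.

```python
import math
from statistics import NormalDist, mean, variance


def ab_test(control, treatment, alpha=0.05):
    """Two-sided z-test on the difference in means (large-sample approximation)."""
    diff = mean(treatment) - mean(control)
    # Standard error of the difference, using each arm's own sample variance.
    se = math.sqrt(variance(control) / len(control)
                   + variance(treatment) / len(treatment))
    if se == 0.0:
        raise ValueError("both arms have zero variance; the test is undefined")
    z = diff / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return {"diff": diff, "z": z, "p_value": p_value,
            "significant": p_value < alpha}
```

A result with `significant` false, or with a negative `diff`, on the key viewer value metric is the outcome that would count against the claim.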

Figures

Figures reproduced from arXiv: 2606.12198 by Ben Most, Ed H. Chi, Fabio Soldo, Gregory Hinkson, Haokai Lu, Haoting Wang, Jenny Huang, Konstantina Christakopoulou, Lichan Hong, Minmin Chen, Nihar Bhupalam, Rein Zhang, Yifat Amir, Yixin Kelly Cui, Yu Xia, Zelong Zhao, Zheyun Feng.

**Figure 2.** Asynchronous online inference diagram.

**Figure 3.** Offline evaluation: user representation.

**Figure 4.** The proposed method drives viewer value.

Original abstract

Large Language Models (LLMs) offer unprecedented potential for enhancing recommendation systems through their world knowledge and reasoning capabilities. However, existing approaches often rely on structured IDs or offline processing, limiting semantic richness, real-time adaptability, and user-facing interpretability. In this paper, we introduce a novel framework that enables real-time generation of LLM-based user interest personas for a large-scale commercial video recommendation platform. Our method generates natural-language user interest personas that address the exploitation-exploration trade-off by combining the summarization of existing interests with novel topics, directly during serving. To overcome the computational challenges of online LLM inference at a billion-user scale, we design a cost-efficient architecture leveraging knowledge distillation, asynchronous inference, and input optimization via semantically clustered video representations. Extensive offline evaluations, user studies, and live A/B tests demonstrate significant improvements in viewer value. This work bridges the gap between high-level semantic understanding and industrial-scale recommendation, paving the way for more dynamic, explainable, and satisfying personalized experiences.
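The input optimization mentioned in the abstract can be pictured as replacing a long list of watched videos with a handful of cluster-level topic labels before prompting the model. The sketch below assumes a precomputed mapping from video ID to a human-readable cluster label. The function name, the mapping, and the prompt wording are hypothetical and are not taken from the paper.

```python
from collections import Counter


def build_persona_prompt(watch_history, video_to_cluster, top_k=3):
    """Collapse a watch history into its top_k topic clusters and build a short prompt.

    watch_history: list of video IDs, most recent last.
    video_to_cluster: dict mapping video ID -> cluster label (assumed precomputed).
    """
    counts = Counter(
        video_to_cluster[v] for v in watch_history if v in video_to_cluster
    )
    lines = [f"- {label} ({n} watches)" for label, n in counts.most_common(top_k)]
    header = "Recent viewing, grouped by topic cluster:"
    footer = "Summarize this viewer's interests and suggest one adjacent new topic."
    return "\n".join([header, *lines, footer])
```

Because the prompt length depends on `top_k` rather than on the length of the history, the token cost per request stays roughly constant as a user's history grows.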

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows a practical production system for real-time LLM user personas at scale via distillation and clustering, with multi-stage evidence including live A/B tests.

The main thing to know is that the authors have implemented and tested a system for real-time generation of natural-language user personas using LLMs in a billion-user video recommendation platform. They address scale with knowledge distillation, async inference, and semantic clustering, and report improvements from offline tests, user studies, and live A/B experiments.

The new part is the specific combination of real-time persona creation that mixes known and novel interests, made efficient enough for serving. The paper does well in describing the full architecture and how the persona text gets incorporated into the ranker. Having multiple layers of evaluation, especially the online A/B tests, gives the claims more weight than papers that rely only on offline metrics.

Soft spots are minor. While the full text apparently includes the necessary details on metrics and baselines per the stress test, the magnitude of gains isn't obvious from the abstract alone, and it would be useful to see more on whether the clustering or the LLM component drives most of the benefit. The work is tied to video content, which may limit direct applicability elsewhere.

This is useful for people building large-scale recsys who are considering LLMs. A practitioner reader would find the implementation choices and trade-offs valuable. It shows honest engagement with the practical constraints.

I would bring this to a reading group to discuss the scaling techniques. I wouldn't cite it soon in my own papers, but the lessons on efficiency could be relevant. It should go to peer review as the evidence is from real deployments and the argument is internally consistent.

Referee Report

0 major / 1 minor

Summary. The paper introduces a framework for real-time generation of natural-language user interest personas using LLMs for a large-scale video recommendation platform. The personas combine summarization of existing interests with novel topics to balance exploitation and exploration. To enable this at billion-user scale, the approach employs knowledge distillation, asynchronous inference, and semantically clustered video representations. The framework is evaluated through offline evaluations, user studies, and live A/B tests, which demonstrate significant improvements in viewer value.

Significance. If the empirical results hold, this work provides a practical bridge between LLM semantic capabilities and industrial-scale recommendation systems. It offers a cost-efficient architecture for real-time persona generation and integration, potentially improving personalization, interpretability, and user satisfaction. The multi-faceted evaluation (offline, user study, A/B) strengthens the claims of practical impact.

minor comments (1)

[Abstract] The claim of 'significant improvements in viewer value' is not accompanied by any quantitative metrics, baselines, or statistical details, even though the full manuscript supplies them; adding one sentence with key results would improve standalone readability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary and recommendation of minor revision. The assessment that the work provides a practical bridge between LLM capabilities and large-scale recommendations, supported by multi-faceted evaluation, is appreciated. No specific major comments were listed in the report.

Circularity Check

0 steps flagged

No significant circularity

The paper presents an empirical framework for real-time LLM-generated natural-language user personas in a large-scale video recommendation system. It relies on architectural choices (distillation, async inference, clustered representations) and demonstrates gains via offline evaluations, user studies, and live A/B tests. No equations, derivations, parameter fittings, or self-referential definitions appear in the described method; the central claims rest on external empirical measurements rather than reducing to inputs by construction. No load-bearing self-citations or uniqueness theorems are invoked that collapse the argument.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the framework implicitly assumes LLMs can produce useful personas without detailing any fitted values or background assumptions.

pith-pipeline@v0.9.1-grok · 5756 in / 1053 out tokens · 20392 ms · 2026-06-27T08:05:50.585172+00:00 · methodology

