End-to-end Training for Recommendation with Language-based User Profiles

Joyce Zhou; Thorsten Joachims; Yijia Dai; Zhaolin Gao

arxiv: 2410.18870 · v2 · pith:BVYDQQPEnew · submitted 2024-10-24 · 💻 cs.IR · cs.LG

Zhaolin Gao, Joyce Zhou, Yijia Dai, Thorsten Joachims

keywords: profiles, training, langptune, user, zero-shot, recommendation, compared, embedding-based

There is growing interest in natural language-based user profiles for recommender systems, which aim to enhance transparency and scrutability compared with embedding-based methods. Existing studies primarily generate these profiles using zero-shot inference from large language models (LLMs), but their quality remains insufficient, leading to suboptimal recommendation performance. In this paper, we introduce LangPTune, the first end-to-end training framework to optimize LLM-generated user profiles. Our method significantly outperforms zero-shot approaches by explicitly training the LLM for the recommendation objective. Through extensive evaluations across diverse training configurations and benchmarks, we demonstrate that LangPTune not only surpasses zero-shot baselines but can also match the performance of state-of-the-art embedding-based methods. Finally, we investigate whether the training procedure preserves the interpretability of these profiles compared to zero-shot inference through both GPT-4 simulations and crowdworker user studies. Implementation of LangPTune can be found at https://github.com/ZhaolinGao/LangPTune.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

POPI: Personalizing LLMs via Optimized Natural Language Preference Inference
cs.CL 2025-10 unverdicted novelty 5.0

POPI distills user preferences into reusable natural-language summaries via a shared inference model and conditions a generator on them, trained jointly with RL to improve personalization quality while cutting context...