KuaiLive is the first publicly released real-time interactive dataset for live streaming recommendation, with logs from 23,772 users and 452,621 streamers over 21 days plus timestamps, multi-type interactions, and side features.
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
Canonical reference. 75% of citing Pith papers cite this work as background.
abstract
Recently, generative retrieval-based recommendation systems have emerged as a promising paradigm. However, most modern recommender systems adopt a retrieve-and-rank strategy, where the generative model functions only as a selector during the retrieval stage. In this paper, we propose OneRec, which replaces the cascaded learning framework with a unified generative model. To the best of our knowledge, this is the first end-to-end generative model that significantly surpasses current complex and well-designed recommender systems in real-world scenarios. Specifically, OneRec includes: 1) an encoder-decoder structure, which encodes the user's historical behavior sequences and gradually decodes the videos that the user may be interested in. We adopt sparse Mixture-of-Experts (MoE) to scale model capacity without proportionally increasing computational FLOPs. 2) a session-wise generation approach. In contrast to traditional next-item prediction, we propose a session-wise generation, which is more elegant and contextually coherent than point-by-point generation that relies on hand-crafted rules to properly combine the generated results. 3) an Iterative Preference Alignment module combined with Direct Preference Optimization (DPO) to enhance the quality of the generated results. Unlike DPO in NLP, a recommendation system typically has only one opportunity to display results for each user's browsing request, making it impossible to obtain positive and negative samples simultaneously. To address this limitation, we design a reward model to simulate user generation and customize the sampling strategy. Extensive experiments have demonstrated that a limited number of DPO samples can align user interest preferences and significantly improve the quality of generated results. We deployed OneRec in the main scene of Kuaishou, achieving a 1.6% increase in watch-time, which is a substantial improvement.
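The preference-alignment step in the abstract can be illustrated with a minimal sketch. It assumes the standard DPO objective and a simple best-versus-worst pairing of candidate sessions scored by a reward model. The abstract does not specify the sampling strategy, so the pairing rule, the function names (`dpo_loss`, `build_preference_pair`), and the default `beta` are illustrative assumptions rather than the authors' implementation.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the trainable policy and
    a frozen reference model. beta=0.1 is an assumed default.
    """
    # How much more the policy prefers "chosen" over "rejected",
    # relative to the reference model.
    margin = ((policy_logp_chosen - ref_logp_chosen)
              - (policy_logp_rejected - ref_logp_rejected))
    z = -beta * margin
    # -log(sigmoid(beta * margin)) == softplus(z), computed stably.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def build_preference_pair(candidates, reward_fn):
    """Hypothetical pairing rule: score candidate sessions with a reward
    model and return (highest-scored, lowest-scored) as (chosen, rejected).
    """
    ranked = sorted(candidates, key=reward_fn)
    return ranked[-1], ranked[0]
```

Because only one session is actually shown per request, the reward model stands in for the missing user feedback on the alternatives: the policy generates several candidate sessions, the reward model ranks them, and the resulting pair feeds the loss above.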
representative citing papers
OneRetrieval unifies multi-branch e-commerce retrieval into a single editable generative model using keyword-aligned encoding and information-theoretic codebook grouping.
TRACER uses token reassignment for concept-related items plus a coherence regularizer to unlearn specific concepts in generative recommendation while preserving utility better than baselines.
PrefixMem encoder for Semantic IDs improves deepest-level accuracy by up to 46% relative and full-SID retrieval recall by up to 22% relative on Pinterest data across LLM families.
QGS introduces query-item pair encoding and query-conditioned prediction with a linear HSTU encoder and HFG-Attention to reduce noise from query switches in generative search ranking, reporting online gains in a commercial system.
Semantic-ID tokenizers produce collisions affecting up to 30.5% of items across four datasets, inflating Hit@10 by up to 103.36% and making prior tokenizer comparisons unreliable.
UTTSI selectively scales test-time compute for CTR prediction by triggering stochastic feature-path exploration only on high-uncertainty instances, yielding gains on four datasets and a 5.3% online CTR lift.
A single autoregressive model for conversational recommendation that uses semantic item IDs, predicts response intent and target first, then generates the response, reporting up to 29% Recall@1 gains.
VarLenRec learns variable-length semantic IDs for generative recommendation by allocating longer codes to tail items via popularity-weighted information budget allocation, hyperbolic residual quantization, and a differentiable soft length controller.
AsymRec decouples input and output representations in generative recommendation via multi-expert semantic projection and multi-faceted hierarchical quantization, outperforming prior models by 15.8% on average.
SID-MLP distills autoregressive generative recommenders into efficient position-specific MLP heads for Semantic ID tasks, achieving 8.74x faster inference with matching accuracy.
AWARE augments generative next-POI recommendation with LLM agents that produce user-anchored narratives capturing events, culture, and trends, delivering up to 12.4% relative gains on three real datasets.
Autoregressive semantic ID generation creates tree-induced probability correlations that prevent generative recommenders from capturing simple patterns; Latte adds latent tokens to relax these correlations.
Exact LTI Koopman models for nonlinear control systems require affine linear dynamics under controllability and coordinate inclusion assumptions.
GREW uses a secret-key-driven green-red item partition and three ranking-integrated modules to embed verifiable watermarks in recommender systems that resist extraction attacks without data injection.
Beam-search negatives induce partial AUC optimization in GRPO for LLM recommenders; Windowed Partial AUC and TAWin improve Top-K alignment on four datasets.
ResRank unifies retrieval and listwise reranking by compressing passages to one token each, using residual connections and cosine-similarity scoring, achieving competitive effectiveness on TREC DL and BEIR benchmarks with zero generated tokens.
Auto-regressive next-token prediction is strictly equivalent to full-vocabulary maximum likelihood estimation in generative recommendation under bijective item-to-token-sequence mapping.
DUET uses a three-stage joint profile generator with RL feedback to create consistent user-item textual profiles that outperform independent generation in recommendation tasks.
IAT compresses each historical interaction instance into a unified embedding token via temporal-order or user-order schemes, allowing standard sequence models to learn long-range preferences with better performance and transferability.
Red-Rec uses AI-initiated summaries and low-effort option selection to help users with vague intent explore more broadly and with higher serendipity than user-initiated chat while requiring less typing.
GenRecEdit injects cold-start items into generative recommendation models via context-aware token editing and interference-reducing triggers, boosting cold-start accuracy while using only 9.5% of retraining time.
RAD-DPO adds token-level gradient detachment, similarity-based dynamic reward weighting, and a multi-label global contrastive objective to DPO for better handling of hierarchical Semantic IDs and noisy feedback in e-commerce generative retrieval.
UG-Separation framework disentangles user-side and item-side flows in TokenMixer dense-interaction models to enable reusable user computations, cutting inference latency up to 20% in ByteDance production scenarios.
citing papers explorer
-
SCASRec: A Self-Correcting and Auto-Stopping Model for Generative Route List Recommendation
SCASRec unifies ranking and redundancy elimination for route lists via stepwise corrective rewards and an adaptive end-of-recommendation token, claiming SOTA results on two datasets and real deployment.
-
A Survey on Generative Recommendation: Data, Model, and Tasks
This survey organizes generative recommendation into data, model, and task dimensions, identifying five advantages including world knowledge integration and creative generation while noting challenges in benchmarks and efficiency.
-
Bi-Level Optimization for Generative Recommendation: Bridging Tokenization and Generation
BLOGER is a bi-level optimization framework that jointly optimizes the tokenizer and recommender for generative recommendation, outperforming prior methods on real-world datasets.
-
Next Interest Flow: A Generative Pre-training Paradigm for Recommender Systems by Modeling All-domain Movelines
Next Interest Flow models user intent as continuous evolutionary trajectories on a high-dimensional latent interest manifold with kinematic constraints, bidirectional alignment, and temporal causality mechanisms, yielding reported gains on industrial CTR data.
-
Sequential Data Augmentation for Generative Recommendation
GenPAS unifies common data augmentation strategies for generative recommendation as special cases of a bias-controlled stochastic sampling process and demonstrates gains in accuracy, data efficiency, and parameter efficiency on benchmarks and industrial data.
-
Mirroring Users: Towards Building Preference-aligned User Simulator with User Feedback in Recommendation
A two-phase data construction framework generates explanatory rationales from user feedback and applies uncertainty-based distillation to fine-tune lightweight LLMs as preference-aligned user simulators for recommender systems.
-
Generative Bid Shading in Real-Time Bidding Advertising
GBS replaces two-stage bid landscape modeling with an autoregressive generative model plus reward-aligned policy optimization to improve short- and long-term advertiser surplus in real-time bidding.
-
GR2 Technical Report
GR2 applies mid-training on semantic IDs, reasoning distillation, RL with conditional verifiable rewards, and a context compressor to re-ranking in industrial recsys, reporting +18.7% R@1 over baselines.
-
Adaptive Loss Balancing for Noise-Robust GRPO in Generative Recommendation
AdaGRPO gates GRPO reinforcement learning with supervised NLL using per-sample binary clips based on policy difficulty and reward discriminability, raising HR@10 from 11.01% to 12.18% while keeping hallucination below 0.22% on large-scale e-commerce data and showing A/B gains.
-
SSRLive: Live Streaming Recommendation with Dynamic Semantic ID
SSRLive combines generative and discriminative modules with dynamic semantic IDs to improve live streaming recommendations, reporting gains of +3.38% watch time, +0.72% GMV, +3.12% follower growth, and +2.92% interaction volume in online A/B tests.
-
Beyond Isolated Behaviors: Hierarchical User Modeling for LLM Personalization
PHF applies Bourdieu's Theory of Practice to create hierarchical user models for LLM personalization and reports consistent gains on the LaMP benchmark.
-
Towards Sustainable Growth: A Multi-Value-Aware Retrieval Framework for E-Commerce Search
GrowthGR combines ItemLTV counterfactual prediction with MultiGR generative retrieval and MoPO optimization to deliver 5.3% new item GMV lift and 0.3% overall GMV gain on Taobao production.
-
Discrimination Is Generation: Unifying Ranking and Retrieval from a Tokenizer Perspective
DIG unifies ranking and retrieval by training the tokenizer jointly inside a ranking model, producing improved models for both from a single run.
-
Efficient Generative Retrieval for E-commerce Search with Semantic Cluster IDs and Expert-Guided RL
CQ-SID semantic IDs and EG-GRPO RL improve generative retrieval hit rates up to 26.76% over RQ-VAE baselines and deliver +1.15% GMV in live e-commerce A/B tests.
-
UxSID: Semantic-Aware User Interests Modeling for Ultra-Long Sequence
UxSID models ultra-long user sequences with semantic-group shared interest memory using Semantic IDs and dual-level attention, achieving state-of-the-art performance and a 0.337% revenue lift in advertising A/B tests.
-
Revisiting General Map Search via Generative Point-of-Interest Retrieval
GenPOI is a generative POI retrieval system that unifies heterogeneous contexts via LLMs, uses geo-semantic tokenization, and applies proximity constraints to achieve superior performance on large-scale map search data.
-
Bian Que: An Agentic Framework with Flexible Skill Arrangement for Online System Operations
Bian Que is an agentic framework using a unified operational paradigm, flexible Skill Arrangement, and self-evolving mechanism to automate O&M tasks, achieving 75% alert reduction and over 50% MTTR cut in production deployment.
-
Harmonizing Generative Retrieval and Ranking in Chain-of-Recommendation
RecoChain unifies generative candidate generation via hierarchical semantic IDs and SIM-based ranking in a single Transformer to improve top-K recommendation performance.
-
Mitigating Collaborative Semantic ID Staleness in Generative Retrieval
A model-agnostic SID alignment update mitigates staleness from temporal drift in user-item interactions for generative retrievers, improving Recall@K and nDCG@K while reducing compute by 8-9x versus full retraining.
-
SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search
SID-Coord coordinates semantic IDs with hashed item IDs via attention fusion, adaptive gating, and interest alignment, yielding +0.664% long-play rate and +0.369% playback duration gains in production search ranking.
-
SIGMA: A Semantic-Grounded Instruction-Driven Generative Multi-Task Recommender at AliExpress
SIGMA deploys a semantic-grounded, instruction-driven generative model with hybrid tokenization and adaptive fusion for multi-task recommendation at AliExpress.
-
Denoising Neural Reranker for Recommender Systems
DNR is an adversarial denoising neural reranker that extends score error minimization with three objectives to denoise retriever scores and align them with user feedback in two-stage recommender systems.
-
Learning Decomposed Contextual Token Representations from Pretrained and Collaborative Signals for Generative Recommendation
DECOR learns decomposed contextual token representations by combining pretrained semantics with collaborative signals to fix objective misalignment in two-stage generative recommendation systems.
-
CMSL: Constructive Multi-Sequence Learning for Recommendation Systems
CMSL uses a learnable module to disentangle user history into multiple pure sequences modeled with linear attention to improve recommendation performance over single-sequence approaches.
-
Structuring and Tokenizing Distributed User Interest Context for Generative Recommendation
G2Rec unifies holistic graph-based user co-engagement modeling with semantic tokenization for scalable generative recommendation without ground-truth user interests.
-
DSIRM: Learning Query-Bridged Discrete Semantic Identifiers for E-commerce Relevance Modeling
DSIRM uses query-bridged contrastive quantization and generative LLMs to create relevance-aware discrete semantic identifiers, reporting +1.54% offline AUC and online lifts on Tmall production data.
-
Taiji: Pareto Optimal Policy Optimization with Semantics-IDs Trade-off for Industrial LLM-Enhanced Recommendation
Taiji presents an LLM-as-Enhancer system with reverse-engineered CoT data generation and Pareto Optimal Policy Optimization (POPO) to trade off semantic and ID rewards, deployed at Kuaishou serving 400M daily users.
-
MuChator: Enabling Active Music Discovery via Conversational Music LLMs in Douyin Music
MuChator introduces a three-component MusicLLM system (staged knowledge pre-training, automated triplet instruction tuning, hybrid RM with GRPO) that outperforms Gemini-3-Pro on internal datasets and yields 46.49% higher user active days after deployment on Douyin Music.
-
RecGPT-Mobile: On-Device Large Language Models for User Intent Understanding in Taobao Feed Recommendation
RecGPT-Mobile runs a compact LLM on phones to understand evolving user intent from behaviors and improve mobile e-commerce recommendations.
-
OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework
OneSearch-V2 improves generative retrieval via latent reasoning and self-distillation, achieving +3.98% item CTR, +2.07% buyer volume, and +2.11% order volume in online A/B tests.
-
Joint Model Parameter Scaling and Universal-Domain Data Integration for E-commerce Search Ranking
UniScale couples entire-space data construction with a hierarchical fusion transformer to improve scaling behavior and deliver 1.70% purchase and 2.04% GMV lifts in large-scale e-commerce search A/B tests.
-
Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback
Advocates prioritizing explicit contextual feedback in LLM-based recommender systems to improve user preference alignment and explainability.