OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
Canonical reference. 75% of citing Pith papers cite this work as background.
abstract
Recently, generative retrieval-based recommendation systems have emerged as a promising paradigm. However, most modern recommender systems adopt a retrieve-and-rank strategy, where the generative model functions only as a selector during the retrieval stage. In this paper, we propose OneRec, which replaces the cascaded learning framework with a unified generative model. To the best of our knowledge, this is the first end-to-end generative model that significantly surpasses current complex and well-designed recommender systems in real-world scenarios. Specifically, OneRec includes: 1) an encoder-decoder structure, which encodes the user's historical behavior sequences and gradually decodes the videos that the user may be interested in. We adopt sparse Mixture-of-Experts (MoE) to scale model capacity without proportionally increasing computational FLOPs. 2) a session-wise generation approach. In contrast to traditional next-item prediction, we propose a session-wise generation, which is more elegant and contextually coherent than point-by-point generation that relies on hand-crafted rules to properly combine the generated results. 3) an Iterative Preference Alignment module combined with Direct Preference Optimization (DPO) to enhance the quality of the generated results. Unlike DPO in NLP, a recommendation system typically has only one opportunity to display results for each user's browsing request, making it impossible to obtain positive and negative samples simultaneously. To address this limitation, we design a reward model to simulate user generation and customize the sampling strategy. Extensive experiments have demonstrated that a limited number of DPO samples can align user interest preferences and significantly improve the quality of generated results. We deployed OneRec in the main scene of Kuaishou, achieving a 1.6% increase in watch-time, which is a substantial improvement.
citing papers explorer
-
KuaiLive: A Real-time Interactive Dataset for Live Streaming Recommendation
KuaiLive is the first publicly released real-time interactive dataset for live streaming recommendation, with logs from 23,772 users and 452,621 streamers over 21 days plus timestamps, multi-type interactions, and side features.
-
OneRetrieval: Unifying Multi-Branch E-commerce Retrieval with an Editable Generative Model
OneRetrieval unifies multi-branch e-commerce retrieval into a single editable generative model using keyword-aligned encoding and information-theoretic codebook grouping.
-
TRACER: Token ReAssignment for Concept ERasure in Generative Recommendation
TRACER uses token reassignment for concept-related items plus a coherence regularizer to unlearn specific concepts in generative recommendation while preserving utility better than baselines.
-
LLMs Need Encoders for Semantic IDs Too
PrefixMem encoder for Semantic IDs improves deepest-level accuracy by up to 46% relative and full-SID retrieval recall by up to 22% relative on Pinterest data across LLM families.
-
From Item-Only to Query-Item: Query-Conditioned Generative Search with QGS in Quark
QGS introduces query-item pair encoding and query-conditioned prediction with a linear HSTU encoder and HFG-Attention to reduce noise from query switches in generative search ranking, reporting online gains in a commercial system.
-
How Reliable Are Semantic-ID Tokenizer Comparisons in Generative Recommendation?
Semantic-ID tokenizers produce collisions affecting up to 30.5% of items across four datasets, inflating Hit@10 by up to 103.36% and making prior tokenizer comparisons unreliable.
-
Selective Test-Time Compute Scaling for Click-Through Rate Prediction via Uncertainty-Triggered Feature Path Exploration
UTTSI selectively scales test-time compute for CTR prediction by triggering stochastic feature-path exploration only on high-uncertainty instances, yielding gains on four datasets and a 5.3% online CTR lift.
-
Generative Conversational Recommender System
A single autoregressive model for conversational recommendation that uses semantic item IDs, predicts response intent and target first, then generates the response, reporting up to 29% Recall@1 gains.
-
Learning Variable-Length Tokenization for Generative Recommendation
VarLenRec learns variable-length semantic IDs for generative recommendation by allocating longer codes to tail items via popularity-weighted information budget allocation, hyperbolic residual quantization, and a differentiable soft length controller.
-
Asymmetric Generative Recommendation via Multi-Expert Projection and Multi-Faceted Hierarchical Quantization
AsymRec decouples input and output representations in generative recommendation via multi-expert semantic projection and multi-faceted hierarchical quantization, outperforming prior models by 15.8% on average.
-
MLPs are Efficient Distilled Generative Recommenders
SID-MLP distills autoregressive generative recommenders into efficient position-specific MLP heads for Semantic ID tasks, achieving 8.74x faster inference with matching accuracy.
-
Why Users Go There: World Knowledge-Augmented Generative Next POI Recommendation
AWARE augments generative next-POI recommendation with LLM agents that produce user-anchored narratives capturing events, culture, and trends, delivering up to 12.4% relative gains on three real datasets.
-
Expressiveness Limits of Autoregressive Semantic ID Generation in Generative Recommendation
Autoregressive semantic ID generation creates tree-induced probability correlations that prevent generative recommenders from capturing simple patterns; Latte adds latent tokens to relax these correlations.
-
Limitations of LTI Koopman Modeling for Nonlinear Control Systems
Exact LTI Koopman models for nonlinear control systems require affine linear dynamics under controllability and coordinate inclusion assumptions.
-
Green-Red Watermarking for Recommender Systems
GREW uses a secret-key-driven green-red item partition and three ranking-integrated modules to embed verifiable watermarks in recommender systems that resist extraction attacks without data injection.
-
Objective Shaping with Hard Negatives: Windowed Partial AUC Optimization for RL-based LLM Recommenders
Beam-search negatives induce partial AUC optimization in GRPO for LLM recommenders; Windowed Partial AUC and TAWin improve Top-K alignment on four datasets.
-
ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression
ResRank unifies retrieval and listwise reranking by compressing passages to one token each, using residual connections and cosine-similarity scoring, achieving competitive effectiveness on TREC DL and BEIR benchmarks with zero generated tokens.
-
On the Equivalence Between Auto-Regressive Next Token Prediction and Full-Item-Vocabulary Maximum Likelihood Estimation in Generative Recommendation--A Short Note
Auto-regressive next-token prediction is strictly equivalent to full-vocabulary maximum likelihood estimation in generative recommendation under bijective item-to-token-sequence mapping.
-
DUET: Joint Exploration of User Item Profiles in Recommendation System
DUET uses a three-stage joint profile generator with RL feedback to create consistent user-item textual profiles that outperform independent generation in recommendation tasks.
-
IAT: Instance-As-Token Compression for Historical User Sequence Modeling in Industrial Recommender Systems
IAT compresses each historical interaction instance into a unified embedding token via temporal-order or user-order schemes, allowing standard sequence models to learn long-range preferences with better performance and transferability.
-
From Passive Feeds to Guided Discovery: AI-Initiated Interaction for Vague Intent in Content Exploration
Red-Rec uses AI-initiated summaries and low-effort option selection to help users with vague intent explore more broadly and with higher serendipity than user-initiated chat while requiring less typing.
-
GenRecEdit: Adapting Model Editing for Generative Recommendation with Cold-Start Items
GenRecEdit injects cold-start items into generative recommendation models via context-aware token editing and interference-reducing triggers, boosting cold-start accuracy while using only 9.5% of retraining time.
-
RAD-DPO: Robust Adaptive Denoising Direct Preference Optimization for Generative Retrieval in E-commerce
RAD-DPO adds token-level gradient detachment, similarity-based dynamic reward weighting, and a multi-label global contrastive objective to DPO for better handling of hierarchical Semantic IDs and noisy feedback in e-commerce generative retrieval.
-
Compute Only Once: UG-Separation for Efficient Large Recommendation Models
UG-Separation framework disentangles user-side and item-side flows in TokenMixer dense-interaction models to enable reusable user computations, cutting inference latency up to 20% in ByteDance production scenarios.
-
S$^2$GR: Stepwise Semantic-Guided Reasoning in Latent Space for Generative Recommendation
S²GR adds stepwise thinking tokens with contrastive supervision on codebook clusters to balance computational focus and ground reasoning paths in generative recommendation.
-
Do Recommendation Algorithms Work When Users Are LLM Agents? A Case Study on Moltbook
On the Moltbook platform populated by LLM agents, popularity-based and item-side collaborative filtering methods outperform user-representation techniques for predicting next forum engagement.
-
Recommendation as Generation: Unifying Personalized Video Generation and Recommendation at Industrial Scale
RaG unifies generative recommendation and video generation via semantic IDs and Video Generation Agents with cross-domain rewards, reporting 1.87% ad revenue lift in a 400M-user industrial deployment.
-
Navigating User Behavior toward Personalized Multimodal Generation
NaviGen encodes user behavior via dual collaborative-textual identifiers and applies SFT+RL to produce personalized multimodal outputs and better instructions from interaction history.
-
Do Generative Recommenders Deepen the Information Cocoon? A Closed-Loop Simulation with LLM-powered User Simulators
Closed-loop LLM simulations find generative recommenders form fewer exposure-level information cocoons than traditional sequential baselines on Amazon data, though tokenization strategy and model scale affect concentration in generated SID space.
-
SafeGEO: Understanding Generative Engine Optimization Risks in Recommendation Agents
SafeGEO benchmark demonstrates that GEO attacks raise flawed product inclusion in recommendation sets by up to 83.2%, with partial mitigation from defensive prompting and evidence checks.
-
Gated Bidirectional Linear Attention for Generative Retrieval
GBLA extends kernelized linear attention with local causal mixing, key gating, and gated RMSNorm; a 1:2 hybrid with self-attention matches full bidirectional self-attention quality on Yandex Music data while delivering up to 8.2x speedup at length 32768.
-
DREAM: Dynamic Refinement of Early Assignment Mappings
DREAM proposes intent-aware tokenization, frozen-model evaluation, and dynamic beams to refine early SID assignments and improve cold-start performance in generative recommenders on Amazon benchmarks.
-
Dual-Stream MLP is All You Need for CTR Prediction
DS-MLP achieves state-of-the-art CTR prediction on three benchmarks using a final vanilla MLP structure trained via knowledge distillation and two alignment strategies.
-
Trustworthy Recommendation in the Era of Large Language Models: Opportunities and Challenges
A systematic review of over 200 studies concludes that LLMs in recommender systems act as a double-edged sword, creating both opportunities and new risks for trustworthiness.
-
UniPinRec: Unifying Generative Retrieval and Ranking at Pinterest Scale
UniPinRec unifies retrieval and ranking into a single model and pipeline deployed at Pinterest, reporting +1% engagement lift, 11.1% lower latency, and 63.6% higher QPS.
-
FLUID: From Ephemeral IDs to Multimodal Semantic Codes for Industrial-Scale Livestreaming Recommendation
FLUID introduces LUCID semantic codes from a multimodal encoder to retire item IDs in livestreaming rankers, with staged warmup yielding online gains of +0.55% watch duration and +2.05% cold-start views.
-
SAPO: Step-Aligned Policy Optimization for Reasoning-Based Generative Recommendation
SAPO computes per-reasoning-step group-relative advantages in RL to improve credit assignment for structured generation of semantic identifiers in recommendation systems.
-
Conditional Memory Enhanced Item Representation for Generative Recommendation
ComeIR introduces dual-level Engram memory and memory-restoring prediction to reconstruct SID-token embeddings and restore token granularity in generative recommendation.
-
CapsID: Soft-Routed Variable-Length Semantic IDs for Generative Recommendation
CapsID uses probabilistic capsule routing and confidence-based termination to generate variable-length semantic IDs, improving recall by 9.6% over strong baselines with half the latency of dual-representation systems.
-
Position-Aware Drafting for Inference Acceleration in LLM-Based Generative List-Wise Recommendation
PAD-Rec augments standard draft models with item-position and step-position embeddings plus learnable gates, delivering up to 3.1x wall-clock speedup and 5% average gain over strong speculative-decoding baselines on four datasets while largely preserving recommendation quality.
-
From Local Indices to Global Identifiers: Generative Reranking for Recommender Systems via Global Action Space
GloRank reformulates list-wise reranking as token generation over a global item identifier space, using supervised pre-training followed by reinforcement learning to maximize list-wise utility and outperforming baselines on benchmarks and industrial data.
-
Modeling Behavioral Intensity and Transitions for Generative Recommendation
BITRec improves generative multi-behavior recommendation by modeling behavioral intensity via separated pathways and transitions via learnable relation matrices, reporting 15-23% gains on large retail datasets.
-
Birds of a Feather Cluster Nearby: a Proximity-Aware Geo-Codebook for Local Service Recommendation
Pro-GEO introduces a geo-centroid coordinate system and geo-rotary position encoding to model geographic proximity as rotational transformations, enabling balanced semantic-spatial modeling in local service recommendations.
-
MTServe: Efficient Serving for Generative Recommendation Models with Hierarchical Caches
MTServe achieves up to 3.1x speedup for generative recommendation model serving by using hierarchical caches with host RAM and system optimizations while keeping cache hit ratios above 98.5%.
-
IceBreaker for Conversational Agents: Breaking the First-Message Barrier with Personalized Starters
IceBreaker applies resonance-aware interest distillation and interaction-oriented starter generation with preference alignment to create cold-start conversation openers, yielding +0.184% active days and +9.425% CTR gains in production A/B tests.
-
From Relevance to Authority: Authority-aware Generative Retrieval in Web Search Engines
AuthGR is the first generative retriever to explicitly incorporate document authority alongside relevance using multimodal scoring and progressive training, yielding efficiency gains and real-world engagement improvements.
-
UniRec: Bridging the Expressive Gap between Generative and Discriminative Recommendation via Chain-of-Attribute
UniRec bridges the expressive gap in generative recommendation by prefixing semantic ID sequences with structured attribute tokens, recovering explicit feature crossing and yielding +22.6% HR@50 gains plus online lifts in PVCTR, orders, and GMV.
-
CRAB: Codebook Rebalancing for Bias Mitigation in Generative Recommendation
CRAB mitigates popularity bias in generative recommenders by rebalancing the semantic token codebook through splitting popular tokens and applying a tree-structured regularizer to boost representations for unpopular items.
-
MBGR: Multi-Business Prediction for Generative Recommendation at Meituan
MBGR is a new generative recommendation framework using business-aware semantic IDs, multi-business prediction, and label dynamic routing to handle multiple businesses without seesaw effects or representation confusion, validated by experiments and deployed at Meituan.
-
Towards Efficient and Generalizable Retrieval: Adaptive Semantic Quantization and Residual Knowledge Transfer
SA²CRQ uses sequential adaptive residual quantization based on path entropy plus anchored curriculum regularization from head items to improve both efficiency and cold-start performance in generative retrieval.