OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment

Jiaxin Deng, Shiyao Wang, Kuo Cai, Lejian Ren, Qigen Hu, Weifeng Ding · 2025 · cs.IR · arXiv 2502.18965

Canonical reference. 75% of citing Pith papers cite this work as background.

82 Pith papers citing it

abstract

Recently, generative retrieval-based recommendation systems have emerged as a promising paradigm. However, most modern recommender systems adopt a retrieve-and-rank strategy, where the generative model functions only as a selector during the retrieval stage. In this paper, we propose OneRec, which replaces the cascaded learning framework with a unified generative model. To the best of our knowledge, this is the first end-to-end generative model that significantly surpasses current complex and well-designed recommender systems in real-world scenarios. Specifically, OneRec includes: 1) an encoder-decoder structure, which encodes the user's historical behavior sequences and gradually decodes the videos that the user may be interested in. We adopt sparse Mixture-of-Experts (MoE) to scale model capacity without proportionally increasing computational FLOPs. 2) a session-wise generation approach. In contrast to traditional next-item prediction, we propose a session-wise generation, which is more elegant and contextually coherent than point-by-point generation that relies on hand-crafted rules to properly combine the generated results. 3) an Iterative Preference Alignment module combined with Direct Preference Optimization (DPO) to enhance the quality of the generated results. Unlike DPO in NLP, a recommendation system typically has only one opportunity to display results for each user's browsing request, making it impossible to obtain positive and negative samples simultaneously. To address this limitation, we design a reward model to simulate user generation and customize the sampling strategy. Extensive experiments have demonstrated that a limited number of DPO samples can align user interest preferences and significantly improve the quality of generated results. We deployed OneRec in the main scene of Kuaishou, achieving a 1.6% increase in watch-time, which is a substantial improvement.
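
The preference-alignment idea in point 3 can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that several candidate sessions are sampled for one request, that a reward model scores each of them, and that the highest- and lowest-scored sessions serve as the chosen and rejected samples for the standard DPO loss. The names `pick_preference_pair`, `dpo_loss`, the toy reward function, and the value `beta=0.1` are all hypothetical and chosen for illustration only.

```python
import math


def pick_preference_pair(candidates, reward_fn):
    """Return (chosen, rejected): the best- and worst-scored candidate sessions.

    Assumption: reward_fn stands in for a learned reward model that maps a
    generated session (a list of item ids) to a scalar score.
    """
    ranked = sorted(candidates, key=reward_fn)
    return ranked[-1], ranked[0]


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Inputs are sequence log-probabilities under the policy being trained and
    under a frozen reference model. The loss is -log(sigmoid(beta * margin)).
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    z = beta * margin
    # -log(sigmoid(z)) == log(1 + exp(-z)), written in a numerically stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    # Toy stand-in for a reward model: sum of hypothetical per-item scores.
    item_score = {"v1": 0.2, "v2": 0.9, "v3": 0.5}
    sessions = [["v1", "v3"], ["v2", "v3"], ["v1", "v1"]]
    chosen, rejected = pick_preference_pair(
        sessions, lambda s: sum(item_score[i] for i in s)
    )
    print(chosen, rejected)
    # With identical policy and reference log-probabilities the margin is zero,
    # so the loss equals log(2).
    print(dpo_loss(-2.0, -2.0, -2.0, -2.0))
```

Under these assumptions, the loss falls as the policy assigns relatively more probability to the chosen session than the reference model does, which is the direction of preference alignment the abstract describes.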

citation-role summary

background: 7 · dataset: 1

citation-polarity summary

background: 6 · support: 1 · use dataset: 1

representative citing papers

KuaiLive: A Real-time Interactive Dataset for Live Streaming Recommendation

cs.IR · 2025-08-07 · accept · novelty 8.0

KuaiLive is the first publicly released real-time interactive dataset for live streaming recommendation, with logs from 23,772 users and 452,621 streamers over 21 days plus timestamps, multi-type interactions, and side features.

OneRetrieval: Unifying Multi-Branch E-commerce Retrieval with an Editable Generative Model

cs.IR · 2026-06-11 · unverdicted · novelty 7.0

OneRetrieval unifies multi-branch e-commerce retrieval into a single editable generative model using keyword-aligned encoding and information-theoretic codebook grouping.

TRACER: Token ReAssignment for Concept ERasure in Generative Recommendation

cs.IR · 2026-06-05 · unverdicted · novelty 7.0

TRACER uses token reassignment for concept-related items plus a coherence regularizer to unlearn specific concepts in generative recommendation while preserving utility better than baselines.

LLMs Need Encoders for Semantic IDs Too

cs.IR · 2026-05-29 · unverdicted · novelty 7.0

PrefixMem encoder for Semantic IDs improves deepest-level accuracy by up to 46% relative and full-SID retrieval recall by up to 22% relative on Pinterest data across LLM families.

From Item-Only to Query-Item: Query-Conditioned Generative Search with QGS in Quark

cs.IR · 2026-05-25 · unverdicted · novelty 7.0

QGS introduces query-item pair encoding and query-conditioned prediction with a linear HSTU encoder and HFG-Attention to reduce noise from query switches in generative search ranking, reporting online gains in a commercial system.

How Reliable Are Semantic-ID Tokenizer Comparisons in Generative Recommendation?

cs.IR · 2026-05-25 · conditional · novelty 7.0

Semantic-ID tokenizers produce collisions affecting up to 30.5% of items across four datasets, inflating Hit@10 by up to 103.36% and making prior tokenizer comparisons unreliable.

Selective Test-Time Compute Scaling for Click-Through Rate Prediction via Uncertainty-Triggered Feature Path Exploration

cs.LG · 2026-05-24 · unverdicted · novelty 7.0

UTTSI selectively scales test-time compute for CTR prediction by triggering stochastic feature-path exploration only on high-uncertainty instances, yielding gains on four datasets and a 5.3% online CTR lift.

Generative Conversational Recommender System

cs.IR · 2026-05-21 · unverdicted · novelty 7.0

A single autoregressive model for conversational recommendation that uses semantic item IDs, predicts response intent and target first, then generates the response, reporting up to 29% Recall@1 gains.

Learning Variable-Length Tokenization for Generative Recommendation

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

VarLenRec learns variable-length semantic IDs for generative recommendation by allocating longer codes to tail items via popularity-weighted information budget allocation, hyperbolic residual quantization, and a differentiable soft length controller.

Asymmetric Generative Recommendation via Multi-Expert Projection and Multi-Faceted Hierarchical Quantization

cs.IR · 2026-05-14 · unverdicted · novelty 7.0

AsymRec decouples input and output representations in generative recommendation via multi-expert semantic projection and multi-faceted hierarchical quantization, outperforming prior models by 15.8% on average.

MLPs are Efficient Distilled Generative Recommenders

cs.IR · 2026-05-12 · unverdicted · novelty 7.0

SID-MLP distills autoregressive generative recommenders into efficient position-specific MLP heads for Semantic ID tasks, achieving 8.74x faster inference with matching accuracy.

Why Users Go There: World Knowledge-Augmented Generative Next POI Recommendation

cs.AI · 2026-05-12 · unverdicted · novelty 7.0

AWARE augments generative next-POI recommendation with LLM agents that produce user-anchored narratives capturing events, culture, and trends, delivering up to 12.4% relative gains on three real datasets.

Expressiveness Limits of Autoregressive Semantic ID Generation in Generative Recommendation

cs.IR · 2026-05-07 · unverdicted · novelty 7.0

Autoregressive semantic ID generation creates tree-induced probability correlations that prevent generative recommenders from capturing simple patterns; Latte adds latent tokens to relax these correlations.

Limitations of LTI Koopman Modeling for Nonlinear Control Systems

math.OC · 2026-04-28 · unverdicted · novelty 7.0

Exact LTI Koopman models for nonlinear control systems require affine linear dynamics under controllability and coordinate inclusion assumptions.

Green-Red Watermarking for Recommender Systems

cs.IR · 2026-04-26 · unverdicted · novelty 7.0

GREW uses a secret-key-driven green-red item partition and three ranking-integrated modules to embed verifiable watermarks in recommender systems that resist extraction attacks without data injection.

Objective Shaping with Hard Negatives: Windowed Partial AUC Optimization for RL-based LLM Recommenders

cs.IR · 2026-04-24 · unverdicted · novelty 7.0

Beam-search negatives induce partial AUC optimization in GRPO for LLM recommenders; Windowed Partial AUC and TAWin improve Top-K alignment on four datasets.

ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression

cs.IR · 2026-04-24 · conditional · novelty 7.0

ResRank unifies retrieval and listwise reranking by compressing passages to one token each, using residual connections and cosine-similarity scoring, achieving competitive effectiveness on TREC DL and BEIR benchmarks with zero generated tokens.

On the Equivalence Between Auto-Regressive Next Token Prediction and Full-Item-Vocabulary Maximum Likelihood Estimation in Generative Recommendation--A Short Note

cs.IR · 2026-04-17 · accept · novelty 7.0

Auto-regressive next-token prediction is strictly equivalent to full-vocabulary maximum likelihood estimation in generative recommendation under bijective item-to-token-sequence mapping.

DUET: Joint Exploration of User Item Profiles in Recommendation System

cs.IR · 2026-04-15 · unverdicted · novelty 7.0

DUET uses a three-stage joint profile generator with RL feedback to create consistent user-item textual profiles that outperform independent generation in recommendation tasks.

IAT: Instance-As-Token Compression for Historical User Sequence Modeling in Industrial Recommender Systems

cs.IR · 2026-04-10 · unverdicted · novelty 7.0

IAT compresses each historical interaction instance into a unified embedding token via temporal-order or user-order schemes, allowing standard sequence models to learn long-range preferences with better performance and transferability.

From Passive Feeds to Guided Discovery: AI-Initiated Interaction for Vague Intent in Content Exploration

cs.HC · 2026-03-30 · conditional · novelty 7.0

Red-Rec uses AI-initiated summaries and low-effort option selection to help users with vague intent explore more broadly and with higher serendipity than user-initiated chat while requiring less typing.

GenRecEdit: Adapting Model Editing for Generative Recommendation with Cold-Start Items

cs.IR · 2026-03-15 · conditional · novelty 7.0

GenRecEdit injects cold-start items into generative recommendation models via context-aware token editing and interference-reducing triggers, boosting cold-start accuracy while using only 9.5% of retraining time.

RAD-DPO: Robust Adaptive Denoising Direct Preference Optimization for Generative Retrieval in E-commerce

cs.IR · 2026-02-27 · unverdicted · novelty 7.0

RAD-DPO adds token-level gradient detachment, similarity-based dynamic reward weighting, and a multi-label global contrastive objective to DPO for better handling of hierarchical Semantic IDs and noisy feedback in e-commerce generative retrieval.

Compute Only Once: UG-Separation for Efficient Large Recommendation Models

cs.IR · 2026-02-11 · unverdicted · novelty 7.0

UG-Separation framework disentangles user-side and item-side flows in TokenMixer dense-interaction models to enable reusable user computations, cutting inference latency up to 20% in ByteDance production scenarios.

citing papers explorer

Showing 32 of 82 citing papers.

SCASRec: A Self-Correcting and Auto-Stopping Model for Generative Route List Recommendation cs.IR · 2026-02-03 · unverdicted · none · ref 15 · internal anchor
SCASRec unifies ranking and redundancy elimination for route lists via stepwise corrective rewards and an adaptive end-of-recommendation token, claiming SOTA results on two datasets and real deployment.
A Survey on Generative Recommendation: Data, Model, and Tasks cs.IR · 2025-10-31 · accept · none · ref 24 · internal anchor
This survey organizes generative recommendation into data, model, and task dimensions, identifying five advantages including world knowledge integration and creative generation while noting challenges in benchmarks and efficiency.
Bi-Level Optimization for Generative Recommendation: Bridging Tokenization and Generation cs.IR · 2025-10-24 · unverdicted · none · ref 8 · internal anchor
BLOGER is a bi-level optimization framework that jointly optimizes the tokenizer and recommender for generative recommendation, outperforming prior methods on real-world datasets.
Next Interest Flow: A Generative Pre-training Paradigm for Recommender Systems by Modeling All-domain Movelines cs.IR · 2025-10-13 · unverdicted · none · ref 5 · internal anchor
Next Interest Flow models user intent as continuous evolutionary trajectories on a high-dimensional latent interest manifold with kinematic constraints, bidirectional alignment, and temporal causality mechanisms, yielding reported gains on industrial CTR data.
Sequential Data Augmentation for Generative Recommendation cs.LG · 2025-09-17 · conditional · none · ref 13 · internal anchor
GenPAS unifies common data augmentation strategies for generative recommendation as special cases of a bias-controlled stochastic sampling process and demonstrates gains in accuracy, data efficiency, and parameter efficiency on benchmarks and industrial data.
Mirroring Users: Towards Building Preference-aligned User Simulator with User Feedback in Recommendation cs.HC · 2025-08-25 · unverdicted · none · ref 8 · internal anchor
A two-phase data construction framework generates explanatory rationales from user feedback and applies uncertainty-based distillation to fine-tune lightweight LLMs as preference-aligned user simulators for recommender systems.
Generative Bid Shading in Real-Time Bidding Advertising cs.GT · 2025-08-06 · unverdicted · none · ref 8 · internal anchor
GBS replaces two-stage bid landscape modeling with an autoregressive generative model plus reward-aligned policy optimization to improve short- and long-term advertiser surplus in real-time bidding.
GR2 Technical Report cs.IR · 2026-06-30 · unverdicted · none · ref 2 · internal anchor
GR2 applies mid-training on semantic IDs, reasoning distillation, RL with conditional verifiable rewards, and a context compressor to re-ranking in industrial recsys, reporting +18.7% R@1 over baselines.
Adaptive Loss Balancing for Noise-Robust GRPO in Generative Recommendation cs.LG · 2026-06-07 · unverdicted · none · ref 5 · internal anchor
AdaGRPO gates GRPO reinforcement learning with supervised NLL using per-sample binary clips based on policy difficulty and reward discriminability, raising HR@10 from 11.01% to 12.18% while keeping hallucination below 0.22% on large-scale e-commerce data and showing A/B gains.
SSRLive: Live Streaming Recommendation with Dynamic Semantic ID cs.IR · 2026-06-05 · unverdicted · none · ref 8 · internal anchor
SSRLive combines generative and discriminative modules with dynamic semantic IDs to improve live streaming recommendations, reporting gains of +3.38% watch time, +0.72% GMV, +3.12% follower growth, and +2.92% interaction volume in online A/B tests.
Beyond Isolated Behaviors: Hierarchical User Modeling for LLM Personalization cs.CL · 2026-06-01 · unverdicted · none · ref 108 · internal anchor
PHF applies Bourdieu's Theory of Practice to create hierarchical user models for LLM personalization and reports consistent gains on the LaMP benchmark.
Towards Sustainable Growth: A Multi-Value-Aware Retrieval Framework for E-Commerce Search cs.IR · 2026-05-18 · unverdicted · none · ref 4 · internal anchor
GrowthGR combines ItemLTV counterfactual prediction with MultiGR generative retrieval and MoPO optimization to deliver 5.3% new item GMV lift and 0.3% overall GMV gain on Taobao production.
Discrimination Is Generation: Unifying Ranking and Retrieval from a Tokenizer Perspective cs.IR · 2026-05-14 · unverdicted · none · ref 1 · internal anchor
DIG unifies ranking and retrieval by training the tokenizer jointly inside a ranking model, producing improved models for both from a single run.
Efficient Generative Retrieval for E-commerce Search with Semantic Cluster IDs and Expert-Guided RL cs.IR · 2026-05-14 · unverdicted · none · ref 5 · internal anchor
CQ-SID semantic IDs and EG-GRPO RL improve generative retrieval hit rates up to 26.76% over RQ-VAE baselines and deliver +1.15% GMV in live e-commerce A/B tests.
UxSID: Semantic-Aware User Interests Modeling for Ultra-Long Sequence cs.AI · 2026-05-09 · unverdicted · none · ref 39 · 3 links · internal anchor
UxSID models ultra-long user sequences with semantic-group shared interest memory using Semantic IDs and dual-level attention, achieving state-of-the-art performance and a 0.337% revenue lift in advertising A/B tests.
Revisiting General Map Search via Generative Point-of-Interest Retrieval cs.IR · 2026-05-05 · unverdicted · none · ref 15 · internal anchor
GenPOI is a generative POI retrieval system that unifies heterogeneous contexts via LLMs, uses geo-semantic tokenization, and applies proximity constraints to achieve superior performance on large-scale map search data.
Bian Que: An Agentic Framework with Flexible Skill Arrangement for Online System Operations cs.AI · 2026-04-29 · unverdicted · none · ref 12 · 2 links · internal anchor
Bian Que is an agentic framework using a unified operational paradigm, flexible Skill Arrangement, and self-evolving mechanism to automate O&M tasks, achieving 75% alert reduction and over 50% MTTR cut in production deployment.
Harmonizing Generative Retrieval and Ranking in Chain-of-Recommendation cs.IR · 2026-04-28 · unverdicted · none · ref 2 · internal anchor
RecoChain unifies generative candidate generation via hierarchical semantic IDs and SIM-based ranking in a single Transformer to improve top-K recommendation performance.
Mitigating Collaborative Semantic ID Staleness in Generative Retrieval cs.IR · 2026-04-14 · unverdicted · none · ref 5 · internal anchor
A model-agnostic SID alignment update mitigates staleness from temporal drift in user-item interactions for generative retrievers, improving Recall@K and nDCG@K while reducing compute by 8-9x versus full retraining.
SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search cs.IR · 2026-04-12 · unverdicted · none · ref 3 · internal anchor
SID-Coord coordinates semantic IDs with hashed item IDs via attention fusion, adaptive gating, and interest alignment, yielding +0.664% long-play rate and +0.369% playback duration gains in production search ranking.
SIGMA: A Semantic-Grounded Instruction-Driven Generative Multi-Task Recommender at AliExpress cs.IR · 2026-02-26 · unverdicted · none · ref 2 · internal anchor
SIGMA deploys a semantic-grounded, instruction-driven generative model with hybrid tokenization and adaptive fusion for multi-task recommendation at AliExpress.
Denoising Neural Reranker for Recommender Systems cs.IR · 2025-09-23 · unverdicted · none · ref 4 · internal anchor
DNR is an adversarial denoising neural reranker that extends score error minimization with three objectives to denoise retriever scores and align them with user feedback in two-stage recommender systems.
Learning Decomposed Contextual Token Representations from Pretrained and Collaborative Signals for Generative Recommendation cs.IR · 2025-08-22 · unverdicted · none · ref 7 · internal anchor
DECOR learns decomposed contextual token representations by combining pretrained semantics with collaborative signals to fix objective misalignment in two-stage generative recommendation systems.
CMSL: Constructive Multi-Sequence Learning for Recommendation Systems cs.IR · 2026-06-26 · unverdicted · none · ref 109 · internal anchor
CMSL uses a learnable module to disentangle user history into multiple pure sequences modeled with linear attention to improve recommendation performance over single-sequence approaches.
Structuring and Tokenizing Distributed User Interest Context for Generative Recommendation cs.IR · 2026-06-18 · unverdicted · none · ref 4 · internal anchor
G2Rec unifies holistic graph-based user co-engagement modeling with semantic tokenization for scalable generative recommendation without ground-truth user interests.
DSIRM: Learning Query-Bridged Discrete Semantic Identifiers for E-commerce Relevance Modeling cs.IR · 2026-06-03 · unverdicted · none · ref 6 · internal anchor
DSIRM uses query-bridged contrastive quantization and generative LLMs to create relevance-aware discrete semantic identifiers, reporting +1.54% offline AUC and online lifts on Tmall production data.
Taiji: Pareto Optimal Policy Optimization with Semantics-IDs Trade-off for Industrial LLM-Enhanced Recommendation cs.IR · 2026-06-02 · unverdicted · none · ref 4 · internal anchor
Taiji presents an LLM-as-Enhancer system with reverse-engineered CoT data generation and Pareto Optimal Policy Optimization (POPO) to trade off semantic and ID rewards, deployed at Kuaishou serving 400M daily users.
MuChator: Enabling Active Music Discovery via Conversational Music LLMs in Douyin Music cs.IR · 2026-05-26 · unverdicted · none · ref 8 · internal anchor
MuChator introduces a three-component MusicLLM system (staged knowledge pre-training, automated triplet instruction tuning, hybrid RM with GRPO) that outperforms Gemini-3-Pro on internal datasets and yields 46.49% higher user active days after deployment on Douyin Music.
RecGPT-Mobile: On-Device Large Language Models for User Intent Understanding in Taobao Feed Recommendation cs.IR · 2026-05-06 · unverdicted · none · ref 4 · internal anchor
RecGPT-Mobile runs a compact LLM on phones to understand evolving user intent from behaviors and improve mobile e-commerce recommendations.
OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework cs.IR · 2026-03-25 · unverdicted · none · ref 6 · internal anchor
OneSearch-V2 improves generative retrieval via latent reasoning and self-distillation, achieving +3.98% item CTR, +2.07% buyer volume, and +2.11% order volume in online A/B tests.
Joint Model Parameter Scaling and Universal-Domain Data Integration for E-commerce Search Ranking cs.IR · 2026-03-25 · unverdicted · none · ref 7 · internal anchor
UniScale couples entire-space data construction with a hierarchical fusion transformer to improve scaling behavior and deliver 1.70% purchase and 2.04% GMV lifts in large-scale e-commerce search A/B tests.
Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback cs.IR · 2026-05-27 · unverdicted · none · ref 6 · internal anchor
Advocates prioritizing explicit contextual feedback in LLM-based recommender systems to improve user preference alignment and explainability.
