hub
Unifying generative and dense retrieval for sequential recommendation
10 Pith papers cite this work.
roles
background: 3
citing papers explorer
- MLPs are Efficient Distilled Generative Recommenders
  SID-MLP distills autoregressive generative recommenders into efficient position-specific MLP heads for Semantic ID tasks, achieving 8.74x faster inference with matching accuracy.
- Expressiveness Limits of Autoregressive Semantic ID Generation in Generative Recommendation
  Autoregressive semantic ID generation creates tree-induced probability correlations that prevent generative recommenders from capturing simple patterns; Latte adds latent tokens to relax these correlations.
- GenRecEdit: Adapting Model Editing for Generative Recommendation with Cold-Start Items
  GenRecEdit injects cold-start items into generative recommendation models via context-aware token editing and interference-reducing triggers, boosting cold-start accuracy while using only 9.5% of retraining time.
- Conditional Memory Enhanced Item Representation for Generative Recommendation
  ComeIR introduces dual-level Engram memory and memory-restoring prediction to reconstruct SID-token embeddings and restore token granularity in generative recommendation.
- Bridging Textual Profiles and Latent User Embeddings for Personalization
  BLUE aligns LLM-generated textual user profiles with embedding-based recommendation objectives via reinforcement learning and next-item text supervision, yielding better zero-shot performance and cross-domain transfer than baselines.
- CapsID: Soft-Routed Variable-Length Semantic IDs for Generative Recommendation
  CapsID uses probabilistic capsule routing and confidence-based termination to generate variable-length semantic IDs, improving recall by 9.6% over strong baselines with half the latency of dual-representation systems.
- MTServe: Efficient Serving for Generative Recommendation Models with Hierarchical Caches
  MTServe achieves up to 3.1x speedup for generative recommendation model serving by using hierarchical caches with host RAM and system optimizations while keeping cache hit ratios above 98.5%.
- LWGR: Lagrangian-Constrained Personalized World Knowledge for Generative Recommendation
  LWGR applies personalized soft instructions for LLM knowledge extraction and Lagrangian primal-dual optimization to selectively fuse beneficial world knowledge into generative recommendation while bounding degradation.
- Sequential Data Augmentation for Generative Recommendation
  GenPAS unifies common data augmentation strategies for generative recommendation as special cases of a bias-controlled stochastic sampling process and demonstrates gains in accuracy, data efficiency, and parameter efficiency on benchmarks and industrial data.
- Mitigating Collaborative Semantic ID Staleness in Generative Retrieval
  A model-agnostic SID alignment update mitigates staleness from temporal drift in user-item interactions for generative retrievers, improving Recall@K and nDCG@K while reducing compute by 8-9x versus full retraining.