Actions speak louder than words: Trillion-parameter sequential transducers for generative recommendations
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
TriAlignGR proposes a triangular multitask alignment framework with cross-modal semantic alignment, deep interest mining via chain-of-thought, and joint training on eight tasks to address content degradation and semantic opacity in Semantic ID-based generative recommendation.
citing papers explorer
- SAPO: Step-Aligned Policy Optimization for Reasoning-Based Generative Recommendation
  SAPO computes per-reasoning-step group-relative advantages in RL to improve credit assignment for structured generation of semantic identifiers in recommendation systems.
- TriAlignGR: Triangular Multitask Alignment with Multimodal Deep Interest Mining for Generative Recommendation
  TriAlignGR proposes a triangular multitask alignment framework with cross-modal semantic alignment, deep interest mining via chain-of-thought, and joint training on eight tasks to address content degradation and semantic opacity in Semantic ID-based generative recommendation.