Personalized top-n sequential recommendation via convolutional sequence embedding
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Citing papers
- SAPO: Step-Aligned Policy Optimization for Reasoning-Based Generative Recommendation
  SAPO computes per-reasoning-step group-relative advantages in RL to improve credit assignment for structured generation of semantic identifiers in recommendation systems.
- An Embarrassingly Simple Graph Heuristic Reveals Shortcut-Solvable Benchmarks for Sequential Recommendation
  A simple graph heuristic without training or sequence encoders matches or outperforms trained generative recommenders on 10 of 14 sequential recommendation benchmarks by exploiting local transition and feature shortcuts.