TurboGR trains generative recommendation models of up to 0.2B parameters on Ascend NPUs at 54.71% model FLOPs utilization (MFU) with near-linear scalability (0.97), using jagged acceleration, hierarchical parallelism, and negative sampling optimizations.
D 6-Batch Pipelined Overlapping Execution: In this appendix we present the algorithm of the fine-grained pipeline orchestration
1 Pith paper cites this work.
Fields: cs.DC (1); Years: 2026 (1); Verdicts: UNVERDICTED (1)
Representative citing papers
TurboGR: An Accelerated Training System for Large-Scale Generative Recommendation