BOOST delivers 1.46-2.27x end-to-end speedups for low-rank bottleneck LLMs by redesigning tensor parallelism around the bottleneck structure plus supporting optimizations.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.LG 2verdicts
UNVERDICTED 2representative citing papers
Muon+ adds one normalization step after polar orthogonalization in the Muon optimizer, yielding lower training and validation perplexity and faster pre-training across 60M-7B models.
citing papers explorer
-
BOOST: BOttleneck-Optimized Scalable Training Framework for Low-Rank Large Language Models
BOOST delivers 1.46-2.27x end-to-end speedups for low-rank bottleneck LLMs by redesigning tensor parallelism around the bottleneck structure plus supporting optimizations.
-
MUON+: Towards More Effective Muon via One Additional Normalization Step for LLM Pre-training
Muon+ adds one normalization step after polar orthogonalization in the Muon optimizer, yielding lower training and validation perplexity and faster pre-training across 60M-7B models.