Initialization and regularization of factorized neural layers
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 2
verdicts
UNVERDICTED 2
representative citing papers
DLR augments low-rank factorization with a fixed structured residual during training that is absorbed post-training, improving C4 perplexity for LLaMA models from 60M to 7B while preserving exact low-rank inference cost.
citing papers explorer
- BOOST: BOttleneck-Optimized Scalable Training Framework for Low-Rank Large Language Models
BOOST delivers 1.46-2.27x end-to-end speedups for low-rank bottleneck LLMs by redesigning tensor parallelism around the bottleneck structure plus supporting optimizations.