Collective communication: Theory, practice, and experience. Concurrency and Computation: Practice and Experience, 19(13):1749–1783, 2007.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.DC (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)

Representative citing papers
Mixed-Precision Communication-Avoiding SGD for Generalized Linear Models on GPUs
Mixed-precision CA-SGD for GLMs on A100 GPUs matches FP32 loss within 0.5% while delivering 5.1-6.8x speedup via a nine-choice finite-precision error recipe.