Authors: Reddi, Devendra Singh Sachan, Satyen Kale, and Sanjiv Kumar
PowerStep: Memory-Efficient Adaptive Optimization via $\ell_p$-Norm Steepest Descent
PowerStep delivers coordinate-wise adaptive optimization by nonlinearly transforming a momentum buffer under an $\ell_p$-norm steepest-descent geometry, matching Adam's convergence with half the memory and supporting aggressive quantization.
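The abstract's core idea can be illustrated with a minimal sketch. Steepest descent under the $\ell_p$ norm yields the coordinate-wise nonlinearity $d_i \propto \mathrm{sign}(m_i)\,|m_i|^{1/(p-1)}$ applied to a single momentum buffer $m$ (one buffer, versus Adam's two, which is where the halved memory would come from). Everything below, including the function name, hyperparameters, and normalization choice, is an illustrative assumption, not the paper's actual algorithm.

```python
import numpy as np

def powerstep_update(params, grad, m, lr=1e-3, beta=0.9, p=3.0, eps=1e-8):
    """Hedged sketch of one l_p-norm steepest-descent step (assumed form).

    Maintains a single momentum buffer m, then applies the coordinate-wise
    power nonlinearity characteristic of steepest descent in l_p geometry.
    """
    m = beta * m + (1.0 - beta) * grad        # single momentum buffer
    q = 1.0 / (p - 1.0)                       # exponent from l_p duality
    d = np.sign(m) * np.abs(m) ** q           # coordinate-wise nonlinearity
    d = d / (np.linalg.norm(d, ord=p) + eps)  # unit-l_p-norm descent direction
    params = params - lr * d
    return params, m
```

Note that `p = 2` recovers ordinary normalized momentum, while `p -> inf` drives the exponent `q` toward zero and the update toward a sign-based step, so the single parameter `p` interpolates between familiar geometries.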