Scion is a new stochastic LMO-based optimizer family that unifies existing methods, supports unconstrained problems, and delivers hyperparameter transferability plus speedups on nanoGPT training.
Advances in neural information processing systems , volume=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
Optimizer choice induces distinct connected regions in the loss landscape of two-layer ReLU networks, with AdamW and Muon sometimes separated by provable barriers.
citing papers explorer
-
Training Deep Learning Models with Norm-Constrained LMOs
Scion is a new stochastic LMO-based optimizer family that unifies existing methods, supports unconstrained problems, and delivers hyperparameter transferability plus speedups on nanoGPT training.
-
Optimizer-Induced Mode Connectivity: From AdamW to Muon
Optimizer choice induces distinct connected regions in the loss landscape of two-layer ReLU networks, with AdamW and Muon sometimes separated by provable barriers.