Non-uniform learning rates correspond to a Stackelberg reformulation of the training objective whose two-time-scale alternating gradient descent yields finite-time convergence and can accelerate training through stronger optimization structure and sharper early curvature.
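The two-time-scale alternating gradient descent named in the summary above can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions, not the paper's method. The quadratic loss, the split into a slow "leader" variable `x` and a fast "follower" variable `y`, and the step sizes `lr_slow` and `lr_fast` are all hypothetical choices made for this example.

```python
# Toy sketch of two-time-scale alternating gradient descent.
# Assumption: both variables share one loss; the quadratic below is a
# hypothetical example, not the objective used in the paper.

def loss(x, y):
    # Shared loss with its unique minimum at x = 1, y = 2.
    return 0.5 * (y - 2.0 * x) ** 2 + 0.5 * (x - 1.0) ** 2


def grad_x(x, y):
    # Partial derivative of the loss with respect to the slow variable x.
    return -2.0 * (y - 2.0 * x) + (x - 1.0)


def grad_y(x, y):
    # Partial derivative of the loss with respect to the fast variable y.
    return y - 2.0 * x


def two_time_scale_gd(steps=1000, lr_slow=0.05, lr_fast=0.5, x0=0.0, y0=0.0):
    """Alternate a fast follower step on y with a slow leader step on x."""
    x, y = x0, y0
    history = [loss(x, y)]
    for _ in range(steps):
        # Follower moves first, with the larger learning rate.
        y = y - lr_fast * grad_y(x, y)
        # Leader then moves with the smaller rate, seeing the updated y.
        x = x - lr_slow * grad_x(x, y)
        history.append(loss(x, y))
    return x, y, history


if __name__ == "__main__":
    x, y, history = two_time_scale_gd()
    print(f"x = {x:.6f}, y = {y:.6f}, final loss = {history[-1]:.3e}")
```

Passing the same value for `lr_slow` and `lr_fast` gives a uniform-rate baseline on the same toy loss, which makes it easy to compare the two loss histories.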
1 Pith paper cites this work.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Rethinking Neural Network Learning Rates: A Stackelberg Perspective