Homogenization of SGD in high-dimensions: Exact dynamics and generalization properties
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Overparameterization adds symmetries that precondition the Hessian for better minima and increase the probability mass of global minima near typical initializations.
citing papers explorer
- Homogenization of $\ell_2$-Adversarial Training in High-Dimensions: Exact Dynamics under Stochastic Gradient Descent
  Derives ODE deterministic equivalents and an adversarial homogenized SDE for SGD iterates in high-dimensional ℓ2-adversarial training, showing that no constant learning rate ensures monotone descent for single-class adversarial least squares, and establishing an equivalence to adaptively regularized standard SGD.
- The Role of Symmetry in Optimizing Overparameterized Networks
  Overparameterization adds symmetries that precondition the Hessian for better minima and increase the probability mass of global minima near typical initializations.