The nuclear route: Sharp asymptotics of ERM in overparameterized quadratic networks. Advances in Neural Information Processing Systems, 38:88862–88901
1 Pith paper cites this work.
Fields: cond-mat.dis-nn (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Spectral Dynamics in Deep Networks: Feature Learning, Outlier Escape, and Learning Rate Transfer
A two-level DMFT tracks bulk and outlier spectral dynamics in wide networks, predicting width-consistent outlier growth and hyperparameter transfer under muP scaling for deep linear nets while noting bulk restructuring for large-output tasks.