1 Pith paper cites this work (polarity classification is still indexing).
Fields: cond-mat.dis-nn · Years: 2026 · Verdicts: UNVERDICTED
Spectral Dynamics in Deep Networks: Feature Learning, Outlier Escape, and Learning Rate Transfer
A two-level DMFT predicts width-consistent outlier escape and hyperparameter transfer under μP in deep networks, with bulk restructuring dominating for tasks with many outputs.
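The learning-rate transfer claim refers to the μP (maximal update parameterization) property that a base learning rate tuned at a small width remains near-optimal at larger widths once per-layer rates are rescaled. A minimal sketch of that rescaling, under the common Adam-style presentation (input-layer rate held constant, hidden and output rates shrinking like 1/width); the function name and layer grouping are illustrative, not taken from the paper:

```python
# Sketch of muP-style per-layer learning rates (Adam variant, assumed
# scaling: hidden/output LR ~ 1/width relative to the tuned base width).
def mup_lrs(base_lr: float, base_width: int, width: int) -> dict:
    """Rescale a learning rate tuned at base_width to a model of width."""
    m = width / base_width          # width multiplier
    return {
        "input":  base_lr,          # input weights: rate kept constant
        "hidden": base_lr / m,      # hidden weights: rate ~ 1/width
        "output": base_lr / m,      # output weights: rate ~ 1/width
    }

# An LR tuned at width 256 transfers to width 1024 by rescaling:
print(mup_lrs(1e-3, 256, 1024))
```

Under this scheme the tuned base rate itself is width-independent, which is what makes sweeping hyperparameters on a narrow proxy model and reusing them at full width viable.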