Dennis and Weisberg, Sanford , title =
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
A Theory of Generalization in Deep Learning
A theory shows SGD accumulates coherent signal via linear drift in NTK signal directions while trapping noise in orthogonal low-eigenvalue dimensions, enabling generalization even under O(1) kernel evolution and yielding an exact population-risk objective from one run that acts as an Adam SNR boost.