A convergence theory for deep learning via over-parameterization

Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song · 2019

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Conservation Law Breaking at the Edge of Stability: A Spectral Theory of Non-Convex Neural Network Optimization

cs.LG · 2026-04-08 · unverdicted · novelty 7.0

Discrete gradient descent breaks L-1 conservation laws in ReLU networks with drift eta^alpha, decomposed exactly as eta^2 times a spectral sum S(eta) whose mode coefficients are proportional to initial error squared times Hessian eigenvalues.
