Training at the edge of stability causes neural network optimizers to converge on fractal attractors whose effective dimension, measured via a new sharpness dimension from the Hessian spectrum, bounds generalization error in a way not captured by prior trace or norm measures.
arXiv preprint arXiv:2507.06775
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
The survey unifies extensions of PAC-Bayesian theory to data-dependent sets, geometric and topological complexity measures of optimization trajectories, and stability replacements for information terms into one template inequality with comparative evaluation.
citing papers explorer
- Generalization at the Edge of Stability
  Training at the edge of stability causes neural network optimizers to converge on fractal attractors whose effective dimension, measured via a new sharpness dimension from the Hessian spectrum, bounds generalization error in a way not captured by prior trace or norm measures.
- A Survey on Data-Dependent Worst-Case Generalization Bounds
  The survey unifies extensions of PAC-Bayesian theory to data-dependent sets, geometric and topological complexity measures of optimization trajectories, and stability replacements for information terms into one template inequality with comparative evaluation.