On large-batch training for deep learning: Generalization gap and sharp minima
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
fields: cs.LG 2
years: 2026 2
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
representative citing papers: SGD at the Edge of Stability: The Stochastic Sharpness Gap
citing papers explorer
- Understanding Generalization through Decision Pattern Shift
DPS quantifies deviation of per-sample decision patterns from class averages and shows linear correlation with generalization gaps while unifying degradation scenarios into a continuous trajectory.
- SGD at the Edge of Stability: The Stochastic Sharpness Gap
SGD stabilizes sharpness below 2/η with equilibrium gap ΔS = η β σ_u²/(4α) due to noise-enhanced stochastic self-stabilization.