Exact solution for on-line learning in multilayer neural networks. Physical Review Letters, 74(21):4337
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
fields
cs.LG 2
years
2026 2
verdicts
UNVERDICTED 2
roles
background 1
polarities
background 1
representative citing papers
Gradient flow in energy-based models for strictly positive binary distributions produces stable data-consistent fixed points and a learning hierarchy that favors lower-order interactions first, mechanistically explaining distributional simplicity bias.
citing papers explorer
- A Boundary-Layer Mechanism for One-Third Scaling in Online Softmax Classification
  Derives α^{-1/3} scaling for generalization error in online softmax classification from boundary layers in a teacher-student model.
- Distributional simplicity bias and effective convexity in Energy Based Models
  Gradient flow in energy-based models for strictly positive binary distributions produces stable data-consistent fixed points and a learning hierarchy that favors lower-order interactions first, mechanistically explaining distributional simplicity bias.