Slingshot loss spikes are produced by low-precision arithmetic that breaks the zero-sum gradient constraint and drives exponential growth via Numerical Feature Inflation.
On large-batch training for deep learning: Generalization gap and sharp minima
4 Pith papers cite this work. Polarity classification is still indexing.
4
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.LG 4roles
background 1polarities
background 1representative citing papers
Negative-capable ridge regression uses controlled negative regularization as anti-shrinkage to increase effective complexity along weak eigendirections and mitigate underfitting in small-data regression.
C3PO is a foundation model for bilevel pricing optimization that trains on simulated discrete choice data and retrieves elasticity priors from literature to improve revenue KPIs under business constraints.