Projection heads act as geometric buffers; nonlinear heads induce negative Hessian curvature to escape dimensional collapse while linear heads rely on discrete dynamics and BatchNorm.
Proceedings of the 37th International Conference on Machine Learning (ICML)
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
The Geometry of Projection Heads: Conditioning, Invariance, and Collapse