Neural loss landscapes contain flat channels to infinity along which gradient flow leads pairs of neurons to implement gated linear units.
Note that the ω2 · x contribution must be O(ϵ2); otherwise the α2(ω2 · x) term would diverge as 1/ϵ
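The claim that a pair of neurons can come to implement a gated linear unit can be checked numerically with a small sketch. The setup below is an assumption made for illustration, not the paper's parametrization: it uses a tanh activation, output weights +1/eps and -1/eps, and input weights w ± (eps/2) v, with the hypothetical names `pair_output` and `gated_linear_limit`. As eps shrinks, the output weights grow without bound, the input weights merge, and the pair's summed output should approach (v · x) tanh'(w · x).

```python
import numpy as np

def pair_output(x, w, v, eps):
    """Sum of two tanh neurons with nearly equal input weights and
    output weights +1/eps and -1/eps (illustrative parametrization)."""
    a1, a2 = 1.0 / eps, -1.0 / eps
    w1 = w + 0.5 * eps * v
    w2 = w - 0.5 * eps * v
    return a1 * np.tanh(x @ w1) + a2 * np.tanh(x @ w2)

def gated_linear_limit(x, w, v):
    """Gated linear unit: the linear term (v . x) gated by tanh'(w . x)."""
    return (x @ v) * (1.0 - np.tanh(x @ w) ** 2)

# Arbitrary inputs and weights, chosen only for the demonstration.
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 3))
w = np.array([1.0, -0.5, 0.3])
v = np.array([0.2, 0.1, -0.4])

# Largest gap between the neuron pair and the gated linear unit per eps.
errors = {}
for eps in (1e-1, 1e-2, 1e-3):
    gap = np.abs(pair_output(x, w, v, eps) - gated_linear_limit(x, w, v))
    errors[eps] = float(gap.max())
    print(f"eps={eps:g}  max gap={errors[eps]:.2e}")
```

The symmetric offset around w is a design choice that makes the pair a central finite difference of the activation, so the gap should shrink roughly quadratically in eps while the individual output weights diverge.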
Flat Channels to Infinity in Neural Loss Landscapes