Trainability and accuracy of artificial neural networks: An interacting particle system approach
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: stat.ML (1)
Years: 2022 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
For orthogonal inputs, gradient flow on shallow ReLU networks with MSE loss and small initialization converges to zero loss and exhibits a minimum-variation-norm implicit bias, initial alignment, and saddle-to-saddle dynamics.