The loss landscape of overparameterized neural networks

2018 · cs.LG · arXiv 1804.10200

1 Pith paper cites this work. Polarity classification is still being indexed.

abstract

We explore some mathematical features of the loss landscape of overparameterized neural networks. A priori one might imagine that the loss function looks like a typical function from $\mathbb{R}^n$ to $\mathbb{R}$: in particular, nonconvex, with discrete global minima. In this paper, we prove that in at least one important way, the loss function of an overparameterized neural network does not look like a typical function. If a neural net has $n$ parameters and is trained on $d$ data points, with $n>d$, we show that the locus $M$ of global minima of the loss function $L$ is usually not discrete, but rather an $(n-d)$-dimensional submanifold of $\mathbb{R}^n$. In practice, neural nets commonly have orders of magnitude more parameters than data points, so this observation implies that $M$ is typically a very high-dimensional subset of $\mathbb{R}^n$.
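The dimension count in the abstract can be illustrated with a toy case. The sketch below is not the paper's argument: it assumes a purely linear model $f(w) = Xw$ with squared-error loss, where the zero-loss set $\{w : Xw = y\}$ is an affine subspace rather than a curved manifold. The names `minima_dimension`, `zero_loss_points`, and `loss`, and the sizes `n = 5`, `d = 2`, are hypothetical choices made for illustration only.

```python
import numpy as np


def loss(X: np.ndarray, y: np.ndarray, w: np.ndarray) -> float:
    """Squared-error loss of the linear model w -> X @ w on the data (X, y)."""
    return float(np.sum((X @ w - y) ** 2))


def minima_dimension(X: np.ndarray) -> int:
    """Dimension of {w : X @ w = y} when it is non-empty: n - rank(X)."""
    n = X.shape[1]
    return n - int(np.linalg.matrix_rank(X))


def zero_loss_points(X: np.ndarray, y: np.ndarray, rng: np.random.Generator):
    """Return two distinct parameter vectors that both fit the data exactly.

    Assumes X has full row rank d < n (true almost surely for Gaussian X).
    """
    d, n = X.shape
    # One exact solution: the minimum-norm least-squares solution.
    w0 = np.linalg.lstsq(X, y, rcond=None)[0]
    # Rows d..n-1 of Vt span the null space of X when rank(X) == d.
    _, _, Vt = np.linalg.svd(X)
    null_basis = Vt[d:]                       # shape (n - d, n)
    # Moving along any null-space direction leaves X @ w unchanged.
    w1 = w0 + null_basis.T @ rng.standard_normal(n - d)
    return w0, w1


rng = np.random.default_rng(0)
d, n = 2, 5                                   # d data points, n > d parameters
X = rng.standard_normal((d, n))
y = rng.standard_normal(d)

w0, w1 = zero_loss_points(X, y, rng)
print("dimension of the set of global minima:", minima_dimension(X))
print("loss at w0:", loss(X, y, w0), "loss at w1:", loss(X, y, w1))
```

In this linear setting each data point imposes one independent linear constraint on the $n$ parameters, which leaves $n-d$ free directions. The abstract's claim is that the same count usually holds for overparameterized neural networks, where the constraints are nonlinear and $M$ is a submanifold rather than a subspace.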

representative citing papers

Flat Channels to Infinity in Neural Loss Landscapes

cs.LG · 2025-06-17 · unverdicted · novelty 7.0

Neural loss landscapes contain flat channels to infinity along which gradient flow leads pairs of neurons to implement gated linear units.

citing papers explorer


Flat Channels to Infinity in Neural Loss Landscapes

cs.LG · 2025-06-17 · unverdicted · none · ref 29 · internal anchor

Neural loss landscapes contain flat channels to infinity along which gradient flow leads pairs of neurons to implement gated linear units.
