Gradient Descent Converges to Minimizers

Benjamin Recht; Jason D. Lee; Max Simchowitz; Michael I. Jordan

arxiv: 1602.04915 · v2 · pith:2RGH7RJDnew · submitted 2016-02-16 · stat.ML · cs.LG · math.OC

keywords converges · descent · gradient · almost · applying · dynamical · initialization · local

We show that gradient descent converges to a local minimizer, almost surely with random initialization. This is proved by applying the Stable Manifold Theorem from dynamical systems theory.
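
The claim can be checked numerically on a small example. The sketch below is an illustration only, not code from the paper: it assumes the test function f(x, y) = x^2/2 + y^4/4 - y^2/2, which has a strict saddle at (0, 0) and local minimizers at (0, 1) and (0, -1), and it assumes a fixed step size of 0.1 with uniform random initialization on [-1, 1]^2. The names `grad`, `gradient_descent`, and `run_trials` are hypothetical helpers introduced here.

```python
import random


def grad(x, y):
    # Gradient of the assumed test function f(x, y) = x**2/2 + y**4/4 - y**2/2.
    return x, y**3 - y


def gradient_descent(x, y, step=0.1, iters=2000):
    # Plain gradient descent with a fixed step size (an assumed value).
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


def run_trials(n_trials=100, seed=0):
    # Draw starting points uniformly from [-1, 1]^2 and record the final iterates.
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        x0 = rng.uniform(-1.0, 1.0)
        y0 = rng.uniform(-1.0, 1.0)
        results.append(gradient_descent(x0, y0))
    return results


if __name__ == "__main__":
    finals = run_trials()
    near_minimizer = sum(
        1 for x, y in finals if abs(x) < 1e-6 and abs(abs(y) - 1.0) < 1e-6
    )
    print(f"{near_minimizer} of {len(finals)} runs ended near (0, 1) or (0, -1)")
    # A start on the line y = 0 stays on it and is drawn into the saddle.
    print(gradient_descent(0.5, 0.0))
```

In this example the only starting points attracted to the saddle are those on the line y = 0, a set of measure zero, which is why a random initialization avoids it with probability one. The last call starts on that line deliberately to show the exceptional case.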

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Accelerated Gradient Methods for Nonconvex Optimization: Escape Trajectories From Strict Saddle Points and Convergence to Local Minima
math.OC 2023-07 unverdicted novelty 7.0

Theoretical analysis of accelerated gradient methods showing almost-sure escape from strict saddles and linear exit times, plus a subclass achieving near-optimal convergence to local minima in convex neighborhoods of ...

Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
cs.LG 2019-07 conditional novelty 7.0

Permutation symmetries generate permutation saddles and equal-loss valleys linking equivalent global minima, yielding a lower bound on symmetry-induced critical points.

Combining Stochastic Adaptive Cubic Regularization with Negative Curvature for Nonconvex Optimization
math.OC 2019-06 unverdicted novelty 7.0

Introduces the SANC algorithm combining negative curvature with stochastic adaptive cubic regularization for nonconvex optimization and claims it is the first such combination with consistent batch sizes for large-scale ML.