pith. machine review for the scientific record.

arxiv: 1503.01243 · v2 · submitted 2015-03-04 · stat.ML · math.CA · math.OC

Recognition: unknown

A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights

classification stat.ML math.CA math.OC
keywords nesterov scheme accelerated differential equation gradient method algorithm

We derive a second-order ordinary differential equation (ODE) which is the limit of Nesterov's accelerated gradient method. This ODE exhibits approximate equivalence to Nesterov's scheme and thus can serve as a tool for analysis. We show that the continuous time ODE allows for a better understanding of Nesterov's scheme. As a byproduct, we obtain a family of schemes with similar convergence rates. The ODE interpretation also suggests restarting Nesterov's scheme leading to an algorithm, which can be rigorously proven to converge at a linear rate whenever the objective is strongly convex.
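To make the restarting idea in the abstract concrete, here is a minimal sketch in Python. It assumes a common form of Nesterov's scheme with momentum coefficient (k-1)/(k+2) and a fixed step size. The restart rule (reset the momentum counter when the distance between successive iterates shrinks), the function name `nesterov`, and the quadratic test objective are illustrative assumptions, not taken from this page.

```python
import numpy as np

def nesterov(grad, x0, step, iters, restart=False):
    """Nesterov-style accelerated gradient with an optional restart.

    Assumed update: x_k = y_{k-1} - step * grad(y_{k-1}),
                    y_k = x_k + (j-1)/(j+2) * (x_k - x_{k-1}),
    where j counts iterations since the last restart.
    """
    x_prev = np.asarray(x0, dtype=float)
    y = x_prev.copy()
    j = 1
    last_speed = 0.0
    for _ in range(iters):
        x = y - step * grad(y)                # gradient step from the extrapolated point
        speed = np.linalg.norm(x - x_prev)    # discrete proxy for the speed of the iterates
        if restart and speed < last_speed:
            j = 1                             # assumed rule: drop momentum when speed decreases
            last_speed = 0.0
        else:
            last_speed = speed
        y = x + (j - 1) / (j + 2) * (x - x_prev)
        x_prev = x
        j += 1
    return x_prev

# Hypothetical strongly convex test problem: f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 100.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x0 = np.array([1.0, 1.0])
f_plain = f(nesterov(grad, x0, step=0.01, iters=500))
f_restart = f(nesterov(grad, x0, step=0.01, iters=500, restart=True))
```

Comparing `f_plain` and `f_restart` on a strongly convex objective like this one is a quick way to see the effect of restarting. The step size 0.01 is chosen as the reciprocal of the largest eigenvalue of `A`.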

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Momentum Further Constrains Sharpness at the Edge of Stochastic Stability

    cs.LG · 2026-04 · unverdicted · novelty 7.0

    Momentum SGD exhibits two distinct EoSS regimes for batch sharpness, stabilizing at 2(1-β)/η for small batches and 2(1+β)/η for large batches, aligning with linear stability thresholds.