A Lyapunov Analysis of Momentum Methods in Optimization

Ashia C. Wilson; Benjamin Recht; Michael I. Jordan

arxiv: 1611.02635 · v4 · pith:MWLEWNCCnew · submitted 2016-11-08 · 🧮 math.OC · cs.DS


keywords methods · momentum · algorithms · estimate · sequences · technique · analysis · connection


Momentum methods play a significant role in optimization. Examples include Nesterov's accelerated gradient method and the conditional gradient algorithm. Several momentum methods are provably optimal under standard oracle models, and all use a technique called estimate sequences to analyze their convergence properties. The technique of estimate sequences has long been considered difficult to understand, leading many researchers to generate alternative, "more intuitive" methods and analyses. We show there is an equivalence between the technique of estimate sequences and a family of Lyapunov functions in both continuous and discrete time. This connection allows us to develop a simple and unified analysis of many existing momentum algorithms, introduce several new algorithms, and strengthen the connection between algorithms and continuous-time dynamical systems.
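To make the Lyapunov viewpoint mentioned in the abstract concrete, here is a minimal sketch, not taken from the paper, of one standard three-sequence form of Nesterov's accelerated gradient method together with a commonly used energy function E_k = A_k (f(x_k) - f*) + (1/2) ||z_k - x*||^2, where A_k = k(k+1)/(4L). The test problem (a diagonal quadratic with minimizer x* = 0 and f* = 0), the weights A_k, and the function name `nesterov_lyapunov_trace` are assumptions chosen for illustration. For an L-smooth convex objective this energy is non-increasing along the iterates, which gives f(x_k) - f* <= E_0 / A_k, that is, an O(1/k^2) rate.

```python
def nesterov_lyapunov_trace(q, x0, steps):
    """Run accelerated gradient descent on f(x) = 0.5 * sum(q_i * x_i^2).

    Hypothetical helper for illustration only. Assumes q_i >= 0, so that f is
    convex with minimizer x* = 0, optimal value f* = 0, and smoothness
    constant L = max(q). Returns the energy values E_k and the gaps f(x_k) - f*.
    """
    L = max(q)

    def f(x):
        return 0.5 * sum(qi * xi * xi for qi, xi in zip(q, x))

    def grad(x):
        return [qi * xi for qi, xi in zip(q, x)]

    x = list(x0)  # primary iterate x_k
    z = list(x0)  # auxiliary ("momentum") iterate z_k
    energies, gaps = [], []

    for k in range(steps + 1):
        # Energy E_k = A_k * (f(x_k) - f*) + 0.5 * ||z_k - x*||^2, with x* = 0.
        A_k = k * (k + 1) / (4.0 * L)
        energies.append(A_k * f(x) + 0.5 * sum(zi * zi for zi in z))
        gaps.append(f(x))

        # Coupling step: y_k is a convex combination of x_k and z_k.
        tau = 2.0 / (k + 2)
        y = [(1.0 - tau) * xi + tau * zi for xi, zi in zip(x, z)]
        g = grad(y)

        # Gradient step for x, larger (growing) step for z.
        x = [yi - gi / L for yi, gi in zip(y, g)]
        z = [zi - (k + 1) / (2.0 * L) * gi for zi, gi in zip(z, g)]

    return energies, gaps


if __name__ == "__main__":
    energies, gaps = nesterov_lyapunov_trace([1.0, 0.1, 0.01], [1.0, 1.0, 1.0], 50)
    print("first energy:", energies[0], "last energy:", energies[-1])
    print("last optimality gap:", gaps[-1])
```

The monotone decrease of `energies` is what replaces the estimate-sequence argument in this style of proof: a single inequality E_{k+1} <= E_k, checked once per step, yields the convergence rate.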


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Finite-Time Optimization via Scaled Gradient-Momentum Flows
math.OC 2026-04 unverdicted novelty 6.0

A scaled gradient-momentum framework achieves global finite-time convergence by linking gradient-dominance properties of the objective to finite-time stability via state-dependent scaling.

Adaptive Federated Optimization
cs.LG 2020-02 unverdicted novelty 6.0

Proposes federated adaptive optimizers (FedAdagrad, FedAdam, FedYogi) with convergence analysis for non-convex objectives under data heterogeneity and reports empirical gains over FedAvg.