A Generic Approach for Escaping Saddle points

Alexander J Smola; Barnabas Poczos; Francis Bach; Manzil Zaheer; Ruslan Salakhutdinov; Sashank J Reddi; Suvrit Sra

arxiv: 1709.01434 · v1 · pith:5UPBLBAInew · submitted 2017-09-05 · 💻 cs.LG · cs.AI

Sashank J Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alexander J Smola

keywords: points, methods, saddle, second-order, first-order, challenge, computations, framework

A central challenge in using first-order methods to optimize nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, which greatly degrades their performance. Typically, escaping from saddles requires second-order methods. However, most work on second-order methods relies extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive with the state of the art. Empirical results suggest that our strategy also performs well in practice.
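The alternation described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes plain gradient descent as the first-order subroutine, a single negative-curvature step as the second-order subroutine, and a hypothetical test function f(x, y) = x²/2 + y⁴/4 − y²/2, which has a saddle at (0, 0) and minima at (0, ±1). The function names, step size, and tolerances are all illustrative choices.

```python
import math

def grad(p):
    """Gradient of the toy function f(x, y) = x**2/2 + y**4/4 - y**2/2."""
    x, y = p
    return (x, y**3 - y)

def hess(p):
    """Entries (a, b, c) of the symmetric Hessian [[a, b], [b, c]]."""
    _, y = p
    return (1.0, 0.0, 3.0 * y**2 - 1.0)

def min_eigpair(a, b, c):
    """Smallest eigenvalue and a unit eigenvector of [[a, b], [b, c]]."""
    lam = 0.5 * (a + c) - math.sqrt((0.5 * (a - c))**2 + b**2)
    if b == 0.0:
        v = (1.0, 0.0) if a <= c else (0.0, 1.0)
    else:
        n = math.hypot(b, lam - a)
        v = (b / n, (lam - a) / n)
    return lam, v

def escape_saddles(p, step=0.1, eps_g=1e-6, eps_h=1e-3,
                   max_iter=10000, second_order=True):
    """Alternate gradient steps with occasional negative-curvature steps.

    The Hessian is consulted only when the gradient is small, i.e. only
    near a critical point. Returns the final point and the number of
    Hessian evaluations. All tolerances are assumptions of this sketch.
    """
    hessian_calls = 0
    for _ in range(max_iter):
        g = grad(p)
        if math.hypot(g[0], g[1]) > eps_g:
            # First-order subroutine: one plain gradient descent step.
            p = (p[0] - step * g[0], p[1] - step * g[1])
            continue
        if not second_order:
            break  # first-order method alone stops at any critical point
        # Second-order subroutine: inspect curvature at the critical point.
        hessian_calls += 1
        lam, v = min_eigpair(*hess(p))
        if lam >= -eps_h:
            break  # approximate second-order critical point reached
        # Orient the eigenvector so it is not an ascent direction, then step.
        if g[0] * v[0] + g[1] * v[1] > 0:
            v = (-v[0], -v[1])
        p = (p[0] + step * v[0], p[1] + step * v[1])
    return p, hessian_calls

if __name__ == "__main__":
    print(escape_saddles((1.0, 0.0)))                      # ends near (0, 1)
    print(escape_saddles((1.0, 0.0), second_order=False))  # ends near (0, 0)
```

Started from (1, 0), gradient descent alone slides along the x-axis into the saddle at the origin, whereas the alternating variant leaves the saddle along the negative-curvature direction and needs only two Hessian evaluations in total: one to escape and one to certify the final point.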

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Accelerated Gradient Methods for Nonconvex Optimization: Escape Trajectories From Strict Saddle Points and Convergence to Local Minima
math.OC 2023-07 unverdicted novelty 7.0

Theoretical analysis of accelerated gradient methods showing almost-sure escape from strict saddles and linear exit times, plus a subclass achieving near-optimal convergence to local minima in convex neighborhoods of ...

Dimension-Free Saddle-Point Escape in Muon
cs.LG 2026-05 unverdicted novelty 6.0

Muon achieves dimension-free saddle-point escape through non-linear spectral shaping, resolvent calculus, and structural incoherence, yielding an algebraically dimension-free escape bound.

Adaptive Federated Optimization
cs.LG 2020-02 unverdicted novelty 6.0

Proposes federated adaptive optimizers (FedAdagrad, FedAdam, FedYogi) with convergence analysis for non-convex objectives under data heterogeneity and reports empirical gains over FedAvg.