pith. machine review for the scientific record.

arxiv: 2412.14291 · v2 · submitted 2024-12-18 · 🧮 math.OC · cs.LG · stat.ML

Recognition: unknown

Projected gradient methods for nonconvex and stochastic smooth optimization: new complexities and auto-conditioned stepsizes

classification 🧮 math.OC cs.LG stat.ML
keywords gradient, stochastic, methods, projected, auto-conditioned, complexity, constant, lipschitz

We present a novel class of projected gradient (PG) methods for minimizing a smooth but not necessarily convex function over a convex compact set. We first provide a novel analysis of the constant-stepsize PG method, achieving the best-known iteration complexity for finding an approximate stationary point of the problem. We then develop an "auto-conditioned" projected gradient (AC-PG) variant that achieves the same iteration complexity without requiring the input of the Lipschitz constant of the gradient or any line search procedure. The key idea is to estimate the Lipschitz constant using first-order information gathered from the previous iterations, and to show that the error caused by underestimating the Lipschitz constant can be properly controlled. We then generalize the PG methods to the stochastic setting, by proposing a stochastic projected gradient (SPG) method and a variance-reduced stochastic gradient (VR-SPG) method, achieving new complexity bounds in different oracle settings. We also present auto-conditioned stepsize policies for both stochastic PG methods and establish comparable convergence guarantees.
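The idea of estimating the Lipschitz constant from earlier iterates can be illustrated with a minimal sketch. This is not the paper's AC-PG stepsize rule, which the abstract does not spell out. It swaps in a generic secant-type estimate, L_k = max(L_{k-1}, ||g_k - g_{k-1}|| / ||x_k - x_{k-1}||), and takes projected gradient steps of size 1/L_k. The box constraint, the test function, the initial guess `L0`, the tolerance `tol`, and all function names are assumptions made for illustration only.

```python
import math

def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n (a convex compact set)."""
    return [min(max(xi, lo), hi) for xi in x]

def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))

def pg_with_lipschitz_estimate(grad, x0, lo, hi, L0=1.0, iters=200, tol=1e-12):
    """Projected gradient with a secant-type local Lipschitz estimate.

    Illustrative only: the estimate below is a generic heuristic, not the
    update rule analyzed in the paper. No line search is used and the true
    Lipschitz constant is never supplied.
    """
    x = project_box(x0, lo, hi)
    g = grad(x)
    L = L0  # assumed initial guess; may underestimate the true constant
    for _ in range(iters):
        # Projected gradient step with stepsize 1/L.
        x_new = project_box([xi - gi / L for xi, gi in zip(x, g)], lo, hi)
        g_new = grad(x_new)
        dx = norm([a - b for a, b in zip(x_new, x)])
        if dx < tol:  # iterate has (numerically) stopped moving
            break
        dg = norm([a - b for a, b in zip(g_new, g)])
        # First-order information from the last two iterates gives a lower
        # bound on the Lipschitz constant; keep the largest value seen.
        L = max(L, dg / dx)
        x, g = x_new, g_new
    # Norm of the gradient mapping, one common stationarity measure
    # for constrained problems.
    x_plus = project_box([xi - gi / L for xi, gi in zip(x, g)], lo, hi)
    gm = L * norm([a - b for a, b in zip(x, x_plus)])
    return x, L, gm

def grad(x):
    """Gradient of the nonconvex test function f(x) = sum(x_i^2 + 3 sin^2(x_i))."""
    return [2.0 * xi + 3.0 * math.sin(2.0 * xi) for xi in x]

x_out, L_out, gm_out = pg_with_lipschitz_estimate(grad, [1.5, -1.0], -2.0, 2.0)
```

For this test function the gradient is Lipschitz with constant 8, so the estimate never exceeds 8 up to rounding. The sketch makes no claim about the complexity bounds stated in the abstract.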

This paper has not been read by Pith yet.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Stochastic Auto-conditioned Fast Gradient Methods with Optimal Rates

    math.OC 2026-04 unverdicted novelty 8.0

    Stochastic AC-FGM achieves optimal O(1/√ε) iteration complexity and O(1/ε²) sample complexity while being fully adaptive to smoothness, horizon, and noise under bounded conditional variance.

  2. Adaptive Newton-CG methods with global and local analysis for unconstrained optimization with Hölder continuous Hessian

    math.OC 2026-04 unverdicted novelty 7.0

    Adaptive Newton-CG methods achieve the best-known iteration complexity for ε-stationary points in nonconvex optimization with Hölder continuous Hessians while ensuring local superlinear convergence.