Convergence of Adam for non-convex objectives: Relaxed hyperparameters and non-ergodic case

2023 · arXiv 2307.11782

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

On the Convergence of Muon and Beyond

cs.LG · 2025-09-19 · unverdicted · novelty 7.0

Muon-MVR2 attains the optimal anytime convergence rate of Õ(T^{-1/3}) in stochastic non-convex settings under horizon-free schedules.

OptMuon: Closed-Loop Orthogonalized Momentum Methods for Stochastic Optimization with Zero-Noise Optimality

math.OC · 2026-06-07 · unverdicted · novelty 6.0

OptMuon combines orthogonalized momentum with trajectory-dependent AdaGrad-Norm adaptation to obtain expected-stationarity rates of order T^{-1/2} + sigma^{1/2}T^{-1/4} or T^{-1/2} + sigma^{1/3}T^{-1/3} that reduce to near-optimal deterministic first-order rates in the zero-noise regime.

citing papers explorer


On the Convergence of Muon and Beyond

cs.LG · 2025-09-19 · unverdicted · none · ref 18

OptMuon: Closed-Loop Orthogonalized Momentum Methods for Stochastic Optimization with Zero-Noise Optimality

math.OC · 2026-06-07 · unverdicted · none · ref 84
