pith. machine review for the scientific record.

arXiv: 1704.08227 · v2 · submitted 2017-04-26 · 📊 stat.ML · cs.LG · math.OC · math.ST · stat.TH

Recognition: unknown

Accelerating Stochastic Gradient Descent For Least Squares Regression

Authors on Pith: no claims yet
classification 📊 stat.ML · cs.LG · math.OC · math.ST · stat.TH
keywords stochastic · gradient · accelerated · descent · acceleration · characterization · least · made
0 comments
read the original abstract

There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g. Nesterov's acceleration, conjugate gradient, heavy ball) for the purposes of stochastic optimization due to their instability and error accumulation, a notion made precise in d'Aspremont 2008 and Devolder, Glineur, and Nesterov 2014. This work considers these issues for the special case of stochastic approximation for the least squares regression problem, and our main result refutes the conventional wisdom by showing that acceleration can be made robust to statistical errors. In particular, this work introduces an accelerated stochastic gradient method that provably achieves the minimax optimal statistical risk faster than stochastic gradient descent. Critical to the analysis is a sharp characterization of accelerated stochastic gradient descent as a stochastic process. We hope this characterization gives insights towards the broader question of designing simple and effective accelerated stochastic methods for more general convex and non-convex optimization problems.
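The sketch below is not the paper's algorithm: it is only a minimal, generic Nesterov-style accelerated SGD on a synthetic least squares stream, included to make the setting concrete. The data model, step sizes, and momentum constant are illustrative assumptions, not values from the paper.

```python
# A minimal sketch (NOT the paper's method) contrasting plain SGD with a
# generic Nesterov-style accelerated SGD on streaming least squares.
# All constants below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, n, noise = 20, 20000, 0.1
w_star = rng.normal(size=d)  # ground-truth regressor

def stream(n):
    """Yield (x, y) pairs from a Gaussian linear model with additive noise."""
    for _ in range(n):
        x = rng.normal(size=d)
        y = x @ w_star + noise * rng.normal()
        yield x, y

def sgd(step=0.01):
    w = np.zeros(d)
    for x, y in stream(n):
        # stochastic gradient of 0.5 * (y - <w, x>)^2 is (x@w - y) * x
        w -= step * (x @ w - y) * x
    return w

def accelerated_sgd(step=0.002, momentum=0.9):
    """Nesterov-style update: the gradient is taken at a look-ahead point."""
    w = np.zeros(d)
    v = np.zeros(d)  # momentum buffer
    for x, y in stream(n):
        z = w + momentum * v      # look-ahead iterate
        g = (x @ z - y) * x       # stochastic gradient at z
        v = momentum * v - step * g
        w = w + v
    return w

print("SGD error: ", np.linalg.norm(sgd() - w_star))
print("ASGD error:", np.linalg.norm(accelerated_sgd() - w_star))
```

On this toy stream both methods converge to w_star up to the noise floor; the paper's contribution is a provably robust accelerated recursion that reaches the minimax optimal statistical risk faster than SGD, which a generic momentum sketch like this does not establish.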

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SHANG++: Robust Stochastic Acceleration under Multiplicative Noise

    math.OC · 2026-03 · unverdicted · novelty 6.0

    SHANG++ delivers faster convergence and stronger robustness to multiplicative noise in stochastic optimization for both convex and strongly convex problems, with explicit parameters and competitive deep-learning results.