A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)

Aaron Sidford; Praneeth Netrapalli; Prateek Jain; Rahul Kidambi; Sham M. Kakade; Venkata Krishna Pillutla

arXiv: 1710.09430 · v2 · submitted 2017-10-25 · stat.ML · cs.LG · math.OC


keywords: optimality, stochastic, characterizing, descent, gradient, least, minimax, process


This work provides a simplified proof of the statistical minimax optimality of (iterate averaged) stochastic gradient descent (SGD), for the special case of least squares. This result is obtained by analyzing SGD as a stochastic process and by sharply characterizing the stationary covariance matrix of this process. The finite rate optimality characterization captures the constant factors and addresses model mis-specification.
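As a rough illustration of the procedure the abstract refers to, the following is a minimal sketch of constant-step-size SGD on a least squares objective, with a running average of the iterates. Everything specific here is an assumption made for the example and is not taken from the paper: the Gaussian features, the well-specified linear model, the step size, the choice to average over all iterates from the start, and the helper names `make_samples`, `averaged_sgd_least_squares`, and `l2_error`.

```python
import random


def make_samples(n, w_true, noise_std, rng):
    """Draw n pairs (x, y) with y = <w_true, x> + Gaussian noise (assumed model)."""
    samples = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        y = sum(wi * xi for wi, xi in zip(w_true, x)) + rng.gauss(0.0, noise_std)
        samples.append((x, y))
    return samples


def averaged_sgd_least_squares(samples, dim, step_size):
    """Single pass of SGD on the squared loss; returns (last iterate, averaged iterate)."""
    w = [0.0] * dim
    w_avg = [0.0] * dim
    for t, (x, y) in enumerate(samples, start=1):
        # Stochastic gradient of 0.5 * (<w, x> - y)^2 is (<w, x> - y) * x.
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - step_size * residual * xi for wi, xi in zip(w, x)]
        # Running mean of the iterates w_1, ..., w_t.
        w_avg = [a + (wi - a) / t for a, wi in zip(w_avg, w)]
    return w, w_avg


def l2_error(w, w_ref):
    """Euclidean distance between two parameter vectors."""
    return sum((a - b) ** 2 for a, b in zip(w, w_ref)) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
w_true = [1.0, -2.0, 0.5]  # hypothetical ground-truth parameters
samples = make_samples(20000, w_true, noise_std=0.5, rng=rng)
w_last, w_avg = averaged_sgd_least_squares(samples, dim=3, step_size=0.05)

print("last-iterate error:", l2_error(w_last, w_true))
print("averaged-iterate error:", l2_error(w_avg, w_true))
```

With a constant step size the last iterate keeps fluctuating around the minimizer, which is the stationary behavior the abstract describes in terms of a covariance matrix, while the average of the iterates smooths those fluctuations out.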
