Iterate averaging as regularization for stochastic gradient descent

Gergely Neu; Lorenzo Rosasco

arxiv: 1802.08009 · v1 · pith:C3YJS7WBnew · submitted 2018-02-22 · 💻 cs.LG · stat.ML

classification cs.LG stat.ML

keywords averaging gradient stochastic average methods regression scheme analyze

We propose and analyze a variant of the classic Polyak-Ruppert averaging scheme, broadly used in stochastic gradient methods. Rather than a uniform average of the iterates, we consider a weighted average, with weights decaying in a geometric fashion. In the context of linear least squares regression, we show that this averaging scheme has the same regularizing effect as, and indeed is asymptotically equivalent to, ridge regression. In particular, we derive finite-sample bounds for the proposed approach that match the best known results for regularized stochastic gradient methods.
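
The link between geometric averaging and ridge regression can be illustrated with a minimal sketch. It is not the paper's algorithm or analysis: it assumes noiseless full-batch gradient descent instead of stochastic gradients, starts from w_0 = 0, and puts weight proportional to gamma**t on iterate w_t. Under these assumptions a short calculation gives the ridge solution with lam = (1 - gamma) / (eta * gamma); that mapping is derived for this simplified setting only. The function names, data, and parameter values are hypothetical.

```python
import numpy as np


def geometric_average_gd(X, y, eta, gamma, num_steps):
    """Full-batch gradient descent on least squares, started at zero,
    returning the average of the iterates with weights proportional to gamma**t.
    (Illustrative simplification: the paper studies stochastic gradients.)"""
    n, d = X.shape
    Sigma = X.T @ X / n          # empirical second-moment matrix
    b = X.T @ y / n
    w = np.zeros(d)              # iterate w_0 = 0
    w_bar = np.zeros(d)          # running weighted sum of iterates
    weight, weight_sum = 1.0, 0.0
    for _ in range(num_steps):
        w_bar += weight * w      # add gamma**t * w_t
        weight_sum += weight
        w = w - eta * (Sigma @ w - b)   # gradient step on (1/2n)||Xw - y||^2
        weight *= gamma
    return w_bar / weight_sum    # normalize the truncated geometric weights


def ridge(X, y, lam):
    """Minimizer of (1/2n)||Xw - y||^2 + (lam/2)||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


# Hypothetical synthetic data, used only for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

eta = 1.0 / np.linalg.eigvalsh(X.T @ X / len(y)).max()  # safe step size
gamma = 0.9

w_avg = geometric_average_gd(X, y, eta, gamma, num_steps=500)
w_ridge = ridge(X, y, lam=(1 - gamma) / (eta * gamma))
gap = np.max(np.abs(w_avg - w_ridge))
print(gap)
```

In this deterministic setting the two vectors agree up to truncation and floating-point error, and moving gamma toward 1 corresponds to weaker regularization. With stochastic gradients the agreement would hold only approximately, which is the regime the finite-sample bounds in the abstract address.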
