Batched Stochastic Gradient Descent with Weighted Sampling

Deanna Needell; Rachel Ward

arXiv: 1608.07641 · v2 · submitted 2016-08-27 · math.NA · cs.NA



keywords: sampling, batched, weighted, computationally, descent, gradient, stochastic, alone


Abstract

We analyze a batched variant of Stochastic Gradient Descent (SGD) with weighted sampling distribution for smooth and non-smooth objective functions. We show that by distributing the batches computationally, a significant speedup in the convergence rate is provably possible compared to either batched sampling or weighted sampling alone. We propose several computationally efficient schemes to approximate the optimal weights, and compute proposed sampling distributions explicitly for the least squares and hinge loss problems. We show both analytically and experimentally that substantial gains can be obtained.
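To make the idea of combining batches with weighted sampling concrete, here is a minimal sketch for the least squares case, written in plain Python. It is an illustration under stated assumptions, not the authors' exact scheme: the rows are split into consecutive batches, each batch is drawn with probability proportional to the squared Frobenius norm of its rows (a simple upper bound on the batch Lipschitz constant, chosen here for convenience), and the batch gradient is divided by that probability so the update stays unbiased. The function name `batched_weighted_sgd`, the step size `1 / total`, and the consecutive partitioning are all hypothetical choices made for this example.

```python
import random


def batched_weighted_sgd(A, b, batch_size, steps, seed=0):
    """Sketch of batched SGD with weighted batch sampling for
    F(x) = 0.5 * ||A x - b||^2, with A given as a list of rows.

    Assumptions (not taken from the paper): consecutive batches,
    weights from squared Frobenius norms, fixed step size 1 / total.
    """
    rng = random.Random(seed)
    n, dim = len(A), len(A[0])

    # Partition the row indices into consecutive batches.
    batches = [list(range(s, min(s + batch_size, n)))
               for s in range(0, n, batch_size)]

    # Squared Frobenius norm of each batch: an upper bound on the
    # Lipschitz constant of that batch's gradient.
    L = [sum(A[i][k] ** 2 for i in batch for k in range(dim))
         for batch in batches]
    total = sum(L)
    probs = [Lj / total for Lj in L]
    step = 1.0 / total

    x = [0.0] * dim
    for _ in range(steps):
        # Weighted sampling: pick a batch with probability probs[j].
        j = rng.choices(range(len(batches)), weights=probs)[0]

        # Gradient of 0.5 * ||A_batch x - b_batch||^2.
        grad = [0.0] * dim
        for i in batches[j]:
            r = sum(A[i][k] * x[k] for k in range(dim)) - b[i]
            for k in range(dim):
                grad[k] += r * A[i][k]

        # Dividing by probs[j] keeps the gradient estimate unbiased.
        scale = step / probs[j]
        x = [x[k] - scale * grad[k] for k in range(dim)]
    return x
```

With these choices the effective step for batch j is 1 / L[j], so batches with large rows take proportionally smaller steps while being visited more often. On a consistent system (one where A x = b has an exact solution) the iterates approach that solution; for noisy data a decaying or smaller step size would be needed.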
