
arxiv: 1901.09865 · v3 · pith:25UHMCGDnew · submitted 2019-01-28 · 🧮 math.OC · cs.DC

Asynchronous Accelerated Proximal Stochastic Gradient for Strongly Convex Distributed Finite Sums

keywords: ADFS, finite, algorithms, distributed, functions, network, proximal, speed-up

In this work, we study the problem of minimizing the sum of strongly convex functions split over a network of $n$ nodes. We propose the decentralized and asynchronous algorithm ADFS to tackle the case when local functions are themselves finite sums with $m$ components. ADFS converges linearly when local functions are smooth, and matches the rates of the best known finite-sum algorithms when executed on a single machine. On several machines, ADFS enjoys an $O(\sqrt{n})$ or $O(n)$ speed-up, depending on the leading complexity term, as long as the diameter of the network is not too big with respect to $m$. This also leads to a $\sqrt{m}$ speed-up over state-of-the-art distributed batch methods, which is the expected speed-up for finite-sum algorithms. In terms of communication times and network parameters, ADFS scales as well as optimal distributed batch algorithms. As a side contribution, we give a generalized version of the accelerated proximal coordinate gradient algorithm using arbitrary sampling, which we apply to a well-chosen dual problem to derive ADFS. Yet ADFS uses primal proximal updates that only require solving one-dimensional problems for many standard machine learning applications. Finally, ADFS can be formulated for non-smooth objectives with equally good scaling properties. We illustrate the improvement of ADFS over state-of-the-art approaches with simulations.
