ErasureHead: Distributed Gradient Descent without Delays Using Approximate Gradient Coding

Dimitris Papailiopoulos; Hongyi Wang; Zachary Charles

arxiv: 1901.09671 · v1 · pith:SM6TLAROnew · submitted 2019-01-28 · 💻 cs.LG · cs.DC · cs.IT · math.IT · math.OC · stat.ML

Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos

keywords: gradient, distributed, erasurehead, approximate, coding, delay, coded, delays

We present ErasureHead, a new approach for distributed gradient descent (GD) that mitigates system delays by employing approximate gradient coding. Gradient-coded distributed GD uses redundancy to exactly recover the gradient at each iteration from a subset of compute nodes. ErasureHead instead uses approximate gradient codes to recover an inexact gradient at each iteration, but with higher delay tolerance. Unlike prior work on gradient coding, we provide a performance analysis that combines both delay and convergence guarantees. We establish that, down to a small noise floor, ErasureHead converges as quickly as distributed GD and has faster overall runtime under a probabilistic delay model. We conduct extensive experiments on real-world datasets and distributed clusters and demonstrate that our method can lead to significant speedups over both standard and gradient-coded GD.
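The contrast between exact and approximate recovery can be illustrated with a minimal sketch. It assumes a simple replication layout in which consecutive workers form a group and every worker in a group computes the same block gradient; the function name `decode_gradient`, the scalar gradients, and the rescaling of the partial sum are all assumptions made for illustration and are not taken from the paper.

```python
def decode_gradient(block_grads, replication, finished):
    """Estimate the full gradient from the workers that responded in time.

    block_grads: one gradient value per group (hypothetical scalar gradients).
    replication: number of workers holding a copy of each block (assumed layout:
                 workers g*replication .. (g+1)*replication - 1 form group g).
    finished:    set of worker ids that returned before the master stopped waiting.
    """
    num_groups = len(block_grads)
    # A group is recovered as soon as any one of its replicas has finished.
    recovered = {w // replication for w in finished if w // replication < num_groups}
    if not recovered:
        return None  # nothing arrived; the caller could skip this iteration
    partial = sum(block_grads[g] for g in recovered)
    # Assumption of this sketch: rescale the partial sum so it estimates the
    # full sum. With every group recovered the result is the exact gradient.
    return partial * num_groups / len(recovered)


block_grads = [1.0, 2.0, 3.0]                       # three groups, two replicas each
exact = decode_gradient(block_grads, 2, {1, 2, 5})  # one replica per group suffices
approx = decode_gradient(block_grads, 2, {0, 3})    # group 2 is still straggling
```

In this sketch the master never has to wait for the slowest group: a missing group makes the estimate inexact rather than stalling the iteration, which is the trade-off the abstract describes.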

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Communication-Efficient Approximate Gradient Coding for Distributed Learning in Heterogeneous Systems
eess.SY · 2026-05 · unverdicted · novelty 5.0

Derives a closed-form optimal gradient coding structure and bit allocation strategy to minimize residual error under an unbiasedness constraint for communication-efficient distributed learning in heterogeneous systems.