Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication
We consider a large-scale matrix multiplication problem in which the computation is carried out on a distributed system with a master node and multiple worker nodes, and each worker can store parts of the input matrices. We propose a computation strategy that leverages ideas from coding theory to design intermediate computations at the worker nodes, in order to deal efficiently with straggling workers. The proposed strategy, named polynomial codes, achieves the optimum recovery threshold, defined as the minimum number of workers that the master needs to wait for in order to compute the output. Furthermore, by leveraging the algebraic structure of polynomial codes, we can map the reconstruction of the final output to a polynomial interpolation problem, which can be solved efficiently. Polynomial codes provide an order-wise improvement over the state of the art in terms of recovery threshold, and are also optimal in terms of several other metrics. We also extend this code to distributed convolution and show its order-wise optimality.
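The abstract does not spell out the construction, so the following is only a minimal sketch under stated assumptions: the product computed is C = AᵀB, A is split into m column blocks and B into n column blocks, worker i stores one polynomial combination of each at evaluation point x_i, and the master interpolates a degree mn - 1 polynomial from any mn finished workers. The function names, the choice of evaluation points 1, 2, ..., N, and the use of floating-point arithmetic (rather than a finite field) are illustrative assumptions, not taken from the source.

```python
import numpy as np

def encode(blocks, points, stride):
    """For each evaluation point x, return sum_j blocks[j] * x**(j * stride)."""
    return [sum(blk * x ** (j * stride) for j, blk in enumerate(blocks))
            for x in points]

def polynomial_code_matmul(A, B, m, n, num_workers, finished):
    """Recover A.T @ B from the workers listed in `finished` (hypothetical API).

    Assumes m divides A.shape[1], n divides B.shape[1], and at least
    m * n workers have finished.
    """
    A_blocks = np.split(A, m, axis=1)   # A = [A_0 ... A_{m-1}]
    B_blocks = np.split(B, n, axis=1)   # B = [B_0 ... B_{n-1}]
    # Assumed evaluation points: distinct reals 1..N.
    points = np.arange(1, num_workers + 1, dtype=float)

    # Worker i stores sum_j A_j x_i^j and sum_k B_k x_i^(k*m).
    A_enc = encode(A_blocks, points, 1)
    B_enc = encode(B_blocks, points, m)

    # Each finished worker returns one evaluation of the product polynomial
    # h(x) = sum_{j,k} A_j^T B_k x^(j + k*m), which has degree m*n - 1.
    results = {i: A_enc[i].T @ B_enc[i] for i in finished}

    k = m * n                           # recovery threshold in this sketch
    used = sorted(results)[:k]
    if len(used) < k:
        raise ValueError("need at least m * n finished workers")

    # Interpolate: solve the Vandermonde system for the k coefficient blocks.
    V = np.vander(points[used], k, increasing=True)
    Y = np.stack([results[i] for i in used])
    coeffs = np.linalg.solve(V, Y.reshape(k, -1)).reshape(Y.shape)

    # The coefficient of x^(j + k*m) is block (j, k) of C = A^T B.
    rows = [np.hstack([coeffs[j + kk * m] for kk in range(n)])
            for j in range(m)]
    return np.vstack(rows)

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 4)).astype(float)
B = rng.integers(-3, 4, size=(4, 6)).astype(float)
# 6 workers, 2 stragglers (workers 0 and 3); any 4 = m * n results suffice.
C = polynomial_code_matmul(A, B, m=2, n=2, num_workers=6, finished=[1, 2, 4, 5])
```

Real-valued Vandermonde systems become ill-conditioned as mn grows, so this sketch is only suitable for small examples.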
Forward citations
Cited by 1 Pith paper
- Random Khatri-Rao-Product Codes for Numerically-Stable Distributed Matrix Multiplication
RKRP codes are MDS with probability 1, have identical communication/encoding costs to prior codes, lower average decoding complexity than OrthoPoly, and show substantially lower reconstruction error in numerical tests.