Exploiting Redundant Computation in Communication-Avoiding Algorithms for Algorithm-Based Fault Tolerance
classification
💻 cs.DC
keywords
algorithms, communication-avoiding, number, redundant, algorithm, algorithm-based, allow, communications
abstract
Communication-avoiding algorithms perform redundant computations to minimize the number of inter-process communications. In this paper, we propose exploiting this redundancy for fault tolerance. We illustrate the idea with the QR factorization of tall and skinny matrices, and we evaluate the number of failures our algorithm can tolerate under different semantics.
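As a rough illustration of how redundant computation can translate into tolerated failures, the sketch below simulates a butterfly-style pairwise exchange among processes and counts how many processes hold an identical copy of each intermediate result after every step. The exchange pattern, the function names `butterfly_replication` and `tolerable_failures`, and the rule that a result survives as long as one copy remains are all assumptions made for this example; the abstract does not state which scheme or failure semantics the paper uses.

```python
from collections import Counter


def butterfly_replication(num_procs):
    """Simulate a butterfly-style reduction over num_procs processes.

    Returns one entry per step: the number of processes holding an
    identical copy of each intermediate result after that step.
    (Hypothetical helper for illustration only.)
    """
    if num_procs < 1 or num_procs & (num_procs - 1):
        raise ValueError("num_procs must be a power of two")

    # Each process starts with only its own block of rows, identified
    # here by the process rank.
    held = [frozenset([p]) for p in range(num_procs)]
    copies_per_step = []

    stride = 1
    while stride < num_procs:
        # Process p pairs with p XOR stride. Both partners combine the
        # same two inputs, so both end up with the same result: this is
        # the redundant computation.
        held = [held[p] | held[p ^ stride] for p in range(num_procs)]
        counts = Counter(held)
        copies_per_step.append(min(counts.values()))
        stride *= 2

    return copies_per_step


def tolerable_failures(copies):
    # Assumption: an intermediate result is recoverable as long as at
    # least one process holding a copy of it is still alive.
    return copies - 1


steps = butterfly_replication(8)
print(steps)                                   # copies after each step
print([tolerable_failures(c) for c in steps])  # failures tolerated per step
```

With 8 processes, the number of identical copies doubles at every step in this model, so the number of failures that can be absorbed grows as the computation proceeds. A different exchange pattern or a stricter failure model would give different counts.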