
arxiv: 1710.08255 · v2 · pith:URM2655Anew · submitted 2017-10-23 · 💻 cs.DS · cs.DC

Communication Efficient Checking of Big Data Operations

keywords operations · communication · data · aggregation · algorithms · analysis · average · bingmann

We propose fast probabilistic algorithms with low (i.e., sublinear in the input size) communication volume to check the correctness of operations in Big Data processing frameworks and distributed databases. Our checkers cover many of the commonly used operations, including sum, average, median, and minimum aggregation, as well as sorting, union, merge, and zip. An experimental evaluation of our implementation in Thrill (Bingmann et al., 2016) confirms the low overhead and high failure detection rate predicted by theoretical analysis.
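To make the idea of a low-communication probabilistic checker concrete, here is a minimal sketch of one way to check a distributed sort. It is an illustration under stated assumptions, not the algorithm from the paper or any Thrill API. It assumes the items are non-negative integers smaller than the modulus `P`, models each worker as a Python list, and uses a standard polynomial fingerprint (the product of `(r - x)` modulo a prime at a shared random point `r`) to test that the output is a permutation of the input. All function names are hypothetical. Each worker would only need to send one fingerprint and its boundary values, so the communication is independent of the input size.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime; modulus for all fingerprint arithmetic (assumption)

def local_fingerprint(items, r):
    """Product of (r - x) mod P over one worker's items; independent of item order."""
    fp = 1
    for x in items:
        fp = fp * ((r - x) % P) % P
    return fp

def combine(fingerprints):
    """Merge per-worker fingerprints into a global one by multiplying them mod P."""
    fp = 1
    for f in fingerprints:
        fp = fp * f % P
    return fp

def locally_sorted(part):
    """True if one worker's output partition is in non-decreasing order."""
    return all(part[i] <= part[i + 1] for i in range(len(part) - 1))

def check_sort(input_parts, output_parts, seed=None):
    """Accept iff the output looks like a sorted permutation of the input.

    A correct result is always accepted. An output whose multiset differs from
    the input is wrongly accepted with probability at most n / P, where n is
    the number of items, because two distinct degree-n polynomials agree on at
    most n points.
    """
    r = random.Random(seed).randrange(P)  # shared random evaluation point

    # Permutation check: compare global fingerprints of input and output.
    fp_in = combine(local_fingerprint(p, r) for p in input_parts)
    fp_out = combine(local_fingerprint(p, r) for p in output_parts)

    # Sortedness check: each partition is sorted locally, and the last item of
    # each non-empty partition does not exceed the first item of the next one.
    nonempty = [p for p in output_parts if p]
    in_order = all(locally_sorted(p) for p in nonempty) and all(
        a[-1] <= b[0] for a, b in zip(nonempty, nonempty[1:])
    )
    return fp_in == fp_out and in_order
```

For example, `check_sort([[3, 1], [2, 5]], [[1, 2], [3, 5]])` accepts, while an output that silently changes a value or breaks the order is rejected (the former with high probability, the latter always).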
