For p-moments with positive updates, achieves Õ(ε^{-2} + log n) space for p ≤ 1 without random order and Õ(ε^{-2}) max-communication for p in (1,2] in coordinator/blackboard models.
Revisiting Frequency Moment Estimation in Random Order Streams
1 Pith paper cite this work. Polarity classification is still indexing.
abstract
We revisit one of the classic problems in the data stream literature, namely, that of estimating the frequency moments $F_p$ for $0 < p < 2$ of an underlying $n$-dimensional vector presented as a sequence of additive updates in a stream. It is well-known that using $p$-stable distributions one can approximate any of these moments up to a multiplicative $(1+\epsilon)$-factor using $O(\epsilon^{-2} \log n)$ bits of space, and this space bound is optimal up to a constant factor in the turnstile streaming model. We show that surprisingly, if one instead considers the popular random-order model of insertion-only streams, in which the updates to the underlying vector arrive in a random order, then one can beat this space bound and achieve $\tilde{O}(\epsilon^{-2} + \log n)$ bits of space, where the $\tilde{O}$ hides poly$(\log(1/\epsilon) + \log \log n)$ factors. If $\epsilon^{-2} \approx \log n$, this represents a roughly quadratic improvement in the space achievable in turnstile streams. Our algorithm is in fact deterministic, and we show our space bound is optimal up to poly$(\log(1/\epsilon) + \log \log n)$ factors for deterministic algorithms in the random order model. We also obtain a similar improvement in space for $p = 2$ whenever $F_2 \gtrsim \log n\cdot F_1$.
fields
cs.DS 1years
2019 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Towards Optimal Moment Estimation in Streaming and Distributed Models
For p-moments with positive updates, achieves Õ(ε^{-2} + log n) space for p ≤ 1 without random order and Õ(ε^{-2}) max-communication for p in (1,2] in coordinator/blackboard models.