pith. sign in

arxiv: 1803.02270 · v1 · pith:IZMPONFRnew · submitted 2018-03-06 · 💻 cs.DS

Revisiting Frequency Moment Estimation in Random Order Streams

classification 💻 cs.DS
keywords spaceepsilonboundmodelorderrandomstreamsbits
0
0 comments X
read the original abstract

We revisit one of the classic problems in the data stream literature, namely, that of estimating the frequency moments $F_p$ for $0 < p < 2$ of an underlying $n$-dimensional vector presented as a sequence of additive updates in a stream. It is well-known that using $p$-stable distributions one can approximate any of these moments up to a multiplicative $(1+\epsilon)$-factor using $O(\epsilon^{-2} \log n)$ bits of space, and this space bound is optimal up to a constant factor in the turnstile streaming model. We show that surprisingly, if one instead considers the popular random-order model of insertion-only streams, in which the updates to the underlying vector arrive in a random order, then one can beat this space bound and achieve $\tilde{O}(\epsilon^{-2} + \log n)$ bits of space, where the $\tilde{O}$ hides poly$(\log(1/\epsilon) + \log \log n)$ factors. If $\epsilon^{-2} \approx \log n$, this represents a roughly quadratic improvement in the space achievable in turnstile streams. Our algorithm is in fact deterministic, and we show our space bound is optimal up to poly$(\log(1/\epsilon) + \log \log n)$ factors for deterministic algorithms in the random order model. We also obtain a similar improvement in space for $p = 2$ whenever $F_2 \gtrsim \log n\cdot F_1$.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Towards Optimal Moment Estimation in Streaming and Distributed Models

    cs.DS 2019-07 unverdicted novelty 7.0

    For p-moments with positive updates, achieves Õ(ε^{-2} + log n) space for p ≤ 1 without random order and Õ(ε^{-2}) max-communication for p in (1,2] in coordinator/blackboard models.