pith. machine review for the scientific record. sign in

arxiv: 1603.05346 · v2 · submitted 2016-03-17 · 💻 cs.DS

Recognition: unknown

Optimal Quantile Approximation in Streams

Authors on Pith no claims yet
classification 💻 cs.DS
keywords varepsilonquantilesketchdeltaallowsapproximateconstructionitems
0
0 comments X
read the original abstract

This paper resolves one of the longest standing basic problems in the streaming computational model. Namely, optimal construction of quantile sketches. An $\varepsilon$ approximate quantile sketch receives a stream of items $x_1,\ldots,x_n$ and allows one to approximate the rank of any query up to additive error $\varepsilon n$ with probability at least $1-\delta$. The rank of a query $x$ is the number of stream items such that $x_i \le x$. The minimal sketch size required for this task is trivially at least $1/\varepsilon$. Felber and Ostrovsky obtain a $O((1/\varepsilon)\log(1/\varepsilon))$ space sketch for a fixed $\delta$. To date, no better upper or lower bounds were known even for randomly permuted streams or for approximating a specific quantile, e.g.,\ the median. This paper obtains an $O((1/\varepsilon)\log \log (1/\delta))$ space sketch and a matching lower bound. This resolves the open problem and proves a qualitative gap between randomized and deterministic quantile sketching. One of our contributions is a novel representation and modification of the widely used merge-and-reduce construction. This subtle modification allows for an analysis which is both tight and extremely simple. Similar techniques should be useful for improving other sketching objectives and geometric coreset constructions.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Icicle: Scalable Metadata Indexing and Real-Time Monitoring for HPC File Systems

    cs.DC 2026-04 unverdicted novelty 7.0

    Icicle delivers a unified, up-to-date metadata view for HPC file systems by ingesting both periodic snapshots and real-time events via Kafka and Flink, achieving order-of-magnitude throughput gains over prior batch tools.