pith. machine review for the scientific record. sign in

arxiv: 1703.01913 · v1 · submitted 2017-03-06 · 💻 cs.DS · cs.IT· cs.LG· math.IT· math.ST· stat.TH

Recognition: unknown

Near-Optimal Closeness Testing of Discrete Histogram Distributions

Authors on Pith no claims yet
classification 💻 cs.DS cs.ITcs.LGmath.ITmath.STstat.TH
keywords testingdistributionshistogramalgorithmboundclosenessdiscretehistograms
0
0 comments X
read the original abstract

We investigate the problem of testing the equivalence between two discrete histograms. A {\em $k$-histogram} over $[n]$ is a probability distribution that is piecewise constant over some set of $k$ intervals over $[n]$. Histograms have been extensively studied in computer science and statistics. Given a set of samples from two $k$-histogram distributions $p, q$ over $[n]$, we want to distinguish (with high probability) between the cases that $p = q$ and $\|p-q\|_1 \geq \epsilon$. The main contribution of this paper is a new algorithm for this testing problem and a nearly matching information-theoretic lower bound. Specifically, the sample complexity of our algorithm matches our lower bound up to a logarithmic factor, improving on previous work by polynomial factors in the relevant parameters. Our algorithmic approach applies in a more general setting and yields improved sample upper bounds for testing closeness of other structured distributions as well.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.