pith. sign in

arxiv: 1110.3100 · v1 · pith:DJ67MQEBnew · submitted 2011-10-14 · 💻 cs.DS

Telling Two Distributions Apart: a Tight Characterization

classification 💻 cs.DS
keywords samplesdistributionsbounddomainknownmanyupperaccess
0
0 comments X
read the original abstract

We consider the problem of distinguishing between two arbitrary black-box distributions defined over the domain [n], given access to $s$ samples from both. It is known that in the worst case O(n^{2/3}) samples is both necessary and sufficient, provided that the distributions have L1 difference of at least {\epsilon}. However, it is also known that in many cases fewer samples suffice. We identify a new parameter, that provides an upper bound on how many samples needed, and present an efficient algorithm that requires the number of samples independent of the domain size. Also for a large subclass of distributions we provide a lower bound, that matches our upper bound up to a poly-logarithmic factor.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.