{"paper":{"title":"Testing Closeness of Discrete Distributions","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":["math.PR","math.ST","stat.TH"],"primary_cat":"cs.DS","authors_text":"Lance Fortnow, Patrick White, Ronitt Rubinfeld, Tugkan Batu, Warren D. Smith","submitted_at":"2010-09-27T20:57:00Z","abstract_excerpt":"Given samples from two distributions over an $n$-element set, we wish to test whether these distributions are statistically close. We present an algorithm which uses sublinear in $n$, specifically, $O(n^{2/3}\\epsilon^{-8/3}\\log n)$, independent samples from each distribution, runs in time linear in the sample size, makes no assumptions about the structure of the distributions, and distinguishes the cases when the distance between the distributions is small (less than $\\max\\{\\epsilon^{4/3}n^{-1/3}/32, \\epsilon n^{-1/2}/4\\}$) or large (more than $\\epsilon$) in $\\ell_1$ distance. This result can "},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"1009.5397","kind":"arxiv","version":2},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}