Testing Support Size More Efficiently Than Learning Histograms

Nathaniel Harms; Renato Ferreira Pinto Jr.

arxiv: 2410.18915 · v4 · pith:S2RDGBXPnew · submitted 2024-10-24 · 💻 cs.DS · cs.LG

Testing Support Size More Efficiently Than Learning Histograms

Renato Ferreira Pinto Jr. , Nathaniel Harms This is my paper

Pith reviewed 2026-05-23 19:16 UTC · model grok-4.3

classification 💻 cs.DS cs.LG

keywords support size testingdistribution testingsample complexityChebyshev polynomialshistogram learningproperty testingtotal variation distance

0 comments

The pith

Support size testing can be done with O(n/(ε log n) log(1/ε)) samples by analyzing Chebyshev approximations outside their standard range.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that testing whether an unknown distribution has support size at most n or is ε-far from any distribution with that support size requires fewer samples than learning the full histogram of the distribution. It achieves a sample complexity of O(n/(ε log n) log(1/ε)), which improves on the Θ(n/(ε² log n)) bound from histogram learning and comes close to the Ω(n/(ε log n)) lower bound. The same approach also produces stronger lower bounds on the support size given a fixed number of samples. The improvement rests on extending the analysis of Chebyshev polynomial approximations beyond the interval where they are known to work well. A reader would care because the result separates the sample needs of testing this property from those of learning the entire distribution.

Core claim

Testing whether p is supported on at most n elements or ε-far from any such distribution in total variation distance can be performed with O(n/(ε log n) log(1/ε)) samples from p; the algorithm also yields better lower bounds on support size than prior methods, and the proof proceeds by analyzing Chebyshev polynomial approximations outside the range where they were originally designed to approximate well.

What carries the argument

Chebyshev polynomial approximations analyzed outside their designed range to bound the tester's sample complexity.

If this is right

Support size testing requires only O(n/(ε log n) log(1/ε)) samples rather than the Θ(n/(ε² log n)) needed for histogram learning.
The achieved upper bound nearly matches the known Ω(n/(ε log n)) lower bound.
The same method produces larger lower bounds on support size than what follows from previous histogram-based approaches.
The Chebyshev polynomial technique supplies a self-contained proof for the improved tester.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The efficiency gain may indicate that testing other simple distribution properties can also avoid the full cost of learning.
Extensions of the Chebyshev analysis could tighten bounds for related testing tasks such as uniformity or monotonicity.
The separation between testing and learning sample complexities might be quantified more sharply for support-size-like properties.

Load-bearing premise

The analysis of Chebyshev polynomial approximations outside the range where they are designed to be good approximations is valid and directly yields the improved sample complexity for support size testing.

What would settle it

A concrete distribution for which any tester using o(n/(ε log n) log(1/ε)) samples fails to distinguish support size at most n from being ε-far would falsify the claimed upper bound.

read the original abstract

Consider two problems about an unknown probability distribution $p$: 1. How many samples from $p$ are required to test if $p$ is supported on $n$ elements or not? Specifically, given samples from $p$, determine whether it is supported on at most $n$ elements, or it is "$\epsilon$-far" (in total variation distance) from being supported on $n$ elements. 2. Given $m$ samples from $p$, what is the largest lower bound on its support size that we can produce? The best known upper bound for problem (1) uses a general algorithm for learning the histogram of the distribution $p$, which requires $\Theta(\tfrac{n}{\epsilon^2 \log n})$ samples. We show that testing can be done more efficiently than learning the histogram, using only $O(\tfrac{n}{\epsilon \log n} \log(1/\epsilon))$ samples, nearly matching the best known lower bound of $\Omega(\tfrac{n}{\epsilon \log n})$. This algorithm also provides a better solution to problem (2), producing larger lower bounds on support size than what follows from previous work. The proof relies on an analysis of Chebyshev polynomial approximations outside the range where they are designed to be good approximations, and the paper is intended as an accessible self-contained exposition of the Chebyshev polynomial method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper improves support size testing to O(n log(1/ε)/(ε log n)) samples via Chebyshev analysis outside the usual interval, beating the histogram learning bound by a 1/ε factor.

read the letter

This paper shows support size testing can be done with O(n log(1/ε) / (ε log n)) samples, improving on the Θ(n / (ε² log n)) bound that comes from learning the full histogram. The gain comes from a direct analysis of Chebyshev polynomial approximations used beyond their standard approximation interval, and the same method also yields stronger lower bounds on support size from a given number of samples. The write-up is self-contained and positions itself as an accessible treatment of the polynomial technique, which is a plus for readers who want the details without chasing multiple prior papers. The lower bound match is close, so the result is nearly tight in the stated regime. The main point to check is whether the out-of-range error terms in the Chebyshev analysis stay controlled enough to deliver the claimed 1/ε improvement without hidden extra factors; the abstract presents this as the core technical step, and the paper claims the derivation is complete. No other gaps stand out from the stated claims. This is targeted at people working in distribution property testing. It is a concrete, self-contained improvement on a standard bound and deserves a serious referee.

Referee Report

1 major / 1 minor

Summary. The paper claims that support-size testing (distinguishing whether an unknown distribution p has support size ≤n or is ε-far in TV distance from any distribution with support ≤n) can be performed with O(n/(ε log n) log(1/ε)) samples. This improves on the Θ(n/(ε² log n)) sample bound obtained by reducing to histogram learning and nearly matches the Ω(n/(ε log n)) lower bound. The same technique yields improved lower bounds on support size. The proof is presented as a self-contained analysis of Chebyshev polynomial approximations extended outside their standard approximation interval.

Significance. If the central analysis holds, the result is significant for distribution property testing: it separates testing from learning for a natural property and demonstrates that Chebyshev-based polynomial methods can yield near-optimal testers without requiring full histogram recovery. The self-contained exposition of the polynomial technique is a positive contribution to the literature.

major comments (1)

[§4] §4 (Chebyshev analysis outside the design interval): The central sample-complexity improvement from Θ(1/ε²) to O(1/ε) rests on the claim that the error incurred by the Chebyshev approximation when evaluated outside its nominal range remains small enough not to re-introduce an extra 1/ε factor. The manuscript states that the analysis is self-contained, but the provided derivation does not explicitly bound the out-of-range contribution (e.g., via a concrete lemma relating the minimax error on an enlarged interval to the final tester threshold). Without this bound, it is unclear whether the stated O(n/(ε log n) log(1/ε)) bound is achieved or whether hidden factors cancel the improvement.

minor comments (1)

[§1] The abstract and introduction use “n” both for the support-size parameter and (implicitly) for the universe size; a single clarifying sentence in §1 would remove ambiguity.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and for highlighting the significance of separating testing from learning in this setting. We address the sole major comment below.

read point-by-point responses

Referee: [§4] §4 (Chebyshev analysis outside the design interval): The central sample-complexity improvement from Θ(1/ε²) to O(1/ε) rests on the claim that the error incurred by the Chebyshev approximation when evaluated outside its nominal range remains small enough not to re-introduce an extra 1/ε factor. The manuscript states that the analysis is self-contained, but the provided derivation does not explicitly bound the out-of-range contribution (e.g., via a concrete lemma relating the minimax error on an enlarged interval to the final tester threshold). Without this bound, it is unclear whether the stated O(n/(ε log n) log(1/ε)) bound is achieved or whether hidden factors cancel the improvement.

Authors: We agree that an explicit bound on the out-of-range error would make the argument clearer and easier to verify. In the revised version we will add a short lemma (new Lemma 4.3) that directly relates the minimax error of the degree-d Chebyshev approximant on the enlarged interval [0,1+δ] (with δ = Θ(ε log n / n) as used by the tester) to the acceptance threshold. The lemma uses the standard bound |T_k(x)| ≤ (x + sqrt(x²-1))^k for x>1 together with the concrete choice of d and the approximation interval in our construction to show that the extraneous error is at most O(ε / log(1/ε)). This is absorbed into the existing O(log(1/ε)) factor without restoring a 1/ε² dependence, confirming that the claimed O(n/(ε log n) log(1/ε)) sample bound holds. We thank the referee for prompting this clarification. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation is self-contained polynomial analysis

full rationale

The paper derives an improved sample complexity for support size testing via analysis of Chebyshev polynomial approximations, presented as a self-contained exposition. No load-bearing steps reduce to self-citations, fitted inputs renamed as predictions, or self-definitional equivalences. The central upper bound O(n/(ε log n) log(1/ε)) is obtained from explicit approximation error bounds rather than by construction from the input histogram learner or prior author results. This matches the default expectation of no significant circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard tools from approximation theory and probability; no free parameters, new entities, or ad-hoc axioms are indicated in the abstract.

axioms (1)

standard math Standard properties of Chebyshev polynomials and total variation distance in distribution testing
The proof relies on extending Chebyshev approximations beyond their designed range.

pith-pipeline@v0.9.0 · 5777 in / 1273 out tokens · 36206 ms · 2026-05-23T19:16:21.376259+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

The proof relies on an analysis of Chebyshev polynomial approximations outside the range where they are designed to be good approximations... Pd(x) := −δ Td(ψ(x)) ... Q(x) := 1 + e^{-m x} P(x)
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We show that testing can be done more efficiently than learning the histogram, using only O(n/(ε log n) log(1/ε)) samples

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 1 internal anchor

[1]

A unified maximum likelihood approach for estimating symmetric properties of discrete distributions

Jayadev Acharya, Hirakendu Das, Alon Orlitsky, and Ananda Theertha Suresh. A unified maximum likelihood approach for estimating symmetric properties of discrete distributions. In , volume 70, pages 11--21, 2017

work page 2017
[2]

Refining the adaptivity notion in the huge object model

Tomer Adar and Eldar Fischer. Refining the adaptivity notion in the huge object model. In , 2024

work page 2024
[3]

Support testing in the huge object model

Tomer Adar, Eldar Fischer, and Amit Levi. Support testing in the huge object model. In , 2024

work page 2024
[4]

Distribution testing lower bounds via reductions from communication complexity

Eric Blais, Clément L Canonne, and Tom Gur. Distribution testing lower bounds via reductions from communication complexity. , 11(2):1--37, 2019

work page 2019
[5]

Learnability and the vapnik-chervonenkis dimension

Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K Warmuth. Learnability and the vapnik-chervonenkis dimension. Journal of the ACM (JACM) , 36(4):929--965, 1989

work page 1989
[6]

Estimating the number of species: a review

John Bunge and Michael Fitzpatrick. Estimating the number of species: a review. Journal of the American statistical Association , 88(421):364--373, 1993

work page 1993
[7]

VC dimension and distribution-free sample-based testing

Eric Blais, Renato Ferreira Pinto Jr , and Nathaniel Harms. VC dimension and distribution-free sample-based testing. In , 2021

work page 2021
[8]

A survey on distribution testing: Your data is big

Cl \'e ment Canonne. A survey on distribution testing: Your data is big. but is it blue? , 2020

work page 2020
[9]

Testing for families of distributions via the fourier transform

Cl \'e ment L Canonne, Ilias Diakonikolas, and Alistair Stewart. Testing for families of distributions via the fourier transform. In , 2018

work page 2018
[10]

The relation between the number of species and the number of individuals in a random sample of an animal population

Ronald A Fisher, A Steven Corbet, and Carrington B Williams. The relation between the number of species and the number of individuals in a random sample of an animal population. The Journal of Animal Ecology , pages 42--58, 1943

work page 1943
[11]

Distribution testing under the parity trace

Renato Ferreira Pinto Jr. and Nathaniel Harms. Distribution testing under the parity trace. arXiv preprint arXiv:2304.01374 , 2023

work page arXiv 2023
[12]

Property testing and its connection to learning and approximation

Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. , 45(4):653--750, 1998

work page 1998
[13]

Introduction to property testing

Oded Goldreich. Introduction to property testing . Cambridge University Press, 2017

work page 2017
[14]

On the complexity of estimating the effective support size

Oded Goldreich. On the complexity of estimating the effective support size. ECCC, TR19-088 , 2019

work page 2019
[15]

Testing Bipartitness in an Augmented VDF Bounded-Degree Graph Model

Oded Goldreich. Testing bipartitness in an augmented vdf bounded-degree graph model. arXiv preprint arXiv:1905.03070 , 2019

work page internal anchor Pith review Pith/arXiv arXiv 1905
[16]

On the estimation of the number of classes in a population

Leo A Goodman. On the estimation of the number of classes in a population. The Annals of Mathematical Statistics , 20(4):572--579, 1949

work page 1949
[17]

I.J. Good. The population frequencies of species and the estimation of population parameters. Biometrika , pages 237--264, 1953

work page 1953
[18]

On sample-based testers

Oded Goldreich and Dana Ron. On sample-based testers. , 8(2):1--54, 2016

work page 2016
[19]

On the relation between the relative earth mover distance and the variation distance (an exposition)

Oded Goldreich and Dana Ron. On the relation between the relative earth mover distance and the variation distance (an exposition). In Oded Goldreich, editor, Computational Complexity and Property Testing: On the Interplay Between Randomness and Computation , pages 141--151. Springer Cham, 2020

work page 2020
[20]

Testing distributions of huge objects

Oded Goldreich and Dana Ron. Testing distributions of huge objects. TheoretiCS , 2, 2023

work page 2023
[21]

The optimal sample complexity of PAC learning

Steve Hanneke. The optimal sample complexity of PAC learning. , 17(38):1--15, 2016

work page 2016
[22]

CSTheory stackexchange answer

Steve Hanneke. CSTheory stackexchange answer. https://cstheory.stackexchange.com/questions/40161/proper-pac-learning-vc-dimension-bounds, 2019. Accessed 2024-09-10

work page 2019
[23]

Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance

Yanjun Han, Jiantao Jiao, and Tsachy Weissman. Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance. In , pages 3189--3221. PMLR, 2018

work page 2018
[24]

The broad optimality of profile maximum likelihood

Yi Hao and Alon Orlitsky. The broad optimality of profile maximum likelihood. , 2019

work page 2019
[25]

Unified sample-optimal property estimation in near-linear time

Yi Hao and Alon Orlitsky. Unified sample-optimal property estimation in near-linear time. In , 2019

work page 2019
[26]

Online versus offline adversaries in property testing

Esty Kelman, Ephraim Linder, and Sofya Raskhodnikova. Online versus offline adversaries in property testing. In , pages 65:1--65:18, 2025

work page 2025
[27]

Estimating the effective support size in constant query complexity

Shyam Narayanan and Jakub T e tek. Estimating the effective support size in constant query complexity. In , pages 242--252, 2023

work page 2023
[28]

OEIS sequence A008310

OEIS . OEIS sequence A008310 . https://oeis.org/A008310. Accessed 2024-08-13

work page 2024
[29]

Tolerant property testing and distance approximation

Michal Parnas, Dana Ron, and Ronitt Rubinfeld. Tolerant property testing and distance approximation. Journal of Computer and System Sciences , 72(6):1012--1042, 2006

work page 2006
[30]

Almost optimal distribution-free sample-based testing of k-modality

Dana Ron and Asaf Rosin. Almost optimal distribution-free sample-based testing of k-modality. In , 2020

work page 2020
[31]

Optimal distribution-free sample-based testing of subsequence-freeness with one-sided error

Dana Ron and Asaf Rosin. Optimal distribution-free sample-based testing of subsequence-freeness with one-sided error. , 14(1):1--31, 2022

work page 2022
[32]

Strong lower bounds for approximating distribution support size and the distinct elements problem

Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. Strong lower bounds for approximating distribution support size and the distinct elements problem. , 39(3):813--842, 2009

work page 2009
[33]

Estimating the unseen: an n/ (n) -sample estimator for entropy and support size, shown optimal via new CLTs

Gregory Valiant and Paul Valiant. Estimating the unseen: an n/ (n) -sample estimator for entropy and support size, shown optimal via new CLTs . In , 2011

work page 2011
[34]

The power of linear estimators

Gregory Valiant and Paul Valiant. The power of linear estimators. In , 2011

work page 2011
[35]

Instance optimal learning of discrete distributions

Gregory Valiant and Paul Valiant. Instance optimal learning of discrete distributions. In , 2016

work page 2016
[36]

Estimating the unseen: improved estimators for entropy and other properties

Gregory Valiant and Paul Valiant. Estimating the unseen: improved estimators for entropy and other properties. , 2017

work page 2017
[37]

Chebyshev polynomials, moment matching, and optimal estimation of the unseen

Yihong Wu and Pengkun Yang. Chebyshev polynomials, moment matching, and optimal estimation of the unseen. The Annals of Statistics , 47(2):857--883, 2019

work page 2019
[38]

Polynomial methods in statistical inference: theory and practice

Yihong Wu and Pengkun Yang. Polynomial methods in statistical inference: theory and practice . Foundations and Trends in Communications and Information Theory. now Publishers, 2020

work page 2020

[1] [1]

A unified maximum likelihood approach for estimating symmetric properties of discrete distributions

Jayadev Acharya, Hirakendu Das, Alon Orlitsky, and Ananda Theertha Suresh. A unified maximum likelihood approach for estimating symmetric properties of discrete distributions. In , volume 70, pages 11--21, 2017

work page 2017

[2] [2]

Refining the adaptivity notion in the huge object model

Tomer Adar and Eldar Fischer. Refining the adaptivity notion in the huge object model. In , 2024

work page 2024

[3] [3]

Support testing in the huge object model

Tomer Adar, Eldar Fischer, and Amit Levi. Support testing in the huge object model. In , 2024

work page 2024

[4] [4]

Distribution testing lower bounds via reductions from communication complexity

Eric Blais, Clément L Canonne, and Tom Gur. Distribution testing lower bounds via reductions from communication complexity. , 11(2):1--37, 2019

work page 2019

[5] [5]

Learnability and the vapnik-chervonenkis dimension

Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K Warmuth. Learnability and the vapnik-chervonenkis dimension. Journal of the ACM (JACM) , 36(4):929--965, 1989

work page 1989

[6] [6]

Estimating the number of species: a review

John Bunge and Michael Fitzpatrick. Estimating the number of species: a review. Journal of the American statistical Association , 88(421):364--373, 1993

work page 1993

[7] [7]

VC dimension and distribution-free sample-based testing

Eric Blais, Renato Ferreira Pinto Jr , and Nathaniel Harms. VC dimension and distribution-free sample-based testing. In , 2021

work page 2021

[8] [8]

A survey on distribution testing: Your data is big

Cl \'e ment Canonne. A survey on distribution testing: Your data is big. but is it blue? , 2020

work page 2020

[9] [9]

Testing for families of distributions via the fourier transform

Cl \'e ment L Canonne, Ilias Diakonikolas, and Alistair Stewart. Testing for families of distributions via the fourier transform. In , 2018

work page 2018

[10] [10]

The relation between the number of species and the number of individuals in a random sample of an animal population

Ronald A Fisher, A Steven Corbet, and Carrington B Williams. The relation between the number of species and the number of individuals in a random sample of an animal population. The Journal of Animal Ecology , pages 42--58, 1943

work page 1943

[11] [11]

Distribution testing under the parity trace

Renato Ferreira Pinto Jr. and Nathaniel Harms. Distribution testing under the parity trace. arXiv preprint arXiv:2304.01374 , 2023

work page arXiv 2023

[12] [12]

Property testing and its connection to learning and approximation

Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. , 45(4):653--750, 1998

work page 1998

[13] [13]

Introduction to property testing

Oded Goldreich. Introduction to property testing . Cambridge University Press, 2017

work page 2017

[14] [14]

On the complexity of estimating the effective support size

Oded Goldreich. On the complexity of estimating the effective support size. ECCC, TR19-088 , 2019

work page 2019

[15] [15]

Testing Bipartitness in an Augmented VDF Bounded-Degree Graph Model

Oded Goldreich. Testing bipartitness in an augmented vdf bounded-degree graph model. arXiv preprint arXiv:1905.03070 , 2019

work page internal anchor Pith review Pith/arXiv arXiv 1905

[16] [16]

On the estimation of the number of classes in a population

Leo A Goodman. On the estimation of the number of classes in a population. The Annals of Mathematical Statistics , 20(4):572--579, 1949

work page 1949

[17] [17]

I.J. Good. The population frequencies of species and the estimation of population parameters. Biometrika , pages 237--264, 1953

work page 1953

[18] [18]

On sample-based testers

Oded Goldreich and Dana Ron. On sample-based testers. , 8(2):1--54, 2016

work page 2016

[19] [19]

On the relation between the relative earth mover distance and the variation distance (an exposition)

Oded Goldreich and Dana Ron. On the relation between the relative earth mover distance and the variation distance (an exposition). In Oded Goldreich, editor, Computational Complexity and Property Testing: On the Interplay Between Randomness and Computation , pages 141--151. Springer Cham, 2020

work page 2020

[20] [20]

Testing distributions of huge objects

Oded Goldreich and Dana Ron. Testing distributions of huge objects. TheoretiCS , 2, 2023

work page 2023

[21] [21]

The optimal sample complexity of PAC learning

Steve Hanneke. The optimal sample complexity of PAC learning. , 17(38):1--15, 2016

work page 2016

[22] [22]

CSTheory stackexchange answer

Steve Hanneke. CSTheory stackexchange answer. https://cstheory.stackexchange.com/questions/40161/proper-pac-learning-vc-dimension-bounds, 2019. Accessed 2024-09-10

work page 2019

[23] [23]

Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance

Yanjun Han, Jiantao Jiao, and Tsachy Weissman. Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance. In , pages 3189--3221. PMLR, 2018

work page 2018

[24] [24]

The broad optimality of profile maximum likelihood

Yi Hao and Alon Orlitsky. The broad optimality of profile maximum likelihood. , 2019

work page 2019

[25] [25]

Unified sample-optimal property estimation in near-linear time

Yi Hao and Alon Orlitsky. Unified sample-optimal property estimation in near-linear time. In , 2019

work page 2019

[26] [26]

Online versus offline adversaries in property testing

Esty Kelman, Ephraim Linder, and Sofya Raskhodnikova. Online versus offline adversaries in property testing. In , pages 65:1--65:18, 2025

work page 2025

[27] [27]

Estimating the effective support size in constant query complexity

Shyam Narayanan and Jakub T e tek. Estimating the effective support size in constant query complexity. In , pages 242--252, 2023

work page 2023

[28] [28]

OEIS sequence A008310

OEIS . OEIS sequence A008310 . https://oeis.org/A008310. Accessed 2024-08-13

work page 2024

[29] [29]

Tolerant property testing and distance approximation

Michal Parnas, Dana Ron, and Ronitt Rubinfeld. Tolerant property testing and distance approximation. Journal of Computer and System Sciences , 72(6):1012--1042, 2006

work page 2006

[30] [30]

Almost optimal distribution-free sample-based testing of k-modality

Dana Ron and Asaf Rosin. Almost optimal distribution-free sample-based testing of k-modality. In , 2020

work page 2020

[31] [31]

Optimal distribution-free sample-based testing of subsequence-freeness with one-sided error

Dana Ron and Asaf Rosin. Optimal distribution-free sample-based testing of subsequence-freeness with one-sided error. , 14(1):1--31, 2022

work page 2022

[32] [32]

Strong lower bounds for approximating distribution support size and the distinct elements problem

Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. Strong lower bounds for approximating distribution support size and the distinct elements problem. , 39(3):813--842, 2009

work page 2009

[33] [33]

Estimating the unseen: an n/ (n) -sample estimator for entropy and support size, shown optimal via new CLTs

Gregory Valiant and Paul Valiant. Estimating the unseen: an n/ (n) -sample estimator for entropy and support size, shown optimal via new CLTs . In , 2011

work page 2011

[34] [34]

The power of linear estimators

Gregory Valiant and Paul Valiant. The power of linear estimators. In , 2011

work page 2011

[35] [35]

Instance optimal learning of discrete distributions

Gregory Valiant and Paul Valiant. Instance optimal learning of discrete distributions. In , 2016

work page 2016

[36] [36]

Estimating the unseen: improved estimators for entropy and other properties

Gregory Valiant and Paul Valiant. Estimating the unseen: improved estimators for entropy and other properties. , 2017

work page 2017

[37] [37]

Chebyshev polynomials, moment matching, and optimal estimation of the unseen

Yihong Wu and Pengkun Yang. Chebyshev polynomials, moment matching, and optimal estimation of the unseen. The Annals of Statistics , 47(2):857--883, 2019

work page 2019

[38] [38]

Polynomial methods in statistical inference: theory and practice

Yihong Wu and Pengkun Yang. Polynomial methods in statistical inference: theory and practice . Foundations and Trends in Communications and Information Theory. now Publishers, 2020

work page 2020