Improved Distribution Estimation in ell_infty
Pith reviewed 2026-06-29 05:12 UTC · model grok-4.3
The pith
Improved minimax bounds for discrete distribution estimation under the infinity norm close several prior open questions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Improved bounds are derived for estimating discrete probability distributions under the ℓ∞ norm. These include minimax bounds in expectation and high-probability tail bounds. A fully empirical version of the tightest risk bound is provided along with the explicit form of the worst-case extremal distribution.
What carries the argument
The fully empirical tightest risk bound together with the identified worst-case extremal distribution, which together replace the non-empirical version from prior analysis.
If this is right
- The new bounds can be computed directly from the observed samples without knowledge of any unknown distribution parameters.
- High-probability guarantees now hold uniformly over all possible underlying distributions.
- The explicit form of the worst-case distribution can be used to verify the sharpness of the bound through direct simulation.
- Estimation procedures can be tuned to achieve the improved rate without additional assumptions on the support size.
Where Pith is reading between the lines
- The same empirical-bound technique might simplify analysis of related estimation problems that currently rely on non-empirical constants.
- Testing whether real data sets produce risk close to the identified extremal case would indicate when the bound is nearly tight in practice.
- Extensions to continuous or structured domains could reuse the extremal-distribution construction if the infinity-norm geometry is preserved.
Load-bearing premise
The sampling model, risk definitions, and formulation of the open questions are taken directly from a referenced prior paper without independent re-derivation.
What would settle it
Compute the empirical risk achieved by the estimator on samples drawn from the identified extremal distribution and check whether it matches the value predicted by the new bound up to the stated constants.
Figures
read the original abstract
We present improved bounds for estimating discrete probability distributions under the $\ell_\infty$ norm. These include minimax bounds in expectation and high-probability tail bounds. We resolve some of the open questions posed in Kontorovich and Painsky (JMLR, 2025) -- including a fully empirical version of the tightest risk bound they presented and identifying the form of the worst-case extremal distribution. Encouraging empirical results are reported as well.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to present improved minimax bounds (in expectation and high-probability) for estimating discrete probability distributions under the ℓ_∞ norm. It asserts resolutions of open questions from Kontorovich and Painsky (JMLR 2025), specifically a fully empirical version of their tightest risk bound and the explicit form of the worst-case extremal distribution, together with encouraging empirical results.
Significance. Resolving the cited open questions with a fully empirical tight bound and an identified extremal distribution would be a useful incremental advance in the theory of ℓ_∞ distribution estimation, provided the derivations are correct and the inherited sampling model is appropriate. The empirical results, if reproducible, would lend additional support.
major comments (2)
- [Abstract] Abstract: the central claims (improved bounds, fully empirical tight risk bound, and identification of the extremal distribution) are asserted without any derivations, proofs, or data details supplied in the available text, so the resolutions of the Kontorovich-Painsky open questions cannot be verified.
- The manuscript inherits the sampling model and risk definitions from Kontorovich and Painsky (JMLR 2025) without re-derivation; no check is possible on whether the new bounds reduce to prior quantities or introduce hidden mismatches.
Simulated Author's Rebuttal
We thank the referee for their review. The full manuscript (beyond the abstract) contains the requested derivations, proofs, and empirical details in Sections 3–5. We address the comments point-by-point below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claims (improved bounds, fully empirical tight risk bound, and identification of the extremal distribution) are asserted without any derivations, proofs, or data details supplied in the available text, so the resolutions of the Kontorovich-Painsky open questions cannot be verified.
Authors: The full manuscript provides complete derivations: Theorem 3.1 gives the improved minimax bound in expectation, Theorem 3.3 the high-probability tail bound, Theorem 4.1 the fully empirical tight risk bound (resolving the open question), and Theorem 4.2 the explicit form of the worst-case extremal distribution. Section 5 reports the empirical results with full experimental details and code. These sections allow direct verification of the resolutions. revision: no
-
Referee: The manuscript inherits the sampling model and risk definitions from Kontorovich and Painsky (JMLR 2025) without re-derivation; no check is possible on whether the new bounds reduce to prior quantities or introduce hidden mismatches.
Authors: Section 2 explicitly restates the multinomial sampling model and ℓ_∞ risk from the cited paper for self-contained reading. Remark 3.2 shows that our bounds recover the prior quantities as special cases. No hidden mismatches are introduced, as the new results are derived directly from the same model. revision: no
Circularity Check
No significant circularity identified
full rationale
The provided abstract and context show the paper extends prior work by resolving open questions from Kontorovich and Painsky (JMLR 2025), inheriting sampling models and risk definitions. No equations, derivations, or specific reductions are quoted that match any enumerated circularity pattern (e.g., self-definitional, fitted input called prediction, or load-bearing self-citation forcing the result). Follow-up papers routinely cite overlapping-author prior work without circularity when the new contributions (empirical bounds, extremal distribution form) are independent resolutions. Per hard rules, absent explicit quotes exhibiting construction equivalence, score remains 0.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Tight bounds for local glivenko-cantelli
Moïse Blanchard and Václav Voráček. Tight bounds for local glivenko-cantelli. InAlgorithmic Learning Theory, ALT 2024, Proceedings of Machine Learning Research. PMLR,
2024
-
[2]
Correlated binomial process
Moïse Blanchard, Doron Cohen, and Aryeh Kontorovich. Correlated binomial process. In Shipra Agrawal and Aaron Roth, editors,The Thirty Seventh Annual Conference on Learning Theory, June 30 - July 3, 2023, Edmonton, Canada, volume 247 ofProceedings of Machine Learning Research, pages 551–595. PMLR,
2023
-
[3]
15 Doron Cohen and Aryeh Kontorovich
URLhttps://proceedings.mlr.press/v247/blanchard24a.html. 15 Doron Cohen and Aryeh Kontorovich. Local glivenko-cantelli. In Gergely Neu and Lorenzo Rosasco, editors,The Thirty Sixth Annual Conference on Learning Theory, COLT 2023, 12-15 July 2023, Bangalore, India, volume 195 ofProceedings of Machine Learning Research, page
2023
-
[4]
ISSN 1042-9832. doi: 10.1002/(SICI) 1098-2418(199809)13:2<99::AID-RSA1>3.0.CO;2-M. URL http://dx.doi.org/10.1002/(SICI) 1098-2418(199809)13:2<99::AID-RSA1>3.0.CO;2-M. Aurélien Garivier and Olivier Cappé. The KL-UCB algorithm for bounded stochastic bandits and beyond. InProceedings of the 24th Annual Conference on Learning Theory, volume 19 of Proceedings ...
-
[5]
16 A Additional Results & Auxiliary Lemmas Lemma A.1(Negative covariance of multinomial components; Dubhashi and Ranjan (1998)).Let {X (j)}n j=1 be n i.i.d
URL https://onlinelibrary.wiley.com/doi/abs/10.1002/ sta4.314. 16 A Additional Results & Auxiliary Lemmas Lemma A.1(Negative covariance of multinomial components; Dubhashi and Ranjan (1998)).Let {X (j)}n j=1 be n i.i.d. samples from a multinomial distribution with parameter vectorp= (p1, p2, . . .) and single trial (i.e.,X (j) ∼Multinomial (1, p)). Let Xi...
1998
-
[6]
From Kontorovich and Painsky (2025, Lemma 12), we know that vi(p↓)≤ 1 i+1 for any distributionp
Let δi = vi(p↓)−v i(q↓) . From Kontorovich and Painsky (2025, Lemma 12), we know that vi(p↓)≤ 1 i+1 for any distributionp. Sincex, y≥0 =⇒ |x−y| ≤max(x, y), we have δi ≤max(v i(p↓), vi(q↓))≤ 1 i+ 1 , which implies log(i +
2025
-
[7]
Proof.We will show that δ(p,ˆp)≤ ∥p−ˆp∥∞ ≤a+b √ ˆv∗ +b p δ(p,ˆp)
n + 1 n log 2 δ . Proof.We will show that δ(p,ˆp)≤ ∥p−ˆp∥∞ ≤a+b √ ˆv∗ +b p δ(p,ˆp). and from there apply the proof technique from Kontorovich and Painsky (2025, Theorem 4). First, it follows from that proof that ∥p−ˆp∥∞ ≤a+b √ ˆv∗ +b p |v∗ −ˆv∗| ≤a+b √ ˆv∗ +b p δ(p,ˆp). Next, for all distributionsp, qwe have |pi(1−p i)−q i(1−q i)|=|p i −q i + (qi −p i)(qi...
2025
-
[8]
Then for everyq∈[p,1], Pr Y n ≥q ≤exp −n DKL (q||p) , where forp, q∈[0,1], DKL (q||p) :=qlog q p + (1−q) log 1−q 1−p
Lemma A.5(Chernoff bound for Binomial variables).Let n∈N and let Y∼Bin (n, p)with p∈[0,1]. Then for everyq∈[p,1], Pr Y n ≥q ≤exp −n DKL (q||p) , where forp, q∈[0,1], DKL (q||p) :=qlog q p + (1−q) log 1−q 1−p . 20 Lemma A.6(Anti-concentration for Binomial upper tails Zhang and Zhou (2020)).For anyβ > 1 there exist constantscβ, Cβ >0, depending only onβ, su...
2020
-
[9]
K–P (Thm 2)
= c2j′ (j′ + 1)2k2 ≥ c2 2(j′ + 1)2k2−1 . Substituting this back into our lower bound for the probability of the union: P sup j (ˆpj −p j)≥τ ! ≥(1−e −1) 1∧ c2 2(j′ + 1)2k2−1 . This provides a polynomial lower bound in terms of the truncation indexj′, which is related to the tail decay of the distribution. B Experiments We compare our fully empirical high-p...
2025
-
[10]
K–P (Thm 4)
δ + logn n . (These constants are not optimized; the purpose of the experiments is a like-for-like comparison under the same plotting protocol.) Results.Our fully empirical bound consistently improves upon the benchmark curve across all distributions and confidence levels. In settings with heavier tails (Zipf and real-world datasets), our bound substantia...
2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.