Berry-Esseen bounds for estimators of entropy and diversity indices on countable alphabets
Pith reviewed 2026-05-10 15:17 UTC · model grok-4.3
The pith
Berry-Esseen bounds provide explicit non-asymptotic rates for plug-in and bias-corrected entropy estimators on countable alphabets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A general non-asymptotic convergence rate is established for the plug-in estimator of a wide class of indices, including Simpson's index and Rényi's entropy. For Shannon entropy, explicit Berry-Esseen bounds are provided for the standard plug-in estimator as well as the Miller-Madow and jackknife estimators under i.i.d. sampling from a distribution on a countable alphabet.
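For concreteness, here is a minimal Python sketch of the three Shannon-entropy estimators named above, using their textbook definitions with natural logarithms. The function names are illustrative, and it is an assumption that the paper's normalizations match these forms exactly.

```python
import math
from collections import Counter


def plugin_entropy(sample):
    """Plug-in (maximum-likelihood) Shannon entropy, in nats."""
    n = len(sample)
    counts = Counter(sample)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(sample):
    """Plug-in estimate plus the (m - 1) / (2n) bias correction,
    where m is the number of distinct symbols observed."""
    n = len(sample)
    m = len(set(sample))
    return plugin_entropy(sample) + (m - 1) / (2 * n)


def jackknife_entropy(sample):
    """Jackknife estimate: n * H_n - (n - 1) * (mean of leave-one-out estimates).
    Requires at least two observations."""
    n = len(sample)
    counts = Counter(sample)
    h_full = plugin_entropy(sample)
    loo_sum = 0.0
    # A leave-one-out estimate depends only on which symbol is removed,
    # so compute it once per distinct symbol and weight by its count.
    for symbol, c in counts.items():
        reduced = counts.copy()
        reduced[symbol] -= 1
        h_loo = -sum(
            (k / (n - 1)) * math.log(k / (n - 1))
            for k in reduced.values()
            if k > 0
        )
        loo_sum += c * h_loo
    return n * h_full - (n - 1) * loo_sum / n
```

On the sample `["a", "a", "b", "b"]` the plug-in value is log 2, and both corrections push the estimate upward, consistent with the plug-in estimator's known downward bias.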
What carries the argument
Berry-Esseen approximation applied to the standardized plug-in and bias-corrected estimators, with remainder terms controlled by moment or tail conditions on the probability mass function.
Load-bearing premise
The underlying distribution on the countable alphabet must obey unspecified moment or tail conditions that keep the approximation remainders bounded.
What would settle it
A countable probability distribution and sample size where the Kolmogorov distance of the normalized estimator to the standard normal exceeds the explicit upper bound stated in the paper.
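A rough way to probe such a counterexample numerically is to simulate the standardized plug-in estimator and measure its empirical Kolmogorov distance to the standard normal. The sketch below is illustrative only: the geometric distribution, the standardization by the standard deviation of -log p(X), and all function names are assumptions made for this example, and the paper's explicit bound is not reproduced here. The Monte Carlo distance also carries its own sampling error, so it can only suggest a violation, not prove one.

```python
import math
import random
from collections import Counter
from statistics import NormalDist


def geometric_pmf(q, tol=1e-15):
    """Masses p_i = (1 - q) * q**i for i = 0, 1, ..., truncated below tol."""
    pmf, i = [], 0
    while True:
        p = (1 - q) * q**i
        if p < tol:
            break
        pmf.append(p)
        i += 1
    return pmf


def entropy_and_sigma(pmf):
    """True entropy H and the standard deviation of -log p(X)."""
    h = -sum(p * math.log(p) for p in pmf)
    second_moment = sum(p * math.log(p) ** 2 for p in pmf)
    return h, math.sqrt(second_moment - h * h)


def plugin_entropy(sample):
    n = len(sample)
    return -sum((c / n) * math.log(c / n) for c in Counter(sample).values())


def kolmogorov_distance_to_normal(values):
    """sup_x |F_emp(x) - Phi(x)|, evaluated at the jumps of the empirical CDF."""
    phi = NormalDist().cdf
    xs = sorted(values)
    m = len(xs)
    return max(
        max((k + 1) / m - phi(x), phi(x) - k / m) for k, x in enumerate(xs)
    )


def simulate(q=0.5, n=200, reps=2000, seed=0):
    """Empirical Kolmogorov distance of sqrt(n) * (H_hat - H) / sigma to N(0, 1)."""
    rng = random.Random(seed)
    pmf = geometric_pmf(q)
    h, sigma = entropy_and_sigma(pmf)
    symbols = range(len(pmf))
    z = []
    for _ in range(reps):
        sample = rng.choices(symbols, weights=pmf, k=n)
        z.append(math.sqrt(n) * (plugin_entropy(sample) - h) / sigma)
    return kolmogorov_distance_to_normal(z)
```

Calling `simulate(q=0.5, n=200)` returns one such distance; comparing it against the paper's bound for the same distribution and sample size would require plugging in the paper's explicit constants.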
Original abstract
In the present paper, we derive Berry-Esseen bounds for the estimation of diversity indices on countable alphabets. A general non-asymptotic convergence rate is established for the plug-in estimator of a wide class of indices, including Simpson's index and Re\'{n}yi's entropy. For the practically crucial case of Shannon entropy, we provide explicit Berry-Esseen bounds for the standard plug-in estimator, as well as for two widely used bias-corrected variants, the Miller-Madow and the jackknife estimators.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript derives non-asymptotic Berry-Esseen bounds for plug-in estimators of a broad class of diversity indices (including Simpson's index and Rényi's entropy) on countable alphabets, and supplies explicit bounds for the plug-in, Miller-Madow, and jackknife estimators of Shannon entropy.
Significance. If the central derivations hold under correctly stated hypotheses, the results would supply useful finite-sample guarantees for entropy and diversity estimation when the support is countably infinite, a setting common in applications but where classical asymptotic theory is often insufficient.
major comments (2)
- [§3, Theorem 3.1] §3, Theorem 3.1 and the subsequent Berry-Esseen statement for the general functional: the hypotheses list only that the alphabet is countable and the samples are i.i.d.; they omit the tail-decay condition on p required to guarantee that the third-moment term sum_i E[|g'(p_i)(1_{X=i}-p_i)|^3] remains finite. Without this (roughly p_i = o(i^{-2/3}) in the worst case), the classical Berry-Esseen theorem cannot be applied uniformly over all countable alphabets, contradicting the claim of a general non-asymptotic rate.
- [§4, Theorem 4.2] §4, Theorem 4.2 (Shannon entropy case): the explicit constants for the plug-in estimator likewise rely on the same unstated third-moment bound; the proof sketch in §4.3 reduces the remainder to a term controlled only when sum p_i^{1/2} < ∞, which is not listed among the assumptions and is not implied by mere countability.
minor comments (2)
- [Abstract] Abstract: 'Re\'{n}yi' should be spelled 'Rényi'.
- [§2.1] Notation in §2.1: the functional class is defined via g, but the moment condition needed for the remainder term is never written explicitly as an assumption on g or p.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments, which help clarify the scope of our results. We address each major comment below and will revise the manuscript accordingly to incorporate the necessary assumptions.
Point-by-point responses
Referee: [§3, Theorem 3.1] §3, Theorem 3.1 and the subsequent Berry-Esseen statement for the general functional: the hypotheses list only that the alphabet is countable and the samples are i.i.d.; they omit the tail-decay condition on p required to guarantee that the third-moment term sum_i E[|g'(p_i)(1_{X=i}-p_i)|^3] remains finite. Without this (roughly p_i = o(i^{-2/3}) in the worst case), the classical Berry-Esseen theorem cannot be applied uniformly over all countable alphabets, contradicting the claim of a general non-asymptotic rate.
Authors: We agree that the third-moment finiteness condition is required for the classical Berry-Esseen theorem to apply in the countable-alphabet setting. The original manuscript implicitly relied on this to ensure the bound is well-defined but did not state it explicitly among the hypotheses. In the revised version we will add the explicit assumption that sum_i E[|g'(p_i)(1_{X=i}-p_i)|^3] < ∞ (equivalently, a mild tail-decay condition on the probability vector p) and clarify that the non-asymptotic rate holds under this condition rather than for arbitrary countable alphabets. We will also include a brief discussion of common distributions (e.g., Zipf with exponent > 5/3) that satisfy the requirement. revision: yes
Referee: [§4, Theorem 4.2] §4, Theorem 4.2 (Shannon entropy case): the explicit constants for the plug-in estimator likewise rely on the same unstated third-moment bound; the proof sketch in §4.3 reduces the remainder to a term controlled only when sum p_i^{1/2} < ∞, which is not listed among the assumptions and is not implied by mere countability.
Authors: The referee correctly identifies that the remainder control in the proof of Theorem 4.2 for the plug-in estimator of Shannon entropy requires sum_i p_i^{1/2} < ∞. This condition is stronger than mere countability and was not listed. We will revise the statement of Theorem 4.2 to include this assumption, update the proof sketch in §4.3 to highlight where it is used, and add a remark on its practical relevance (e.g., it holds for all distributions with finite support or exponentially decaying tails). The Miller-Madow and jackknife bounds will be similarly clarified. revision: yes
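To see how the condition sum_i p_i^{1/2} < ∞ behaves on two standard families, the following worked computation may help. It is an illustration added here, not taken from the paper; the Zipf and geometric parametrizations are assumptions chosen for the example.

```latex
% Zipf law with exponent s > 1 (so that p is normalizable):
p_i = \frac{i^{-s}}{\zeta(s)}, \quad i \ge 1, \qquad
\sum_{i \ge 1} p_i^{1/2}
  = \zeta(s)^{-1/2} \sum_{i \ge 1} i^{-s/2} < \infty
  \iff s > 2 .

% Geometric law with ratio q in (0, 1):
p_i = (1 - q)\, q^{i}, \quad i \ge 0, \qquad
\sum_{i \ge 0} p_i^{1/2}
  = \frac{\sqrt{1 - q}}{1 - \sqrt{q}} < \infty .
```

So the condition always holds for geometric (exponentially decaying) tails, but fails for a Zipf law with exponent between 1 and 2, which is why it cannot follow from countability alone.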
Circularity Check
No circularity: direct derivation of Berry-Esseen bounds from classical inequalities
Full rationale
The paper derives non-asymptotic convergence rates for plug-in, Miller-Madow, and jackknife estimators of entropy and diversity functionals on countable alphabets by applying the classical Berry-Esseen theorem to a centered sum of i.i.d. terms after a Taylor or influence-function expansion. No step reduces a claimed prediction to a fitted parameter by construction, invokes a self-citation as the sole justification for a uniqueness or ansatz claim, or renames an empirical pattern. The derivation relies on external moment or tail conditions on the pmf (which are stated as hypotheses, even if their precise form is technical), i.i.d. sampling, and standard probabilistic inequalities; these inputs are independent of the final bound expressions. The result is therefore self-contained against external benchmarks and receives the default non-circularity finding.
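For reference, the classical Berry-Esseen theorem invoked here can be written as follows. The choice of summand for Shannon entropy shown at the end is the usual first-order (influence-function) term and is an assumption about how the paper's expansion is set up, not a quotation from it.

```latex
% Classical Berry-Esseen theorem: Y_1, ..., Y_n i.i.d. with
% E[Y_1] = 0, E[Y_1^2] = \sigma^2 > 0, E|Y_1|^3 = \rho < \infty.
\sup_{x \in \mathbb{R}}
\left|
  \mathbb{P}\!\left( \frac{1}{\sigma \sqrt{n}} \sum_{k=1}^{n} Y_k \le x \right)
  - \Phi(x)
\right|
\le \frac{C\, \rho}{\sigma^{3} \sqrt{n}} ,
\qquad C \text{ an absolute constant.}

% Illustrative (assumed) summand for Shannon entropy H = -\sum_i p_i \log p_i:
Y_k = -\log p_{X_k} - H, \qquad
\rho = \sum_i p_i \left| \log p_i + H \right|^{3} .
```

Under this reading, the bound is informative only when the third absolute moment is finite, which is a condition on the tail of p rather than a consequence of countability.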
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Samples are i.i.d. draws from an unknown probability distribution supported on a countable alphabet.
- domain assumption Sufficient moment conditions hold on the probability masses to bound the remainder in the normal approximation.
Reference graph
Works this paper leans on
- [1] M. N. Chang and P. V. Rao, Berry-Esseen bound for the Kaplan-Meier estimator. Comm. Statist. Theory Methods 18 (1989), no. 12, 4647-4664
- [2] C. Chen, M. Grabchak, A. Stewart, J. L. Zhang and Z. Y. Zhang, Normal laws for two entropy estimators on infinite alphabets. Entropy 20 (2018), no. 5, 371
- [3] L. H. Y. Chen and Q. M. Shao, A non-uniform Berry-Esseen bound via Stein’s method. Probab. Theory Related Fields 120 (2001), no. 2, 236-254
- [4] M. Grabchak, E. Marcon, G. Lang and Z. Y. Zhang, The generalized Simpson’s entropy is a measure of biodiversity. PLoS ONE 12 (2017), no. 3, e0173305
- [5] M. Grabchak and Z. Y. Zhang, Asymptotic normality for plug-in estimators of diversity indices on countable alphabets. J. Nonparametr. Stat. 30 (2018), no. 3, 774-795
- [6] M. O. Hill, Diversity and Evenness: A Unifying Notation and Its Consequences. Ecology 54 (1973), 427-432
- [7] G. A. Miller and W. G. Madow, On the Maximum-Likelihood Estimate of the Shannon-Wiener Measure of Information. Air Force Cambridge Research Center Technical Report. 75 (1954), 54-75
- [8] L. Paninski, Estimation of entropy and mutual information. Neural Comput. 15 (2003), no. 6, 1191-1253
- [9] G. P. Patil and C. Taillie, Diversity as a concept and its measurement. J. Amer. Statist. Assoc. 77 (1982), no. 379, 548-567
- [10] A. Pinchas, I. Ben-Gal and A. Painsky, A comparative analysis of discrete entropy estimators for large-alphabet problems. Entropy 26 (2024), no. 5, 369
- [11] A. Rényi, On measures of entropy and information, Math. Statist. and Prob. (1961), 547-561
- [12] C. Shannon, A mathematical theory of communication. Bell System Tech. J. 27 (1948), 379-423
- [13] E. H. Simpson, Measurement of diversity, Nature 163 (1949), 688
- [14] S. Zahl, Jackknifing an index of diversity. Ecology 58 (1977), no. 4, 907-913
- [15] J. L. Zhang and J. Y. Shi, Asymptotic normality for plug-in estimators of generalized Shannon’s entropy. Entropy 24 (2022), no. 5, 683-693
- [16] Z. Y. Zhang and M. Grabchak, Entropic representation and estimation of diversity indices. J. Nonparametr. Stat. 28 (2016), no. 3, 563-575
- [17] Z. Y. Zhang and X. Zhang, A normal law for the plug-in estimator of entropy. IEEE Trans. Inform. Theory 58 (2012), no. 5, 2745-2747
- [18] Z. Y. Zhang and J. Zhou, Re-parameterization of multinomial distributions and diversity indices. J. Statist. Plann. Inference 140 (2010), no. 7, 1731-1738
(Z. H. Yu) School of Mathematics and Statistics, Henan Normal University, Henan Province, 453007, China. Email address: zhenhongyu2022@126.com
(Y. Miao) School of Mathematics and Statistics, Henan Normal ...