Recognition: 2 theorem links · Lean Theorem
On central limit theorems for Ewens-Pitman model
Pith reviewed 2026-05-15 10:22 UTC · model grok-4.3
The pith
Fluctuations of the component count in Ewens-Pitman partitions consist of two conditionally independent parts given the alpha-diversity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the component count, equivalently the occupancy count of an infinite urn model with frequencies (P_j), obeys a quenched functional central limit theorem in which the limiting centered and scaled process decomposes as the sum of a sampling fluctuation given (P_j) and a fluctuation coming from the random (P_j), and these two Gaussian processes are conditionally independent given the alpha-diversity.
What carries the argument
The representation of component count as occupancy count in an infinite urn scheme with frequencies (P_j), together with the quenched functional central limit theorem that separates sampling noise from frequency noise.
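The component count itself is easy to simulate, which helps make the objects in this claim concrete. The following is a minimal sketch, assuming the standard seating rule of the Chinese restaurant process with parameters (alpha, theta): with i customers seated at k tables, a new table is opened with probability (theta + k·alpha)/(i + theta), and an existing table of size s is joined with probability (s − alpha)/(i + theta). The function names `crp_component_counts` and `alpha_diversity_proxy` are hypothetical and do not come from the paper.

```python
import random


def crp_component_counts(n, alpha, theta, rng=None):
    """Seat n customers by the CRP(alpha, theta) rule; return [K_1, ..., K_n].

    K_i is the number of occupied tables (components) after i customers.
    """
    if not 0.0 < alpha < 1.0 or theta <= -alpha:
        raise ValueError("need alpha in (0, 1) and theta > -alpha")
    if rng is None:
        rng = random.Random()
    table_sizes = []  # table_sizes[j] = number of customers at table j
    counts = []
    for seated in range(n):
        k = len(table_sizes)
        if seated == 0:
            table_sizes.append(1)  # the first customer always opens a table
        else:
            # Unnormalized weights: theta + k*alpha for a new table and
            # (size - alpha) per existing table; they sum to seated + theta.
            u = rng.random() * (seated + theta)
            u -= theta + k * alpha
            if u < 0.0:
                table_sizes.append(1)
            else:
                for j, size in enumerate(table_sizes):
                    u -= size - alpha
                    if u < 0.0:
                        table_sizes[j] += 1
                        break
                else:
                    table_sizes[-1] += 1  # guard against float round-off
        counts.append(len(table_sizes))
    return counts


def alpha_diversity_proxy(counts, alpha):
    """Finite-n proxy K_n / n**alpha for the alpha-diversity (illustrative)."""
    n = len(counts)
    return counts[-1] / n ** alpha
```

Calling `crp_component_counts(2000, 0.5, 1.0, random.Random(1))` returns one path of the component count; repeating over many seeds gives an empirical picture of its fluctuations. The inner loop costs time proportional to the number of tables, so this sketch is meant for moderate n only.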
If this is right
- The total limiting variance of the component count is the sum of the sampling variance and the frequency variance.
- The functional convergence supplies tightness and pathwise limits in the Skorokhod space rather than only finite-dimensional distributions.
- Joint limit theorems for several statistics become simpler because the two noise sources are independent given the alpha-diversity.
- The decomposition holds throughout the parameter range alpha in (0,1) and theta greater than -alpha.
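The variance statement in the first bullet has an elementary finite-n counterpart that may help in reading it. This is a sketch under assumed notation, not the paper's formulation: K_n is the component count after n customers and P = (P_j) is the sequence of asymptotic frequencies. The paper's own centering and normalization are not reproduced here. The first line is the usual conditional mean of an occupancy count, the second splits the centered count into a sampling part and a frequency part, and the third is the law of total variance.

```latex
\begin{align*}
\mathbb{E}[K_n \mid P] &= \sum_{j \ge 1} \bigl(1 - (1 - P_j)^n\bigr), \\
K_n - \mathbb{E}[K_n] &= \underbrace{K_n - \mathbb{E}[K_n \mid P]}_{\text{sampling part}}
  + \underbrace{\mathbb{E}[K_n \mid P] - \mathbb{E}[K_n]}_{\text{frequency part}}, \\
\operatorname{Var}(K_n) &= \mathbb{E}\bigl[\operatorname{Var}(K_n \mid P)\bigr]
  + \operatorname{Var}\bigl(\mathbb{E}[K_n \mid P]\bigr).
\end{align*}
```

At finite n the two parts are uncorrelated because the sampling part has conditional mean zero given P. The claim summarized above is stronger: it concerns conditional independence of the limiting processes given the alpha-diversity.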
Where Pith is reading between the lines
- Estimators of the alpha-diversity could be constructed by subtracting an estimate of the sampling fluctuation from observed component-count variance.
- The same two-part decomposition may hold for other exchangeable partition models whose frequencies satisfy similar almost-sure convergence.
- Conditioning simulations on the realized alpha-diversity should produce residuals whose cross-covariance with the frequency estimator is near zero.
Load-bearing premise
The earlier limit theorems for occupancy counts with fixed frequencies apply directly once the random frequencies (P_j) from the Chinese restaurant process are inserted.
What would settle it
A numerical check or exact calculation showing that the conditional covariance between the sampling fluctuation term and the frequency fluctuation term fails to vanish when conditioned on the alpha-diversity would falsify the claimed conditional independence.
Original abstract
We establish a quenched functional central limit theorem for the total number of components of random partitions induced by Chinese restaurant process with parameters $(\alpha,\theta), \alpha\in(0,1), \theta>-\alpha$. With $P_j$ denoting the asymptotic frequency of $j$-th table, it is well-known that the component count has the same law as the occupancy count of an infinite urn scheme with sampling frequencies being $(P_j)_{j\in\mathbb N}$. Our analysis follows this approach and is based on earlier results of Karlin (1967) and Durieu and Wang (2016). In words, our result reveals that the fluctuations of component count consist of two parts, one due to the sampling effect given the asymptotic frequencies $(P_j)_{j\in\mathbb N}$, the other due to the fluctuations of the random asymptotic frequencies, and in the limit the fluctuations of two parts are conditionally independent given the $\alpha$-diversity. Our result strengthens a recent central limit theorem obtained by Bercu and Favaro (2024) via a different method.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper establishes a quenched functional central limit theorem for the total number of components in random partitions generated by the Chinese restaurant process (Ewens-Pitman model) with parameters (α, θ) where α ∈ (0,1) and θ > −α. Using the known equivalence to occupancy counts in an infinite urn scheme with frequencies (P_j), the fluctuations are decomposed into a sampling term conditional on the fixed (P_j) and a term arising from the randomness of the asymptotic frequencies (P_j); these two contributions are shown to be conditionally independent given the α-diversity. The argument invokes Karlin (1967) and Durieu-Wang (2016) and strengthens the non-functional CLT of Bercu-Favaro (2024).
Significance. If the quenched functional convergence holds, the decomposition supplies a precise separation of sampling variability from frequency variability together with conditional independence given the α-diversity. This refines existing limit theorems for component counts in Pitman-Yor partitions and supplies a template that can be reused for related functionals in exchangeable partition models. The reliance on established urn-scheme results from Karlin and Durieu-Wang is a strength, as it keeps the new contribution focused and verifiable once the application is checked.
major comments (1)
- [§3] Proof of the main quenched functional CLT: the verification that the random sequence (P_j) satisfies the moment and tail conditions of Durieu-Wang (2016) for almost-sure functional convergence is only sketched; an explicit check that the α-diversity is measurable with respect to the sigma-field generated by the limiting Gaussian process is needed to justify the claimed conditional independence.
minor comments (2)
- [Introduction] The definition of the α-diversity appears first in the abstract and introduction but is not restated with its explicit almost-sure limit expression before it is used as the conditioning variable in the main theorem; adding one sentence would improve readability.
- [Theorem 1.1] In the statement of the main theorem, the topology on the space of cadlag paths (Skorokhod or uniform) should be specified explicitly rather than left implicit.
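For reference, the explicit limit requested in the first minor comment has a standard form. This is a sketch in assumed notation: K_n is the number of components after n customers, and the symbol S_α is chosen here for illustration and may differ from the paper's. For the Chinese restaurant process with α ∈ (0,1) and θ > −α, the limit below is known to exist almost surely and to be strictly positive and finite; see Pitman (2006).

```latex
\[
S_\alpha := \lim_{n \to \infty} \frac{K_n}{n^{\alpha}}
\quad \text{almost surely}, \qquad 0 < S_\alpha < \infty .
\]
```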
Simulated Author's Rebuttal
We thank the referee for the positive assessment and the constructive comment on the proof in §3. We address the point below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [§3] Proof of the main quenched functional CLT: the verification that the random sequence (P_j) satisfies the moment and tail conditions of Durieu-Wang (2016) for almost-sure functional convergence is only sketched; an explicit check that the α-diversity is measurable with respect to the sigma-field generated by the limiting Gaussian process is needed to justify the claimed conditional independence.
  Authors: We agree that the verification in the proof of the quenched functional CLT can be made more explicit. In the revised manuscript we will expand the argument to include a direct check that the random sequence (P_j) satisfies the moment and tail conditions of Durieu-Wang (2016) for almost-sure functional convergence. We will also add an explicit measurability argument showing that the α-diversity is measurable with respect to the sigma-field generated by the limiting Gaussian process, thereby rigorously justifying the claimed conditional independence of the two fluctuation sources given the α-diversity.
  Revision: yes
Circularity Check
Minor self-citation to prior occupancy results; the derivation applies independent theorems without reduction to fitted inputs.
Full rationale
The paper invokes the occupancy-count representation of the component count (equivalent in law to an infinite-urn scheme with frequencies P_j) and applies limit theorems from Karlin (1967) and Durieu-Wang (2016) to decompose fluctuations into a conditional sampling term and a term from the random frequencies, with the two becoming conditionally independent given the alpha-diversity in the quenched limit. This decomposition follows directly from the cited external results once the representation is adopted; no equation in the present work defines the target CLT in terms of its own fitted parameters or self-referential ansatz, and the single self-citation is not load-bearing for the new quenched functional statement.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math: Convergence results for occupancy counts in infinite urn schemes from Karlin (1967) and Durieu and Wang (2016)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "the fluctuations of component count consist of two parts, one due to the sampling effect given the asymptotic frequencies (P_j), the other due to the fluctuations of the random asymptotic frequencies, and in the limit the fluctuations of two parts are conditionally independent given the α-diversity"
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat recovery
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "Our analysis follows this approach and is based on earlier results of Karlin (1967) and Durieu and Wang (2016)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Arratia, R., Barbour, A. D., and Tavaré, S. (2003). Logarithmic combinatorial structures: a probabilistic approach. EMS Monographs in Mathematics. European Mathematical Society (EMS), Zürich
- [2] Bahadur, R. R. (1960). On the number of distinct values in a large sample from an infinite discrete distribution. Proc. Nat. Inst. Sci. India Part A, 26(supplement II):67–75
- [3] Bahier, V. and Najnudel, J. (2022). On smooth mesoscopic linear statistics of the eigenvalues of random permutation matrices. J. Theoret. Probab., 35(3):1640–1661
- [4]
- [5] Ben Arous, G. and Dang, K. (2015). On fluctuations of eigenvalues of random permutation matrices. Ann. Inst. Henri Poincaré Probab. Stat., 51(2):620–647
- [6] Bercu, B. and Favaro, S. (2024). A martingale approach to Gaussian fluctuations and laws of iterated logarithm for Ewens-Pitman model. Stochastic Process. Appl., 178:Paper No. 104493, 19
- [7] Billingsley, P. (1999). Convergence of probability measures. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons Inc., New York, second edition. A Wiley-Interscience Publication
- [8] Broderick, T., Jordan, M. I., and Pitman, J. (2012). Beta processes, stick-breaking and power laws. Bayesian Anal., 7(2):439–475
- [9] Chebunin, M. and Kovalevskii, A. (2016). Functional central limit theorems for certain statistics in an infinite urn scheme. Statist. Probab. Lett., 119:344–348
- [10] Contardi, C., Dolera, E., and Favaro, S. (2025). Laws of large numbers and central limit theorem for Ewens-Pitman model. Electron. J. Probab., 30:Paper No. 193, 51
- [11] Crane, H. (2016). The ubiquitous Ewens sampling formula. Statist. Sci., 31(1):1–19
- [12] Darling, D. A. (1967). Some limit theorems associated with multinomial trials. In Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. II: Contributions to Probability Theory, Part 1, pages 345–350. Univ. California Press, Berkeley, CA
- [13] Durieu, O. and Wang, Y. (2016). From infinite urn schemes to decompositions of self-similar Gaussian processes. Electron. J. Probab., 21:Paper No. 43, 23
- [14] Durrett, R. (2010). Probability: theory and examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, fourth edition
- [15]
- [16] Feng, S. (2010). The Poisson-Dirichlet distribution and related topics. Probability and its Applications (New York). Springer, Heidelberg. Models and asymptotic behaviors
- [17] Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist., 1:209–230
- [18] François, Q. (2025). Characteristic polynomial of generalized Ewens random permutations. Electron. Commun. Probab., 30:Paper No. 97, 12
- [19] Fu, Z. and Wang, Y. (2020). Stable processes with stationary increments parameterized by metric spaces. J. Theoret. Probab., 33(3):1737–1754
- [20] Garza, J. and Wang, Y. (2024). Limit theorems for random permutations induced by Chinese restaurant processes. arXiv preprint, https://arxiv.org/abs/2412.02162
- [21] Garza, J. and Wang, Y. (2025). A functional central limit theorem for weighted occupancy processes of the Karlin model. Stochastic Process. Appl., 188:Paper No. 104665
- [22] Gnedin, A., Hansen, B., and Pitman, J. (2007). Notes on the occupancy problem with infinitely many boxes: general asymptotics and power laws. Probab. Surv., 4:146–171
- [23] Gnedin, A. and Iksanov, A. (2012). Regenerative compositions in the case of slow variation: a renewal theory approach. Electron. J. Probab., 17:no. 77, 19
- [24] Gnedin, A., Iksanov, A., and Marynych, A. (2010). Limit theorems for the number of occupied boxes in the Bernoulli sieve. Theory Stoch. Process., 16(2):44–57
- [25] Grübel, R. and Kabluchko, Z. (2016). A functional central limit theorem for branching random walks, almost sure weak convergence and applications to random trees. Ann. Appl. Probab., 26(6):3659–3698
- [26] Heyde, C. C. (1977). On central limit and iterated logarithm supplements to the martingale convergence theorem. J. Appl. Probability, 14(4):758–775
- [27] Iksanov, A., Kabluchko, Z., and Kotelnikova, V. (2022). A functional limit theorem for nested Karlin's occupancy scheme generated by discrete Weibull-like distributions. J. Math. Anal. Appl., 507(2):Paper No. 125798, 24
- [28] Iksanov, A., Marynych, A., and Meiners, M. (2017). Asymptotics of random processes with immigration I: Scaling limits. Bernoulli, 23(2):1233–1278
- [29] Karlin, S. (1967). Central limit theorems for certain infinite urn schemes. J. Math. Mech., 17:373–401
- [30] Kingman, J. F. C. (1978). The representation of partition structures. J. London Math. Soc. (2), 18(2):374–380
- [31] Perman, M., Pitman, J., and Yor, M. (1992). Size-biased sampling of Poisson point processes and excursions. Probab. Theory Related Fields, 92(1):21–39
- [32] Pitman, J. (2006). Combinatorial stochastic processes, volume 1875 of Lecture Notes in Mathematics. Springer-Verlag, Berlin. Lectures from the 32nd Summer School on Probability Theory held in Saint-Flour, July 7–24, 2002, With a foreword by Jean Picard
- [33] Pitman, J. and Yor, M. (1997). The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. Ann. Probab., 25(2):855–900
- [34] van der Vaart, A. W. and Wellner, J. A. (1996). Weak convergence and empirical processes: with applications to statistics. Springer Series in Statistics. Springer-Verlag, New York
- [35] Wieand, K. (2000). Eigenvalue distributions of random permutation matrices. Ann. Probab., 28(4):1563–1587
Yizao Wang, Department of Mathematical Sciences, University of Cincinnati, 2815 Commons Way, Cincinnati, OH, 45221-0025, USA. Email address: yizao.wang@uc.edu