pith. sign in

arxiv: 1906.11012 · v1 · pith:JMYIGYPJnew · submitted 2019-06-26 · 🧮 math.PR · cs.DM· math.CO

The impatient collector

Pith reviewed 2026-05-25 15:21 UTC · model grok-4.3

classification 🧮 math.PR cs.DMmath.CO
keywords coupon collectorcompletion curveconditioned processrare eventsasymptotic shapeaccessible automataKoršunov formula
0
0 comments X

The pith

The coupon collector conditioned on completing in at most (1+ν)n trials follows a new limiting completion curve ζ(ν,·).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

In the coupon collector problem, T_n is typically around n log n and the fraction of items collected by time nt follows the curve 1 - e^{-t}. The paper examines the atypical case where the full collection finishes in at most (1+ν)n trials for fixed ν > 0. Under this conditioning the completion curve converges to a different limiting shape denoted ζ(ν,·). The result is then used to re-derive a formula for the asymptotic count of complete accessible automata originally obtained by Koršunov.

Core claim

For ν > 0 the asymptotic shape ζ(ν,·) of the completion curve under the condition T_n ≤ (1+ν)n exists and can be characterized. This shape describes the typical trajectory of the number of distinct coupons obtained after nt trials when the collection finishes in the rare fast regime.

What carries the argument

The conditioned coupon collector process, whose completion curve ζ(ν,·) tracks the fraction of distinct items collected by time nt under the fast-completion event.

If this is right

  • The fraction of items collected by time nt, given T_n ≤ (1+ν)n, converges to ζ(ν,t) instead of the unconditional 1 - e^{-t}.
  • The same conditioning supplies a new derivation of Koršunov's formula counting complete accessible automata.
  • The result gives the typical path followed by the collector in the large-deviation regime of unusually rapid completion.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • An explicit formula for ζ(ν,·) would allow direct computation of other conditioned statistics such as the distribution of duplicate counts along fast paths.
  • The conditioning technique may transfer to other hitting-time problems for Markov chains on large state spaces.
  • Finite-n simulations for moderate ν could be compared directly against the predicted curve to locate the onset of the asymptotic regime.

Load-bearing premise

An asymptotic completion curve ζ(ν,·) exists and can be characterized for the coupon collector process conditioned on T_n ≤ (1+ν)n.

What would settle it

Large-scale Monte Carlo sampling of coupon-collector trajectories conditioned on T_n ≤ (1+ν)n for large n, showing that the empirical average of the fraction collected by time nt does not converge to the claimed ζ(ν,t).

Figures

Figures reproduced from arXiv: 1906.11012 by Anis Amri (IECL), Philippe Chassaing (IECL).

Figure 1
Figure 1. Figure 1: On the left: a CDTS D. On the right: the ordering of vertices and edges inherited from the breadth-first search. × × × × × × × × × × × × × 1 2 3 4 5 6 7 8 9 10 11 12 13 [PITH_FULL_IMAGE:figures/full_fig_p016_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The boxed diagram of the CDTS D (in pink, the minimal completion curve for a 3-Dyck boxed diagram). the left, would produce the same labeling as is pictured on the left. Note that the correspondance edge-endpoint is a surjection ˜ω from [[1, 13]] to [[1, 4]], with a special property: the partition PD = (˜ω −1 (1), ω˜ −1 (2), ω˜ −1 (3), ω˜ −1 (4)), here e.g. PD = {{1, 4, 7}, {2, 5, 8, 12}, {3, 9, 11}, {6, 1… view at source ↗
Figure 3
Figure 3. Figure 3: Forbidden zones. This probabilistic formulation of Korˇsunov’s formula hints at the relation with Theorem 3 : as [PITH_FULL_IMAGE:figures/full_fig_p017_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The graph of the Lindsey process. that P  Υ˜  is the stationary distribution π0 at 0 of the Lindsey process with step µk. For ` ≥ 1, let t` denote the average time needed by the Lindsey process (or, indifferently, by the random walk S) to hit 0 starting from position `: Wald’s identity gives that t` = − ` dk . On the other hand, if t0 is the expected time of the first return to 0, starting from 0, then, … view at source ↗
Figure 5
Figure 5. Figure 5: Locations of the solutions. Proof. From (47), we obtain the differential equation for λ(x) : ζ(x) = x 1 + λ(x) , ∂λ ∂x = ζ(x) − xζ0 (x) ζ(x) 2 = 1 − (1 + λ(x)) ζ 0 (x) x (1 + λ) ∂λ ∂x = 1 − (1 + λ) e −ξ x (1 + λ) = 2v(x) x . (50) Lower bounds. Relations (50) and (46) yields that λ 0 λ ≥ 1 x , thus, for 0 < x ≤ x0, λ (x) ≤ λ (x0) x x0 = x y0 − x x0 , (51) leading to the lower bounds of (49). Upper bounds. W… view at source ↗
Figure 6
Figure 6. Figure 6: Computations: 46 ≥ 45.4. Thus we just proved that [PITH_FULL_IMAGE:figures/full_fig_p041_6.png] view at source ↗
read the original abstract

In the coupon collector problem with $n$ items, the collector needs a random number of tries $T_n\simeq n\ln n$ to complete the collection. Also, after $nt$ tries, the collector has secured approximately a fraction $\zeta_\infty(t)=1-e^{-t}$ of the complete collection, so we call $\zeta_\infty$ the (asymptotic) \emph{completion curve}. In this paper, for $\nu>0$, we address the asymptotic shape $\zeta (\nu,.) $ of the completion curve under the condition $T_n\leq \left( 1+\nu \right) n$, i.e. assuming that the collection is \emph{completed unlikely fast}. As an application to the asymptotic study of complete accessible automata, we provide a new derivation of a formula due to Kor\v{s}unov.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript studies the coupon collector problem conditioned on the rare event T_n ≤ (1+ν)n for ν>0. It claims to establish the existence and characterize the asymptotic shape ζ(ν,.) of the completion curve (the fraction of the collection obtained after nt steps under this conditioning), and applies the result to give a new derivation of Koršunov's formula for the asymptotic enumeration of complete accessible automata.

Significance. If the central limit and characterization hold with the claimed rigor, the work would supply a concrete description of a conditioned coupon-collector trajectory and an independent derivation of a known automata-theoretic formula. The absence of free parameters or ad-hoc axioms in the ledger is a positive feature, but the abstract supplies no derivation steps, error bounds, or verification that the limit exists, so the significance cannot yet be assessed.

major comments (1)
  1. [Abstract] Abstract, paragraph 2: the claim that an asymptotic completion curve ζ(ν,.) 'exists and can be characterized' is stated without any indication of the large-deviation or conditioning argument used to obtain it; the reader's report correctly notes that no derivation, error analysis, or existence proof is visible, rendering the central claim unverifiable from the supplied text.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and for highlighting the need for greater clarity in the abstract regarding the derivation of the asymptotic completion curve. The full manuscript contains a rigorous large-deviation analysis establishing existence and characterization; the abstract is a high-level summary. We address the single major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract, paragraph 2: the claim that an asymptotic completion curve ζ(ν,.) 'exists and can be characterized' is stated without any indication of the large-deviation or conditioning argument used to obtain it; the reader's report correctly notes that no derivation, error analysis, or existence proof is visible, rendering the central claim unverifiable from the supplied text.

    Authors: The abstract summarizes the main result without technical details, as is standard. The existence of the limit ζ(ν,·) and its explicit characterization are proved in Section 3 via a large-deviation principle for the empirical measure of the coupon-collector process conditioned on {T_n ≤ (1+ν)n}. The proof proceeds by establishing tightness of the conditioned trajectory, identifying the rate function, and solving the associated variational problem; error bounds appear in Theorem 3.2. The application to Koršunov’s formula is derived in Section 4 from the same limit. We agree that a brief indication of the large-deviation approach in the abstract would improve readability and will add one sentence to that effect. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation presented as independent

full rationale

The supplied abstract and context contain no equations, fitted parameters, self-citations, or derivation steps that reduce to their own inputs by construction. The central claim is an asymptotic characterization of the conditioned completion curve ζ(ν,·) for the coupon collector process, presented as a new derivation of an external formula (Koršunov). No load-bearing step is visible that would qualify under any of the enumerated circularity patterns. The result is therefore treated as self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on free parameters, axioms, or invented entities. No elements can be identified.

pith-pipeline@v0.9.0 · 5674 in / 1132 out tokens · 26498 ms · 2026-05-25T15:21:32.595665+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages

  1. [1]

    51, Springer-Verlag, New York, 2003, Stochastic Modelling and Applied Probability

    S\"oren Asmussen, Applied probability and queues, second ed., Applications of Mathematics (New York), vol. 51, Springer-Verlag, New York, 2003, Stochastic Modelling and Applied Probability. 1978607

  2. [2]

    1709, Springer, Berlin, 1999, pp

    Michel Bena\" i m, Dynamics of stochastic approximation algorithms, S\' e minaire de P robabilit\' e s, XXXIII , Lecture Notes in Math., vol. 1709, Springer, Berlin, 1999, pp. 1--68. 1767993

  3. [3]

    S. D. Conte, Elementary numerical analysis: A n algorithmic approach , McGraw-Hill Book Co., New York-Toronto, Ont.-London, 1965. 0202267

  4. [4]

    34, Springer-Verlag, Berlin, 1997, Translated from the 1990 French original by Stephen S

    Marie Duflo, Random iterative models, Applications of Mathematics (New York), vol. 34, Springer-Verlag, Berlin, 1997, Translated from the 1990 French original by Stephen S. Wilson and revised by the author. 1485774

  5. [5]

    Philippe Flajolet and Robert Sedgewick, Analytic combinatorics, Cambridge University Press, Cambridge, 2009. 2483235

  6. [6]

    I. J. Good, An asymptotic formula for the differences of the powers at zero, Ann. Math. Statist. 32 (1961), 249--256. 0120204

  7. [7]

    A. D. Kor s unov, Enumeration of finite automata, Problemy Kibernet. (1978), no. 34, 5--82, 272. 517814

  8. [8]

    Informationsverarb

    , On the number of nonisomorphic strongly connected finite automata, Elektron. Informationsverarb. Kybernet. 22 (1986), no. 9, 459--462. 862029

  9. [9]

    V. F. Kolchin, B. A. Sevastyanov, and V. P. Chistyakov, Random allocations, John Wiley & Sons, 1978, Scripta Series in Mathematics. 0471015

  10. [10]

    Elcio Lebensztayn, On the asymptotic enumeration of accessible automata, Discrete Math. Theor. Comput. Sci. 12 (2010), no. 3, 75--79. 2786470

  11. [11]

    Lothaire, Applied combinatorics on words, Encyclopedia of Mathematics and its Applications, vol

    M. Lothaire, Applied combinatorics on words, Encyclopedia of Mathematics and its Applications, vol. 105, Cambridge University Press, Cambridge, 2005. 2165687

  12. [12]

    Meir and J

    A. Meir and J. W. Moon, The asymptotic behaviour of coefficients of powers of certain generating functions, European J. Combin. 11 (1990), no. 6, 581--587. 1078714

  13. [13]

    thesis, Paris VII, 2000

    Cyril Nicaud, É tude du comportement en moyenne des automates finis et des langages rationnels , Ph.D. thesis, Paris VII, 2000

  14. [14]

    Herbert Robbins, A remark on the joint distribution of cumulative sums, Ann. Math. Statistics 25 (1954), 614--616. 0063592