pith. sign in

arxiv: 2509.17201 · v2 · submitted 2025-09-21 · 🧮 math.PR · math.CO

On a Conjecture on Uniform Group Drawings in the Coupon Collector Problem

Pith reviewed 2026-05-18 14:35 UTC · model grok-4.3

classification 🧮 math.PR math.CO
keywords coupon collector's problemgroup drawingsuniform distributionschilling's conjectureexpected collection timenon-uniform distributionsmarkov chain absorption
0
0 comments X

The pith

Non-uniform distributions over s-subsets yield strictly lower expected collection times than the uniform distribution for 2 ≤ s ≤ n-2.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves Schilling's conjecture that the uniform distribution over all possible packages of s coupons does not minimize the expected number of rounds required to collect all n coupons in the generalized coupon collector problem. It establishes this by exhibiting explicit non-uniform distributions on the binomial(n,s) packages that achieve strictly smaller expected times and by supplying closed-form expressions for the expected number of rounds under those distributions and related ones. A reader cares because the result shows that the default choice of uniform probabilities is suboptimal in this setting and that better collection strategies are available when the distribution can be chosen freely.

Core claim

Schilling's conjecture holds in full: for 2 ≤ s ≤ n-2 the uniform distribution over the binomial(n,s) possible packages does not minimize the expected collection time. Natural non-uniform distributions over these packages produce strictly lower expected times, and explicit formulas are derived for the expected number of rounds under the proposed distributions and related families.

What carries the argument

The Markov chain on the power set of collected coupons whose absorption time is computed under a fixed distribution on the binomial(n,s) packages, together with the explicit non-uniform distributions that lower this absorption time.

If this is right

  • The uniform distribution is optimal only for the boundary cases s = n-1 and the classical single-coupon case.
  • Expected collection times can be obtained exactly from the supplied formulas rather than through simulation or bounds.
  • The same construction technique applies to related families of distributions over packages.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • In applications such as biological network sampling, deliberately biasing the choice of which s-subsets to draw could shorten the expected time to full coverage.
  • The explicit constructions supply concrete lower bounds that any candidate optimal distribution must beat.
  • The approach may extend to variants in which the package size s itself is chosen randomly or adaptively.

Load-bearing premise

Each draw is chosen independently from the same fixed probability distribution over the possible packages of size s.

What would settle it

For concrete small values such as n=4 and s=2, compute the exact expected collection time under the uniform distribution and under one of the paper's non-uniform distributions; the conjecture would be falsified if the non-uniform time is not strictly smaller.

Figures

Figures reproduced from arXiv: 2509.17201 by Daniel Berend, Tomer Sher.

Figure 1
Figure 1. Figure 1: Ten coupons along a circle 5 [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: g(x) for c = 0.01, 0.5, 0.99 Theorem 2.10. Let s = c · n, where 0 < c < 1 is a constant, and let α = α(n) = {log1/(1−c) n}. As n −→ ∞, we have E[Y ] = ⌊log1/(1−c) n⌋ + g(α(n)) + o(1). (3) The first term on the right-hand side of (3) is the main term. The other two jointly may be replaced by O(1), but give more information as they are written. Notice that g grows by 1 as we move along the interval [0, 1]. T… view at source ↗
read the original abstract

We address a conjecture of Schilling concerning the optimality of the uniform distribution in the generalized Coupon Collector's Problem (CCP) where, in each round, a subset (package) of $s$ coupons is drawn from a total of $n$ distinct coupons. While the classical CCP (with single-coupon draws) is well understood, the group-draw variant, where packages of size $s$ are drawn, presents new challenges and has applications in areas such as biological network models. Consider the set of all distributions over the collection of $\binom{n}{s}$ packages of size $s$. Schilling showed that, for $s=n-1$, the uniform distribution yields the minimal expected time for collecting all coupons. She further conjectured that, for $2\le s\le n-2$, the uniform distribution does not yield the minimum. We prove Schilling's conjecture in full by presenting "natural" non-uniform distributions yielding strictly lower expected collection times. Explicit formulas are provided for the expected number of rounds under these and related distributions Keywords: Coupon Collector's Problem, Group Drawings, Uniform Distribution, Expected Collection Time, Schilling's Conjecture, Optimal Distribution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proves Schilling's conjecture that the uniform distribution over the binom(n,s) possible s-subsets is not optimal for the expected collection time in the generalized coupon collector problem when 2 ≤ s ≤ n-2. It constructs explicit 'natural' non-uniform distributions on these packages, derives closed-form expressions for the resulting expected absorption times in the power-set Markov chain, and shows that these expectations are strictly smaller than the uniform case.

Significance. Resolving the conjecture with explicit formulas for the expected times under the new distributions is a clear contribution to the theory of group-draw coupon collector problems. The direct construction and exact solution of the linear system for E[T] avoid parameter fitting and provide verifiable, falsifiable expressions that can be checked for small n and s.

major comments (2)
  1. [§4] §4, the closed-form expression for E[T] under the constructed distribution: the reduction of the 2^n-state linear system to a solvable subsystem is presented via symmetry, but the argument that this symmetry holds for arbitrary n and s (including when gcd(s,n)>1) is only indicated rather than fully expanded; this step is load-bearing for the explicit formula.
  2. [§5] The validity check that the proposed weights on the binom(n,s) packages sum to 1 and are non-negative is given for the base construction but not re-verified after the 'related distributions' mentioned in the abstract; a short appendix computation for n=5,s=2 would confirm the inequality remains strict.
minor comments (2)
  1. [§2] The Markov chain setup in §2 invokes the standard absorption-time equations but does not explicitly write the transition matrix for the non-uniform case; adding the general entry p_{A,B} would improve readability.
  2. [§3] Notation for the state space (power set of [n]) is introduced without a small running example; inserting the n=4,s=2 case alongside the general formulas would help readers verify the strict inequality numerically.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading of our manuscript and the constructive comments. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation.

read point-by-point responses
  1. Referee: [§4] §4, the closed-form expression for E[T] under the constructed distribution: the reduction of the 2^n-state linear system to a solvable subsystem is presented via symmetry, but the argument that this symmetry holds for arbitrary n and s (including when gcd(s,n)>1) is only indicated rather than fully expanded; this step is load-bearing for the explicit formula.

    Authors: We agree that the symmetry argument in §4 would benefit from a fuller exposition to cover all parameter regimes. In the revised version we will expand this section with a self-contained proof that the relevant symmetry (invariance under the action of the cyclic group generated by a fixed permutation of the coupons) persists for arbitrary n and s, including the case gcd(s,n)>1. The argument will explicitly identify the invariant subspace of the state space and verify that the reduced linear system remains solvable in closed form. revision: yes

  2. Referee: [§5] The validity check that the proposed weights on the binom(n,s) packages sum to 1 and are non-negative is given for the base construction but not re-verified after the 'related distributions' mentioned in the abstract; a short appendix computation for n=5,s=2 would confirm the inequality remains strict.

    Authors: We thank the referee for pointing this out. Although the base construction is verified in the main text, we will add a short appendix containing the explicit weight computation for n=5, s=2 under each of the related distributions. This will confirm that the weights remain non-negative and sum to one, and that the resulting expected collection time is still strictly smaller than the uniform case. revision: yes

Circularity Check

0 steps flagged

Direct construction and explicit formulas yield self-contained proof

full rationale

The paper establishes Schilling's conjecture via explicit non-uniform distributions over the binomial(n,s) packages together with closed-form solutions for the expected absorption time in the standard power-set Markov chain. All derivations proceed from the independent-draw model and the linear system for E[T] without fitting parameters to subsets of results, without invoking self-citations as load-bearing premises, and without renaming or smuggling ansatzes. The central inequality is obtained by direct comparison of the constructed expectations against the uniform case, rendering the argument independent of prior author work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard discrete probability and Markov chain absorption times for the coupon collector state space. No free parameters are introduced or fitted; the constructions are combinatorial.

axioms (1)
  • standard math The process is a Markov chain on subsets of coupons with absorption at the full set, and expected time satisfies the standard linear system.
    Invoked to define and compare expected collection times under different package distributions.

pith-pipeline@v0.9.0 · 5735 in / 1166 out tokens · 58642 ms · 2026-05-18T14:35:07.921006+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

  1. [1]

    Alon and J

    N. Alon and J. H. Spencer.The Probabilistic Method. John Wiley & Sons, Inc., Hoboken, NJ, 3rd edition, 2008

  2. [2]

    Boneh and M

    A. Boneh and M. Hofri. The coupon-collector problem revisited—a survey of engineer- ing problems and computational methods.Communications in Statistics. Stochastic Models, 13(1):39–66, 1997

  3. [3]

    R. J. Caron, M. Hlynka, and J. F. McDonald. On the best case performance of hit and run methods for detecting necessary constraints.Mathematical Programming, 54:233– 249, 1992

  4. [4]

    Chewi.Discrete Mathematics and Probability Theory

    S. Chewi.Discrete Mathematics and Probability Theory. University of California, Berkeley, 2016

  5. [5]

    John D. Cook. Consecutive coupon collector problem. Blog post, 2023. Accessed: 2025-05-09

  6. [6]

    Erd˝ os and A

    P. Erd˝ os and A. R´ enyi. On a classical problem of probability theory.A Magyar Tudom´ anyos Akad´ emia Matematikai Kutat´ o Int´ ezet´ enek K¨ ozlem´ enyei, 6(1-2):215–220, 1961

  7. [7]

    Fang and E

    C. Fang and E. Chang. Optimal strategy of coupon subset collection when each package contains half of the coupons.Information Processing Letters, 114(12):703–705, 2014

  8. [8]

    Giannakis, J

    K. Giannakis, J. M. Chustecki, and I. G. Johnston. Exchange on dynamic encounter networks allows plant mitochondria to collect complete sets of mitochondrial dna prod- ucts despite their incomplete genomes.Quantitative Plant Biology, 3:e18, 2022

  9. [9]

    E. J. Gumbel.Statistical Theory of Extreme Values and Some Practical Applications. A Series of Lectures. National Bureau of Standards Applied Mathematics Series No. 33, U. S. Government Printing Office, Washington, D. C., 1954

  10. [10]

    Jukna.Extremal Combinatorics: With Applications in Computer Science

    S. Jukna.Extremal Combinatorics: With Applications in Computer Science. Texts in Theoretical Computer Science. An EATCS Series. Springer, Berlin, Heidelberg, 2nd edition, 2011

  11. [11]

    E. C. Pinheiro and S. L. P. Ferrari. A comparative review of generalizations of the gumbel extreme value distribution with an application to wind speed data.Journal of Statistical Computation and Simulation, 86(11):2241–2261, 2016

  12. [12]

    Schilling

    J. Schilling. Results and conjectures on the role of the uniform distribution in the coupon collector’s problem with group drawings.Information Processing Letters, 173:106–112, 2021

  13. [13]

    Schilling and N

    J. Schilling and N. Henze. Two Poisson limit theorems for the coupon collector’s problem with group drawings.Journal of Applied Probability, 58(4):966–977, 2021

  14. [14]

    H. Solomon. Covering a circle circumference and a sphere surface. InGeometric Prob- ability, pages 75–96. SIAM, Philadelphia, PA, 1978

  15. [15]

    W. L. Stevens. Solution to a geometrical problem in probability.Annals of Eugenics, 9(4):315–320, 1939. 23

  16. [16]

    Todhunter.A History of the Mathematical Theory of Probability from the Time of Pascal to That of Laplace

    I. Todhunter.A History of the Mathematical Theory of Probability from the Time of Pascal to That of Laplace. Macmillan and Co., 1865

  17. [17]

    E. W. Weisstein. Circle covering by arcs. From MathWorld–A Wolfram Web Resource. Accessed: 2025-05-09

  18. [18]

    Coupon collector’s problem variants with group drawings

    Possibly Wrong. Coupon collector’s problem variants with group drawings. Blog post,

  19. [19]

    Accessed: 2025-05-09. 24