Distribution of outbreak sizes for SIR disease in finite populations
Pith reviewed 2026-05-24 22:43 UTC · model grok-4.3
The pith
An exact expression for the final size distribution of SIR epidemics holds in finite populations for arbitrary transmission distributions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We derive an expression for the final size distribution of an SIR epidemic in a finite population. Our derivation allows arbitrary distributions of the number of transmissions caused by an infected individual. We show how this calculation can be used to infer parameters of the infectious disease through observations in multiple small populations. The inference suffers from some identifiability difficulties, and it requires many observations to distinguish between parameter combinations that correspond to the same reproductive number.
What carries the argument
The exact probability mass function for epidemic final size, computed by accounting for depletion of susceptibles while allowing arbitrary offspring distributions.
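To make the idea concrete, here is a minimal sketch of one way to compute final-size probabilities exactly by brute-force dynamic programming. It is not the paper's expression. It assumes one initial infection and that each transmission is aimed at one of the other N - 1 individuals chosen uniformly at random, with replacement (the targeting rule is an assumption of this sketch). The names `new_infection_pmf` and `final_size_pmf` are hypothetical.

```python
def new_infection_pmf(k, s, targets):
    """Distribution of the number of distinct susceptibles hit by k transmissions.

    Each transmission lands uniformly on one of `targets` other individuals,
    `s` of whom are currently susceptible (assumed targeting rule).
    """
    dist = [1.0] + [0.0] * s
    for _ in range(k):
        nxt = [0.0] * (s + 1)
        for m, p in enumerate(dist):
            if p == 0.0:
                continue
            hit = (s - m) / targets  # chance of hitting a not-yet-hit susceptible
            nxt[m] += p * (1.0 - hit)
            if m < s:
                nxt[m + 1] += p * hit
        dist = nxt
    return dist


def final_size_pmf(offspring, n):
    """Return pmf where pmf[i] = P(final size == i) in a population of size n.

    offspring[k] is the probability that an infected individual makes
    k transmissions; the list is assumed to sum to 1.
    """
    if n < 2:
        raise ValueError("population size must be at least 2")
    targets = n - 1
    # g[s][m]: probability that one infected individual causes m new
    # infections when s susceptibles remain, mixing over the offspring law.
    g = []
    for s in range(n):
        mix = [0.0] * (s + 1)
        for k, pk in enumerate(offspring):
            for m, q in enumerate(new_infection_pmf(k, s, targets)):
                mix[m] += pk * q
        g.append(mix)

    pmf = [0.0] * (n + 1)
    state = {1: 1.0}  # infected count -> probability, before anyone is processed
    for processed in range(n + 1):
        nxt = {}
        for infected, p in state.items():
            if infected == processed:
                # Every infected individual has transmitted: outbreak is over.
                pmf[infected] += p
                continue
            for m, q in enumerate(g[n - infected]):
                if q > 0.0:
                    nxt[infected + m] = nxt.get(infected + m, 0.0) + p * q
        state = nxt
    return pmf


# Example: three people, each infected individual makes 0, 1 or 2 transmissions.
print(final_size_pmf([0.2, 0.5, 0.3], 3))
```

Infected individuals are processed one at a time. This is valid for the final size because each individual's transmissions are drawn independently of the order in which they are revealed. The cost grows polynomially in N and in the length of the offspring list, so the sketch is only meant for small populations.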
If this is right
- The final size probabilities can be calculated exactly without simulation for any chosen transmission distribution.
- Maximum-likelihood estimates of transmission parameters become available from collections of independent small-population outbreaks.
- Parameter sets that produce the same reproductive number can be distinguished only when the number of observed outbreaks is large.
- The method applies directly to household or school outbreak data where population size is known and small.
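The maximum-likelihood point above can be written out as a short worked formula. The notation is an assumption of this note, not taken from the paper: M independent populations of known sizes N_j, observed final sizes y_j, and a parameter vector θ for the transmission distribution.

```latex
\[
  \ell(\theta) \;=\; \sum_{j=1}^{M} \log P\bigl(Y_j = y_j \mid N_j, \theta\bigr),
  \qquad
  \hat{\theta} \;=\; \arg\max_{\theta}\, \ell(\theta).
\]
```

Each term is one evaluation of the final-size probability mass function, so independence between populations is what turns the likelihood into a simple sum of logs. If two parameter vectors give nearly the same final-size probabilities for every N_j, their log-likelihoods differ by only a small amount per observation, which is consistent with the need for many outbreaks to tell them apart.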
Where Pith is reading between the lines
- The same expression could be used to test whether a given offspring distribution is consistent with data before fitting more complex models.
- In the limit of many small populations, the approach may yield tighter bounds on the variance of the transmission distribution than aggregate data from one large population.
- The identifiability warning implies that reproductive number alone is an insufficient summary statistic when only final sizes are observed.
Load-bearing premise
The process is a standard SIR epidemic in a closed finite population where each infected individual draws its number of transmissions independently from the same fixed distribution.
What would settle it
If the empirical distribution of final outbreak sizes across many small populations deviates from the probabilities predicted by the derived expression for any choice of transmission parameters, the claimed formula would be falsified.
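A check of this kind needs an empirical final-size distribution to compare against. The following is a minimal Monte Carlo sketch, assuming one initial infection and transmissions aimed uniformly at random at the other N - 1 individuals (the targeting rule is an assumption, as are the hypothetical names `simulate_final_size` and `empirical_pmf`). It requires N of at least 2 whenever transmissions can occur.

```python
import random
from collections import Counter


def simulate_final_size(draw_offspring, n, rng):
    """Simulate one outbreak and return the number of individuals ever infected.

    draw_offspring(rng) returns the number of transmissions made by one
    infected individual; individual 0 is the single initial infection.
    """
    infected = {0}
    to_process = [0]
    while to_process:
        u = to_process.pop()
        for _ in range(draw_offspring(rng)):
            # Pick a target uniformly among the other n - 1 individuals.
            v = rng.randrange(n - 1)
            if v >= u:
                v += 1
            if v not in infected:
                infected.add(v)
                to_process.append(v)
    return len(infected)


def empirical_pmf(draw_offspring, n, runs, seed=0):
    """Relative frequency of each final size 0..n over `runs` simulations."""
    rng = random.Random(seed)
    counts = Counter(
        simulate_final_size(draw_offspring, n, rng) for _ in range(runs)
    )
    return [counts.get(size, 0) / runs for size in range(n + 1)]


# Example: each infected individual makes 0, 1 or 2 transmissions.
def draw(rng):
    return rng.choices([0, 1, 2], weights=[0.2, 0.5, 0.3])[0]


print(empirical_pmf(draw, n=5, runs=10_000))
```

The resulting frequencies can be set against predicted probabilities with a goodness-of-fit test. Real outbreak data would replace the simulated sizes in an actual falsification attempt.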
Original abstract
We consider the spread of a Susceptible-Infected-Recovered (SIR) disease through finite populations and derive an expression for the final size distribution. Our derivation allows arbitrary distributions of the number of transmissions caused by an infected individual. We show how this calculation can be used to infer parameters of the infectious disease through observations in multiple small populations. The inference suffers from some identifiability difficulties, and it requires many observations to distinguish between parameter combinations that correspond to the same reproductive number.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript derives an exact expression for the final size distribution of an SIR epidemic in a closed finite population, where each infected individual independently draws its number of secondary infections from an arbitrary but fixed distribution. It further shows how this distribution can be used to perform parameter inference from final size observations across multiple small populations, while noting identifiability challenges when distinguishing parameter sets with the same basic reproductive number.
Significance. If the derivation is correct, the result supplies a general, non-Poisson framework for exact final-size probabilities in finite populations that directly supports inference from small-population data. The explicit allowance for arbitrary offspring distributions and the clear statement of the reproductive-number identifiability limitation are both strengths; the work therefore supplies a usable computational tool rather than an approximation.
minor comments (3)
- [Abstract] The abstract states that the derivation 'allows arbitrary distributions' but does not indicate whether the final expression is given in closed form, as a recursion, or via generating functions; a single sentence clarifying the computational representation would help readers.
- The inference section would benefit from an explicit statement of the likelihood function or the numerical procedure used to obtain posterior distributions over parameters, even if only in a short paragraph or appendix.
- Notation for the offspring distribution (e.g., p_k versus the probability generating function) should be introduced once and used consistently in all subsequent equations and figures.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our manuscript and for recommending minor revision. The referee accurately captures the main contributions, including the exact final-size distribution for arbitrary offspring distributions and the identifiability issues for inference when reproductive numbers coincide. No specific major comments were provided in the report.
Circularity Check
No significant circularity; the derivation is an independent first-principles result.
The paper derives the final-size distribution for a standard SIR process in closed finite populations where each infected individual draws transmissions independently from an arbitrary fixed distribution. This is a direct combinatorial/probabilistic calculation from the model definition, with no reduction of the claimed distribution to a fitted quantity, no self-citation load-bearing the central result, and no ansatz or uniqueness theorem imported from prior author work. The inference application is presented with explicit identifiability caveats rather than as a prediction forced by construction. The derivation is therefore self-contained against the stated modeling assumptions.
Axiom & Free-Parameter Ledger
free parameters (1)
- reproductive number
axioms (1)
- domain assumption: SIR process in closed finite populations with independent transmissions drawn from a fixed arbitrary distribution