arxiv: 2606.19405 · v2 · pith:CJPBEUDKnew · submitted 2026-06-17 · 🧬 q-bio.QM · math.DS · q-bio.PE

Multi-type branching inference on contact trees with application to COVID-19

Pith reviewed 2026-06-26 18:12 UTC · model grok-4.3

classification 🧬 q-bio.QM · math.DS · q-bio.PE
keywords transmission tree, contact tracing, SIR model, likelihood inference, COVID-19, branching process, multi-type model, epidemiological parameters

The pith

Closed-form ODEs for unobserved clades and sampled-tip densities yield a likelihood for stochastic SIR on contact trees.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives ordinary differential equations that compute the probability a clade remains entirely unobserved and the probability density that the clade produces an observed tip in a given state, under a multi-type branching process on a rooted contact tree where each infected individual is defined by its total effective contacts and already-infected downstream contacts. These equations produce an evaluable likelihood for trees with known tip states and extend to partially resolved trees by integrating over latent internal branching times. The resulting framework is validated on simulated outbreaks, recovering parameters accurately with calibrated uncertainty, and is applied to COVID-19 contact-tracing data from Karnataka, India, to infer both transmission rates and contact heterogeneity without sequence data.

Core claim

The authors obtain closed-form ordinary differential equations for the probability that a clade goes entirely unobserved and for the probability density that it produces an observed tip in a given state under a stochastic SIR process on a rooted contact tree; the resulting likelihood can be evaluated directly for trees with known tip states and extended to partially resolved trees by treating internal branching times as latent variables.

What carries the argument

Closed-form ODEs for the probability of an unobserved clade and the density of a sampled tip within the multi-type SIR branching process on contact trees with per-individual contact counts.
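
As a rough illustration of the kind of system described above, the sketch below integrates a backward ODE for the probability that a lineage and all of its descendants go unobserved. Everything specific here is an assumption made for illustration, not the paper's equations: a fixed degree `k` shared by all individuals, an infection rate `beta * (k - i)` for an individual with `i` already-infected contacts, a removal rate `gamma`, an observation probability `p_obs` at removal, and newborn lineages that start in type 0. The function name `unobserved_probs` and the parameter values are hypothetical.

```python
import math

def unobserved_probs(beta, gamma, p_obs, k, t_end, n_steps=2000):
    """Integrate E_i(t), i = 0..k, with the classical RK4 scheme.

    Assumed (illustrative) backward equations, with E_i(0) = 1:
        dE_i/dt = -(beta*(k-i) + gamma) * E_i
                  + gamma * (1 - p_obs)
                  + beta*(k-i) * E_{i+1} * E_0
    For i = k the infection term vanishes because no contacts remain.
    """
    def rhs(E):
        out = []
        for i in range(k + 1):
            lam = beta * (k - i)  # assumed infection rate in type i
            birth = lam * E[i + 1] * E[0] if i < k else 0.0
            out.append(-(lam + gamma) * E[i] + gamma * (1.0 - p_obs) + birth)
        return out

    h = t_end / n_steps
    E = [1.0] * (k + 1)  # at time 0 nothing has been observed yet
    for _ in range(n_steps):
        k1 = rhs(E)
        k2 = rhs([e + 0.5 * h * d for e, d in zip(E, k1)])
        k3 = rhs([e + 0.5 * h * d for e, d in zip(E, k2)])
        k4 = rhs([e + h * d for e, d in zip(E, k3)])
        E = [e + h / 6.0 * (a + 2 * b + 2 * c + d)
             for e, a, b, c, d in zip(E, k1, k2, k3, k4)]
    return E

# Hypothetical parameter values, chosen only to exercise the code.
E = unobserved_probs(beta=0.4, gamma=1.0, p_obs=0.5, k=4, t_end=3.0)

# Sanity check: in type k the equation is linear and has the solution
# E_k(t) = (1 - p_obs) + p_obs * exp(-gamma * t).
exact_last = (1 - 0.5) + 0.5 * math.exp(-1.0 * 3.0)
```

Under these assumptions the system has `k + 1` coupled equations, so the cost of one likelihood evaluation grows with the largest degree considered. The last component serves as a convenient correctness check because it decouples from the others.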

If this is right

  • The likelihood evaluates exactly for any rooted contact tree whose tip states are known.
  • Treating internal branching times as latent variables extends the same likelihood to partially resolved trees.
  • Parameters are recovered accurately from simulated outbreaks, with well-calibrated posterior uncertainty.
  • Application to Karnataka COVID-19 contact-tracing data infers both transmission dynamics and contact-structure heterogeneity.
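
The calibration claim in the list above is the kind of statement that can be checked by repeated simulate-and-fit experiments. The sketch below shows one common way to do this: draw a true parameter from the prior, simulate data, compute a credible interval, and count how often the interval contains the truth. The model here is a deliberately simple stand-in (exponential removal times with a conjugate Gamma prior), not the paper's contact-tree likelihood, and the name `coverage_check` and all numerical settings are hypothetical.

```python
import random

def coverage_check(n_reps=200, n_obs=20, level=0.9, n_draws=400, seed=1):
    """Fraction of replicates whose credible interval contains the truth.

    Toy model (an assumption, not the paper's): removal times are
    Exponential(rate) with a Gamma(shape=2, rate=2) prior on the rate,
    so the posterior is Gamma(2 + n_obs, rate = 2 + sum of times).
    """
    rng = random.Random(seed)
    a, b = 2.0, 2.0
    hits = 0
    for _ in range(n_reps):
        # random.gammavariate takes (shape, scale), hence 1 / rate.
        true_rate = rng.gammavariate(a, 1.0 / b)
        times = [rng.expovariate(true_rate) for _ in range(n_obs)]
        post_shape = a + n_obs
        post_scale = 1.0 / (b + sum(times))
        # Approximate the central credible interval by posterior sampling.
        draws = sorted(rng.gammavariate(post_shape, post_scale)
                       for _ in range(n_draws))
        lo = draws[int((1 - level) / 2 * n_draws)]
        hi = draws[int((1 + level) / 2 * n_draws) - 1]
        hits += lo <= true_rate <= hi
    return hits / n_reps

cov = coverage_check()
```

If inference is well calibrated, `cov` should land near the nominal `level`, up to Monte Carlo noise from the finite number of replicates. Replacing the toy posterior with the output of a contact-tree sampler would give the corresponding check for the framework itself.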

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same ODE construction could be reused for contact-tracing datasets of other directly transmitted pathogens.
  • The contact-heterogeneity component could be combined with sequence-based phylodynamic models to use both tree topology and genetic data.
  • Validation on real data would require checking whether reported contact trees deviate systematically from the assumed SIR branching process.

Load-bearing premise

The observed contact tree must accurately represent transmission under the stochastic SIR process in which each individual is characterized by its total effective contacts and already-infected downstream contacts.

What would settle it

Generating data under a homogeneous-mixing SIR process without individual contact counts and then fitting the derived likelihood should produce biased parameter estimates or mis-calibrated uncertainty intervals.
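
A minimal sketch of the data-generating half of that test follows: a Gillespie simulation of a homogeneous-mixing SIR outbreak that records who infected whom. The function name `simulate_sir_tree`, the population size, and the rate values are hypothetical choices for illustration; fitting the derived likelihood to the resulting tree is not shown, since the paper's implementation is not part of this page.

```python
import random

def simulate_sir_tree(n=200, beta=2.0, gamma=1.0, seed=0):
    """Gillespie simulation of a homogeneous-mixing SIR outbreak.

    Returns a list of (time, infector, infectee) transmission events.
    Individual 0 is the index case. Every infectious individual is
    equally likely to be the infector, so no per-individual contact
    counts enter the process.
    """
    rng = random.Random(seed)
    t = 0.0
    susceptible = n - 1
    infectious = [0]
    next_id = 1
    edges = []
    while infectious:
        inf_rate = beta * susceptible * len(infectious) / n
        rec_rate = gamma * len(infectious)
        total = inf_rate + rec_rate
        t += rng.expovariate(total)  # waiting time to the next event
        if rng.random() < inf_rate / total:
            infector = rng.choice(infectious)
            edges.append((t, infector, next_id))
            infectious.append(next_id)
            next_id += 1
            susceptible -= 1
        else:
            # Remove a uniformly chosen infectious individual (swap and pop).
            idx = rng.randrange(len(infectious))
            infectious[idx] = infectious[-1]
            infectious.pop()
    return edges

edges = simulate_sir_tree()
```

Because the edge list carries no contact counts, any degree structure the fitted model reports would be an artifact of misspecification, which is what the proposed test is meant to expose.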

Figures

Figures reproduced from arXiv: 2606.19405 by Augustine Okolie, Eno Akarawakc, Isaac Ajiboye, Johannes Müller.

Figure 1. Possible events for a lineage that goes unobserved. Left: no infection; the lineage recovers unobserved. Middle: no infection and no recovery; the lineage remains unobserved. Right: infection of a new lineage of type 0; the parent lineage advances by one infected contact, and both the parent and the newborn lineage go unobserved.

Figure 2. Possible events for …

Figure 3. The probability that a lineage and all of its descendants go unobserved after a given time. Thick gray lines: simulated trees for the different types. Point symbols: ODE results (square: type 0; circle: type 1; triangle pointing up: type 2; plus: type 3; cross: the remaining type). Parameter values shown in the source: 0.5, 1.5, and 4, the last being the fixed deterministic degree. Then the fixed degree system generalises to the ra…

Figure 4. The probability density …

Figure 5. Sampled tree for Case (3) with 3 observations; time runs vertically downward. The root individual runs as a single vertical stem until its first infection, which occurs at a rate that depends on the number of contacts not yet infected and advances its infected-contact count by one. A newborn (type 0, unknown degree, dashed) branches left toward tip 1; its edge uses a degree-averaged quantity because the newborn's degree is integrated out. The continuing lineage …

Figure 6. Left: parameter estimates as functions of the assumed degree (true value 4, starred). Centre: an estimate as a function of the assumed observation parameter (true value 0.5, marked). Right: theoretical equilibrium frequencies versus empirical frequencies across 753 simulated trees. Parameter values shown in the source: 0.5, 1.5, and 4, the last being the fixed deterministic degree.

Figure 7. Resolution sensitivity analysis with joint estim…

Figure 8. Posterior inference for the Karnataka COVID-19 fir…
Original abstract

Inferring epidemiological parameters from transmission trees is essential for understanding infectious disease dynamics. Existing tree-based likelihood methods, including the multi-type birth-death models originally applied in phylodynamic settings, provide powerful tools, but most assume homogeneous mixing and rarely capture how transmission potential changes as an individual infects more of their contacts. In this work, we develop a likelihood framework that operates directly on transmission trees, in which nodes are individuals and edges are reported transmission events, with no sequence data involved. We derive a likelihood for a stochastic SIR process on a rooted contact tree in which each infected individual is characterised by the total number of effective contacts and the number of already infected downstream contacts. We obtain closed-form ordinary differential equations for the probability that a clade goes entirely unobserved and for the probability density that it produces an observed (sampled) tip in a given state. The resulting likelihood can be evaluated for a rooted contact tree with known tip states, and we extend it to partially resolved trees by treating internal branching times as latent variables. Validation on simulated outbreaks confirms accurate parameter recovery and well-calibrated uncertainty. Application to empirical COVID-19 contact-tracing data from Karnataka, India, demonstrates the framework's utility for real epidemiological settings. By incorporating contact-degree heterogeneity in a multi-type branching likelihood, our work provides a principled baseline for inferring both transmission dynamics and contact structure from fully or partially resolved transmission trees, complementing rather than relying on sequence-based phylodynamic inference.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper develops a likelihood framework for inferring epidemiological parameters from transmission trees under a stochastic SIR process on rooted contact trees. Each infected individual is characterized by the total number of effective contacts and the number of already infected downstream contacts. Closed-form ODEs are derived for the probability that a clade goes entirely unobserved and for the probability density that it produces an observed (sampled) tip in a given state. The resulting likelihood applies to rooted contact trees with known tip states and is extended to partially resolved trees by integrating over latent internal branching times. Validation on simulated outbreaks shows accurate parameter recovery with well-calibrated uncertainty, and the method is applied to COVID-19 contact-tracing data from Karnataka, India.

Significance. If the derivations hold, the work supplies a computationally tractable multi-type branching-process likelihood that directly incorporates contact-degree heterogeneity into tree-based inference without sequence data. The closed-form ODEs for unobserved-clade and sampling probabilities constitute a technical strength, enabling efficient evaluation on observed or partially resolved trees. Simulation recovery and the Karnataka application illustrate practical utility as a complement to phylodynamic methods for contact-tracing datasets.

minor comments (3)
  1. [Abstract, §2] The state-space definition for the multi-type process (total contacts and downstream infections) should be stated explicitly, together with the resulting dimension of the ODE system, so that readers can assess computational scaling.
  2. [§4] In the simulation section, the reported parameter-recovery metrics would benefit from an explicit statement of the prior ranges used in the Bayesian inference and from confirmation that the contact-tree topologies were generated under the same SIR process assumed by the likelihood.
  3. [Figure captions, §5] In the figure captions and the Karnataka application, axis labels and state definitions (e.g., what constitutes an 'observed tip in a given state') should be clarified so that the plotted posterior densities can be compared directly to the model states.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript, recognition of its technical contributions, and recommendation for minor revision. The provided summary accurately reflects the scope and results. No specific major comments were listed in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The central derivation consists of obtaining closed-form ODEs for the probability a clade is unobserved and the density of producing an observed tip, starting from the multi-type branching process on a contact tree with per-individual contact counts. This follows directly from the standard Kolmogorov forward equations for the process once states are defined by total effective contacts and downstream infections; no parameter is fitted to data and then relabeled as a prediction, no self-citation supplies a uniqueness theorem or ansatz, and the latent-time extension is a standard marginalization step. The framework is therefore self-contained relative to its stated model assumptions and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Insufficient information from abstract only to identify specific free parameters, axioms, or invented entities; no equations or model details provided beyond high-level description.
