arxiv: 2405.17032 · v4 · pith:RGQRWD2Hnew · submitted 2024-05-27 · 🧬 q-bio.QM · math.PR · q-bio.PE · stat.AP

Exact phylodynamic likelihood via structured Markov genealogy processes

Pith reviewed 2026-05-24 01:31 UTC · model grok-4.3

classification 🧬 q-bio.QM · math.PR · q-bio.PE · stat.AP
keywords phylodynamics · genealogy processes · Markov models · likelihood inference · filter equations · coalescent · birth-death processes · simulation-based inference

The pith

Each member of a broad class of Markovian population models induces a unique genealogy process, allowing exact likelihood computation via a model-specific filter equation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that each member of a broad class of Markovian population models induces a unique stochastic process on genealogies. From this process, exact likelihoods for observed genealogies can be computed using a filter equation whose form is dictated by the model itself. This unifies and extends existing methods based on the coalescent and linear birth-death processes. Because the numerical algorithms need only simulations of the population model, the approach keeps the plug-and-play property that simulation-based inference depends on, while allowing more models to be fit by exact likelihood. A sympathetic reader would care because it broadens the scope of rigorous likelihood-based inference in phylodynamics.

Core claim

We show that each member of a broad class of Markovian population models induces a unique stochastic process on the space of genealogies. We construct this genealogy process and derive exact expressions for the likelihood of an observed genealogy in terms of a filter equation, the structure of which is completely determined by the population model. We show that existing phylodynamic methods based on the coalescent and linear birth-death processes are special cases. We derive some properties of filter equations and describe a class of algorithms that can be used to numerically solve them. Importantly, because these algorithms rely only on simulation of the population model, they retain the plug-and-play property upon which simulation-based inference depends.

What carries the argument

The structured Markov genealogy process induced by the population model, together with the associated filter equation whose structure is fixed by that model.

If this is right

  • Coalescent and linear birth-death phylodynamic methods are recovered as special cases of the general framework.
  • A class of algorithms exists to numerically solve the filter equations using only simulations of the population model.
  • The plug-and-play property required for simulation-based inference is retained.
  • Likelihood-based phylodynamic inference extends to a much wider class of Markovian population models.
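
The plug-and-play idea in the bullets above can be illustrated with a minimal sketch. This is not the paper's genealogy filter: it is a generic bootstrap particle filter applied to a toy problem, assuming a linear birth-death population observed through hypothetical Poisson-distributed counts. The function names (`simulate_birth_death`, `log_poisson_pmf`, `particle_filter_loglik`) and all parameter values are illustrative assumptions. The point is only that the likelihood estimate touches the population model solely through forward simulation.

```python
import math
import random


def simulate_birth_death(n, birth, death, duration, rng):
    """Gillespie simulation of a linear birth-death process over `duration`.

    `birth` and `death` are per-capita rates (birth + death must be > 0).
    Returns the population size at the end of the interval.
    """
    t = 0.0
    while n > 0:
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate((birth + death) * n)
        if t > duration:
            break
        # Choose the event type in proportion to its rate.
        if rng.random() < birth / (birth + death):
            n += 1
        else:
            n -= 1
    return n


def log_poisson_pmf(y, mu):
    """Log of the Poisson probability mass function, with the mu = 0 limit."""
    if mu <= 0.0:
        return 0.0 if y == 0 else -math.inf
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def particle_filter_loglik(obs, dt, n0, birth, death, rho,
                           n_particles=200, seed=1):
    """Bootstrap particle filter estimate of the log-likelihood.

    Assumed (hypothetical) measurement model: y ~ Poisson(rho * N) at each
    observation time, with observations spaced `dt` apart. The population
    model enters only through `simulate_birth_death`.
    """
    rng = random.Random(seed)
    particles = [n0] * n_particles
    loglik = 0.0
    for y in obs:
        # Propagate every particle by simulating the population model.
        particles = [simulate_birth_death(n, birth, death, dt, rng)
                     for n in particles]
        # Weight each particle by the measurement density of the observation.
        logw = [log_poisson_pmf(y, rho * n) for n in particles]
        m = max(logw)
        if m == -math.inf:
            return -math.inf  # no particle is compatible with the data
        w = [math.exp(lw - m) for lw in logw]
        # Conditional log-likelihood: log of the mean unnormalized weight.
        loglik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik


if __name__ == "__main__":
    print(particle_filter_loglik([6, 8, 9, 12], dt=1.0, n0=10,
                                 birth=1.0, death=0.5, rho=0.5))
```

Swapping in a different population model would mean replacing only the simulator function; the weighting and resampling steps are unchanged, which is the sense in which such methods are called plug-and-play.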

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The filter approach could integrate with models having time-varying or spatially structured parameters that were previously limited to approximation.
  • Numerical solvers based on the filter might support faster inference pipelines in real-time epidemiological applications.
  • Links to filtering techniques in other stochastic modeling domains could yield new efficient computational strategies.

Load-bearing premise

The population model must be Markovian so that the induced genealogy process is uniquely determined and the filter equation is well-defined.

What would settle it

A counterexample showing that two distinct Markovian population models induce the same genealogy process, or that the likelihood of some genealogy cannot be recovered from the claimed filter equation.

Original abstract

We show that each member of a broad class of Markovian population models induces a unique stochastic process on the space of genealogies. We construct this genealogy process and derive exact expressions for the likelihood of an observed genealogy in terms of a filter equation, the structure of which is completely determined by the population model. We show that existing phylodynamic methods based on the coalescent and linear birth-death processes are special cases. We derive some properties of filter equations and describe a class of algorithms that can be used to numerically solve them. Importantly, because these algorithms rely only on simulation of the population model, they retain the plug-and-play property upon which simulation-based inference depends. Our results open the door to statistically efficient likelihood-based phylodynamic inference for a much wider class of models than is currently possible.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that each member of a broad class of Markovian population models induces a unique stochastic process on the space of genealogies. It constructs this process and derives exact likelihood expressions for an observed genealogy in the form of a filter equation whose structure is fixed by the underlying population model. The construction recovers the coalescent and linear birth-death processes as special cases, derives properties of the filter equations, and supplies a class of simulation-based algorithms that preserve the plug-and-play property.

Significance. If the derivations are correct, the result supplies a unifying exact-likelihood framework for phylodynamic inference that extends well beyond the two model classes currently in routine use. The retention of simulation-only numerical methods is a concrete strength that directly supports the plug-and-play inference pipeline emphasized in the abstract.
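
As a concrete reference point for one of the special cases named in the summary, the sketch below evaluates the classical log density of a genealogy under the constant-size Kingman coalescent. It is the textbook formula, not the paper's filter equation. The function name `kingman_loglik`, the restriction to samples all taken at time 0, and the time scaling (each pair of lineages coalesces at rate `1 / pop_size`) are assumptions made for illustration.

```python
import math


def kingman_loglik(coalescent_times, n_samples, pop_size):
    """Log density of a labeled genealogy under the Kingman coalescent.

    coalescent_times: increasing times, measured backward from the sampling
        time 0, of the n_samples - 1 coalescent events.
    pop_size: constant population size, in units where each pair of
        lineages coalesces at rate 1 / pop_size (an assumed scaling).
    """
    if len(coalescent_times) != n_samples - 1:
        raise ValueError("expected n_samples - 1 coalescent times")
    loglik = 0.0
    previous = 0.0
    k = n_samples  # number of extant lineages
    for t in coalescent_times:
        if t < previous:
            raise ValueError("coalescent times must be increasing")
        pairs = k * (k - 1) / 2
        # No coalescence among `pairs` pairs during the interval, then one
        # specific pair coalesces at rate 1 / pop_size.
        loglik += -math.log(pop_size) - pairs * (t - previous) / pop_size
        previous = t
        k -= 1
    return loglik


if __name__ == "__main__":
    # Three samples coalescing at times 0.5 and 1.5, with pop_size = 2.
    print(kingman_loglik([0.5, 1.5], n_samples=3, pop_size=2.0))
```

A general framework that claims the coalescent as a special case should reproduce this value for the corresponding population model, which makes it a convenient numerical check.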

major comments (2)
  1. [§3] (filter-equation derivation): the uniqueness claim for the induced genealogy process is stated to follow directly from the Markov property of the population model, but the argument that the transition kernel on the genealogy space remains Markovian (and therefore yields a well-defined filter) is not shown explicitly; an explicit verification that the generator preserves the required conditional-independence structure would make this central step demonstrated rather than asserted.
  2. [Eq. (filter equation for linear birth-death)]: the reduction to the known birth-death likelihood is asserted as a special case, yet the precise mapping of the population-model rates into the filter coefficients is not displayed; without this step it is impossible to confirm that no auxiliary assumptions are introduced when recovering the classical result.
minor comments (2)
  1. The abstract states that 'some properties of filter equations' are derived; a short dedicated subsection listing these properties (e.g., linearity, invariance under time reversal) would improve readability.
  2. Notation for the state space of the genealogy process is introduced without an explicit comparison table to the state spaces used in the coalescent and birth-death literature; adding such a table would clarify the generality claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive report and positive assessment of the work's potential. We address each major comment below and will revise the manuscript to incorporate the requested clarifications.

Point-by-point responses
  1. Referee: [§3] (filter-equation derivation): the uniqueness claim for the induced genealogy process is stated to follow directly from the Markov property of the population model, but the argument that the transition kernel on the genealogy space remains Markovian (and therefore yields a well-defined filter) is not shown explicitly; an explicit verification that the generator preserves the required conditional-independence structure would make this central step demonstrated rather than asserted.

    Authors: We agree that an explicit verification of the Markov property would make the central construction more transparent. In the revised manuscript we will add a dedicated lemma that computes the action of the generator on cylinder functions of the genealogy process and verifies that the required conditional-independence structure is preserved, thereby confirming that the induced process is Markovian and that the filter equation is rigorously well-defined. revision: yes

  2. Referee: [Eq. (filter equation for linear birth-death)]: the reduction to the known birth-death likelihood is asserted as a special case, yet the precise mapping of the population-model rates into the filter coefficients is not displayed; without this step it is impossible to confirm that no auxiliary assumptions are introduced when recovering the classical result.

    Authors: We accept that the explicit substitution of rates is needed for verification. The revised version will contain a short subsection that substitutes the linear birth-death birth and death rates into the general filter coefficients, shows term-by-term cancellation, and recovers the classical birth-death likelihood exactly, with no auxiliary assumptions introduced. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The paper derives exact likelihood expressions for genealogies induced by Markovian population models via a filter equation whose structure is fixed by the model definition. The uniqueness of the induced genealogy process follows directly from the Markov property as the explicit scope of the result, recovering known coalescent and birth-death cases as special instances without any reduction of outputs to fitted inputs, self-citations, or imported uniqueness theorems. No load-bearing step equates a prediction or derived quantity to its own inputs by construction; the construction is self-contained in the standard theory of Markov processes.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The construction rests on the domain assumption that each member of a broad class of Markovian population models induces a unique genealogy process; no free parameters or invented entities are mentioned in the abstract.

axioms (1)
  • domain assumption: Markovian population models induce a unique stochastic process on the space of genealogies.
    Invoked in the first sentence of the abstract as the starting point for constructing the genealogy process and filter equation.

pith-pipeline@v0.9.0 · 5671 in / 1231 out tokens · 21363 ms · 2026-05-24T01:31:25.707202+00:00 · methodology

