pith. machine review for the scientific record.

arxiv: 2604.17926 · v1 · submitted 2026-04-20 · 🧬 q-bio.PE

Recognition: unknown

Information on hidden birth events restores identifiability in phylodynamic inference


Pith reviewed 2026-05-10 03:40 UTC · model grok-4.3

classification 🧬 q-bio.PE
keywords phylodynamics · birth-death processes · parameter identifiability · phylogenetic trees · hidden birth events · mutation accumulation · time-dependent rates

The pith

Information on hidden birth events restores identifiability to time-dependent birth-death models from phylogenetic trees.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Birth-death process parameters cannot be inferred uniquely from phylogenetic trees alone because infinitely many rate combinations produce the same tree distributions, especially when rates vary over time. The paper demonstrates that adding data on hidden birth events along the branches of the reconstructed tree recovers unique parameter identification even in the most general time-dependent cases. This holds whether sampling occurs at one time point or continuously at a time-dependent rate. When mutations accumulate at birth under either of two specified models, sequence data itself encodes the locations of those hidden births, making all parameters identifiable whenever mutation-at-birth mechanisms are plausible.
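The non-identifiability described above can be illustrated with a well-known constant-rate special case, which is simpler than the time-dependent setting the paper treats. The sketch below assumes constant birth rate `lam`, death rate `mu`, and a probability `rho` of sampling each lineage alive at the present. Under those assumptions each branching time at age t contributes a factor lam · p1(t) to the reconstructed-tree likelihood, where p1(t) is the probability that a lineage at age t has exactly one sampled descendant. The function name and parameter values are illustrative only and do not come from the paper.

```python
import math

def lambda_p1(t, lam, mu, rho):
    """Return lam * p1(t) for a constant-rate birth-death process.

    p1(t) is the probability that a lineage alive at age t (time before
    the present) has exactly one sampled descendant, when each extant
    lineage is sampled independently with probability rho.
    """
    r = lam - mu
    e = math.exp(-r * t)
    p1 = rho * r**2 * e / (rho * lam + (lam * (1.0 - rho) - mu) * e) ** 2
    return lam * p1

# Two different parameter sets sharing lam - mu = 2 and lam * rho = 1.5.
params_a = dict(lam=3.0, mu=1.0, rho=0.5)
params_b = dict(lam=2.5, mu=0.5, rho=0.6)

ages = [0.1, 0.5, 1.0, 2.0]
values_a = [lambda_p1(t, **params_a) for t in ages]
values_b = [lambda_p1(t, **params_b) for t in ages]
for t, a, b in zip(ages, values_a, values_b):
    print(f"age {t}: {a:.12f} vs {b:.12f}")
```

Both parameter sets give the same factor at every age, because the expression depends on the parameters only through `lam - mu` and `lam * rho`. The tree alone therefore cannot separate them.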

Core claim

Parameters of birth-death processes cannot be inferred uniquely from phylogenetic trees alone since infinitely many parameter combinations yield the same distribution of phylogenetic trees. When additional information on hidden birth events along branches of the reconstructed tree is available, parameter identifiability is recovered even for the most general cases of time-dependent rates. This holds both for models in which individuals are sampled at a single point in time and for models in which they are sampled through time at a time-dependent rate. Moreover, when mutations occur at birth under two different models for the accumulation of mutations at a birth event, information about hidden birth events is available in the se

What carries the argument

Information on hidden birth events along branches of the reconstructed phylogenetic tree, which becomes extractable from sequences under models where mutations accumulate at birth.

If this is right

  • All parameters of time-dependent birth-death models become uniquely identifiable with information on hidden birth events.
  • Sequence data supplies the hidden birth information under either of the two mutation-accumulation-at-birth models.
  • The identifiability result applies equally to single-time sampling and to sampling through time at varying rates.
  • Phylodynamic inference is identifiable in any setting where mutation accumulation at birth events such as transmission or cell division is a plausible mechanism.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Real-world applications such as epidemiological modeling could use inferred hidden transmissions from genetic sequences to tighten estimates of time-varying transmission rates.
  • The same logic might extend to other branching processes where hidden events can be marked by sequence changes, such as cell divisions in tumor phylogenies.
  • Simulations that inject known hidden births into trees with time-varying rates would provide direct numerical checks on the recovered parameter uniqueness.
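The simulation check suggested in the last point could start from a forward simulator that records hidden births as it goes. The following is a minimal sketch, not the paper's procedure. It assumes every lineage alive at the end time is sampled, that the caller supplies a constant `rate_bound` at least as large as the sum of the birth and death rates at every time, and that time-varying rates are handled by thinning (candidate events at the bound rate, accepted in proportion to the true rates). All names are hypothetical.

```python
import math
import random

def simulate_lineage(t, t_end, birth_rate, death_rate, rate_bound, rng):
    """Follow one lineage from time t to t_end using thinning.

    Returns (tips, observed, hidden): descendants alive at t_end, branchings
    visible in the reconstructed tree, and hidden births along its branches.
    Assumes birth_rate(s) + death_rate(s) <= rate_bound for all s.
    """
    while True:
        t += rng.expovariate(rate_bound)        # next candidate event
        if t >= t_end:
            return 1, 0, 0                      # alive at the sampling time
        u = rng.random() * rate_bound
        lam, mu = birth_rate(t), death_rate(t)
        if u < lam:                             # birth: simulate both children
            a = simulate_lineage(t, t_end, birth_rate, death_rate, rate_bound, rng)
            b = simulate_lineage(t, t_end, birth_rate, death_rate, rate_bound, rng)
            if a[0] > 0 and b[0] > 0:           # both sides sampled: visible branching
                return a[0] + b[0], a[1] + b[1] + 1, a[2] + b[2]
            if a[0] > 0 or b[0] > 0:            # one side lost: hidden birth
                s = a if a[0] > 0 else b
                return s[0], s[1], s[2] + 1
            return 0, 0, 0                      # whole clade lost
        if u < lam + mu:                        # death
            return 0, 0, 0
        # otherwise the candidate is rejected and the lineage continues

rng = random.Random(1)
birth = lambda s: 1.0 + 0.5 * math.sin(s)       # hypothetical time-varying birth rate
death = lambda s: 0.3                           # hypothetical constant death rate
tips, observed, hidden = simulate_lineage(0.0, 2.0, birth, death, 1.8, rng)
print(tips, observed, hidden)
```

Repeating the run over many seeds and over two rate histories that are indistinguishable from trees alone would show whether the hidden-birth counts separate them. The recursion is only suitable for short time horizons, since the number of lineages grows quickly.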

Load-bearing premise

That information on hidden birth events is either directly available or can be extracted from sequences under the two specific models for mutation accumulation at birth.

What would settle it

A demonstration that two different sets of time-dependent birth and death rate functions still produce identical distributions over phylogenetic trees even when the locations of all hidden birth events are known would falsify the claim.
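This falsification condition can be written compactly. The notation is an assumption made here for illustration, not taken from the paper: \mathcal{T} is the reconstructed tree, H the hidden birth events along its branches, and P_{\lambda,\mu} the joint distribution of both under rate functions \lambda(t) and \mu(t).

```latex
\exists\,(\lambda_1,\mu_1)\neq(\lambda_2,\mu_2)
\quad\text{such that}\quad
P_{\lambda_1,\mu_1}(\mathcal{T},H) \;=\; P_{\lambda_2,\mu_2}(\mathcal{T},H).
```

Identifiability is the negation of this statement: the map (\lambda,\mu) \mapsto P_{\lambda,\mu} is injective on the class of admissible rate functions.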

Original abstract

The parameters of many classes of birth-death processes cannot be inferred uniquely from phylogenetic trees: infinitely many parameter combinations yield the same distribution of phylogenetic trees. Here, we show that parameter identifiability can be recovered even for the most general cases of time-dependent rates when additional information on hidden birth events along branches of the reconstructed tree is available. This holds both for models in which individuals are sampled at a single point in time or through time at a time-dependent rate. Moreover, we prove that when mutations occur at birth - assuming two different models for the accumulation of mutations at a birth event - then information about hidden birth events is available in the sequences and thus all parameters of time-dependent birth-death models become identifiable. Thus, phylodynamic inference is identifiable whenever evolutionary models with mutation accumulation at birth (such as at speciation, transmission, or cell division) are plausible.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that information on hidden birth events along branches of reconstructed phylogenetic trees restores identifiability of parameters in birth-death processes, even for the most general time-dependent rate functions. This holds for both single-time and time-dependent sampling. It further proves that, under two models of mutation accumulation at birth events, sequence data encode the hidden-birth counts, rendering all parameters of time-dependent birth-death models identifiable whenever mutation-at-birth mechanisms are plausible.

Significance. If the proofs hold, the result is significant for phylodynamic inference: it directly resolves non-identifiability that arises for arbitrary time-dependent rates in standard birth-death likelihoods by augmenting the data with hidden-birth multiplicity. The explicit treatment of mutation-at-birth models (speciation, transmission, cell division) supplies a concrete, sequence-based route to identifiability without requiring external data. The provision of mathematical proofs for both the hidden-birth restoration and the mutation-at-birth extraction is a clear strength.

major comments (2)
  1. [Abstract] Abstract and introduction: the claim that identifiability is restored 'even for the most general cases of time-dependent rates' is load-bearing for the central result, yet the manuscript does not state the regularity conditions (e.g., continuity, measurability, or local boundedness) required on λ(t) and μ(t) for the mapping from rates to the distribution of trees augmented by per-branch hidden-birth counts to be injective. Without these, the integral equations relating observed branching times and hidden-birth multiplicities may admit distinct solutions.
  2. [Mutation-at-birth models] Mutation-at-birth section: the two models for mutation accumulation at birth are invoked to extract hidden-birth information from sequences, but the manuscript must explicitly derive the likelihood contribution of the observed sequence patterns under each model and show that the hidden-birth multiplicity is recoverable without additional constraints on the mutation process.
minor comments (1)
  1. Ensure that all rate functions and sampling intensities are defined with consistent notation across the main text and appendices.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough review and valuable suggestions that help clarify and strengthen our results. We address each major comment point by point below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] Abstract and introduction: the claim that identifiability is restored 'even for the most general cases of time-dependent rates' is load-bearing for the central result, yet the manuscript does not state the regularity conditions (e.g., continuity, measurability, or local boundedness) required on λ(t) and μ(t) for the mapping from rates to the distribution of trees augmented by per-branch hidden-birth counts to be injective. Without these, the integral equations relating observed branching times and hidden-birth multiplicities may admit distinct solutions.

    Authors: We agree that explicitly stating regularity conditions improves rigor. Our proofs rely on λ(t) and μ(t) being positive, continuous, and integrable over finite intervals, which ensures the Volterra-type integral equations admit unique solutions and the mapping is injective. In the revision we will add these assumptions to the abstract, introduction, and theorem statements, noting they are standard and mild for birth-death models. revision: yes

  2. Referee: [Mutation-at-birth models] Mutation-at-birth section: the two models for mutation accumulation at birth are invoked to extract hidden-birth information from sequences, but the manuscript must explicitly derive the likelihood contribution of the observed sequence patterns under each model and show that the hidden-birth multiplicity is recoverable without additional constraints on the mutation process.

    Authors: We will expand the mutation-at-birth section with explicit likelihood derivations. For the Poisson-at-birth model the observed mutations on a branch follow a compound Poisson distribution; the likelihood is a sum over possible hidden-birth counts k, and we show the MLE for k is unique given the per-birth mutation rate. For the single-mutation-per-birth model the count equals the observed mutations exactly. Both derivations recover the multiplicity directly from the model without further constraints. revision: yes
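The second response can be made concrete with a small sketch of the Poisson-at-birth case. It rests on assumptions that are not spelled out in the text above: each birth on a branch independently adds a Poisson(`nu`) number of mutations, no other mutations occur, and so the total for `k` births is Poisson(`k * nu`). The function names, the search bound `k_max`, and the example numbers are hypothetical.

```python
import math

def log_lik_hidden_births(k, m, nu):
    """Log-probability of m mutations given k births on a branch.

    Assumes each birth adds Poisson(nu) mutations independently, so the
    total is Poisson(k * nu). With k = 0 only m = 0 is possible.
    """
    if k == 0:
        return 0.0 if m == 0 else -math.inf
    mean = k * nu
    return m * math.log(mean) - mean - math.lgamma(m + 1)

def mle_hidden_births(m, nu, k_max=200):
    """Integer k in [0, k_max] that maximises the log-likelihood (first one on ties)."""
    return max(range(k_max + 1), key=lambda k: log_lik_hidden_births(k, m, nu))

k_hat = mle_hidden_births(m=6, nu=2.0)
print(k_hat)  # → 3
```

With 6 observed mutations and an assumed mean of 2 per birth, the likelihood peaks at k = 3, matching the intuition that k should be close to m / nu. The sketch does not address how the per-birth rate would be estimated, or whether the birth that ends the branch contributes mutations.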

Circularity Check

0 steps flagged

Direct proof of identifiability from birth-death likelihood definitions; no circular reductions

full rationale

The paper derives identifiability by showing that the augmented likelihood (reconstructed tree plus per-branch hidden-birth counts) is injective in the rate functions λ(t), μ(t) for both single-time and time-dependent sampling. This follows from the standard integral equations of the birth-death process and the explicit mapping from rates to branching times and hidden-event multiplicities; the proof does not fit any parameter to data, rename a known result, or invoke a self-citation whose content is itself unverified. The two mutation-at-birth models are likewise treated by direct substitution into the sequence likelihood, again without circularity. No load-bearing step reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard properties of birth-death processes and sampling models in phylodynamics; no free parameters are introduced because the result is an identifiability proof rather than a fit. No new entities are postulated.

axioms (2)
  • domain assumption Birth-death processes with time-dependent rates generate phylogenetic trees whose distribution is determined by the birth and death rate functions and the sampling process.
    Invoked throughout the abstract as the modeling framework whose identifiability is analyzed.
  • domain assumption Hidden birth events along branches affect the observed tree distribution in a way that can be conditioned upon to restore uniqueness.
    Core to the first part of the claim; standard but load-bearing for the identifiability restoration.

pith-pipeline@v0.9.0 · 5443 in / 1481 out tokens · 34385 ms · 2026-05-10T03:40:21.870252+00:00 · methodology

