arxiv: 2510.01484 · v5 · submitted 2025-10-01 · 🧬 q-bio.PE · q-bio.QM

Bayesian Re-Analysis of the Phylogenetic Topology of Early SARS-CoV-2 Case Sequences

Pith reviewed 2026-05-18 11:14 UTC · model grok-4.3

keywords SARS-CoV-2 · phylogenetic analysis · Bayesian reasoning · virus introductions · molecular phylogeny · pandemic origins · single introduction · two introductions

The pith

Correcting a fundamental error in Bayesian reasoning reverses the conclusion on early SARS-CoV-2 introductions, favoring one over two.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper re-analyzes the Bayesian phylogenetic study of early SARS-CoV-2 case sequences by Pekar et al. (2022). The original analysis concluded that two successful introductions to humans were more probable than one. After identifying and correcting a basic mistake in the application of Bayesian reasoning, the paper finds that the same published results give a higher likelihood to a single introduction. The question matters because the number of introductions shapes our understanding of the virus's initial jump from animals and its early human spread. The work adds no new data; it applies the correction directly to the published results.

Core claim

After correcting a fundamental error in Bayesian reasoning, the results of Pekar et al. (2022) give a larger likelihood for a single introduction than for two.

What carries the argument

The corrected Bayesian posterior comparison of phylogenetic topologies supporting one versus two successful introductions.
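
For orientation, a posterior comparison of this kind is usually written in the standard odds form of Bayes' theorem. The symbols here are labels chosen for illustration and are not taken from either paper: I_1 and I_2 stand for the one- and two-introduction hypotheses, and T for the observed phylogenetic topology.

```latex
\frac{P(I_1 \mid T)}{P(I_2 \mid T)}
  \;=\;
  \underbrace{\frac{P(T \mid I_1)}{P(T \mid I_2)}}_{\text{likelihood ratio (Bayes factor)}}
  \times
  \underbrace{\frac{P(I_1)}{P(I_2)}}_{\text{prior odds}}
```

With equal prior odds, the posterior odds equal the likelihood ratio, so the direction of the conclusion depends entirely on how the two likelihoods are computed and conditioned.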

If this is right

  • The data now assign higher probability to a single successful human introduction than to two.
  • This reverses the direction of the original conclusion about the number of introductions.
  • Models of early pandemic spread should incorporate the updated relative likelihoods from the corrected analysis.
  • The phylogenetic evidence alone does not support two introductions as the more likely scenario.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar Bayesian setups in other pathogen origin studies could be checked for the same reasoning step.
  • Collecting additional early case sequences might allow a direct test of whether the corrected likelihoods hold with more data.
  • The result suggests that conclusions about introduction counts can shift with precise handling of probability updates even when the underlying tree topologies remain unchanged.

Load-bearing premise

The assumption that the identified error is the only material flaw in the original Bayesian setup and that the re-calculation applies the correction without introducing new modeling choices or data exclusions that themselves affect the single-versus-two comparison.

What would settle it

Re-running the original Bayesian calculation on the early SARS-CoV-2 sequence data with the corrected treatment of priors or likelihoods and directly comparing the resulting probabilities for one versus two introductions.
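
A minimal sketch of what such a direct comparison looks like once the two likelihoods are in hand. The function names are hypothetical and the numeric inputs are placeholders chosen only to exercise the code; they are not values from Pekar et al. (2022) or from the re-analysis.

```python
def posterior_probabilities(likelihoods, priors):
    """Normalize likelihood * prior over mutually exclusive hypotheses."""
    if set(likelihoods) != set(priors):
        raise ValueError("likelihoods and priors must cover the same hypotheses")
    unnormalized = {h: likelihoods[h] * priors[h] for h in likelihoods}
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("at least one hypothesis needs positive weight")
    return {h: weight / total for h, weight in unnormalized.items()}


def bayes_factor(likelihoods, numerator, denominator):
    """Likelihood ratio of one hypothesis against another."""
    return likelihoods[numerator] / likelihoods[denominator]


# PLACEHOLDER inputs (assumptions for illustration, not published values):
# probability of the observed topology under each hypothesis.
likelihoods = {"one_introduction": 0.02, "two_introductions": 0.01}
# Equal prior weight on the two hypotheses, also an assumption.
priors = {"one_introduction": 0.5, "two_introductions": 0.5}

posteriors = posterior_probabilities(likelihoods, priors)
ratio = bayes_factor(likelihoods, "one_introduction", "two_introductions")

for hypothesis, probability in posteriors.items():
    print(f"{hypothesis}: {probability:.3f}")
print(f"Bayes factor (one vs two): {ratio:.2f}")
```

Substituting the likelihoods from the original and the corrected calculations in turn would give the side-by-side comparison; everything else in the sketch stays fixed, which isolates the effect of the correction.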

Original abstract

A much-cited 2022 paper by Pekar et al. claimed that Bayesian analysis of the molecular phylogeny of early SARS-CoV-2 cases indicated that it was more likely that two successful introductions to humans had occurred than that just one had. Here I show that after correcting a fundamental error in Bayesian reasoning the results in that paper give larger likelihood for a single introduction than for two.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript re-analyzes the Bayesian phylogenetic comparison of single versus two successful introductions of early SARS-CoV-2 from Pekar et al. (2022). It identifies a fundamental error in the application of Bayesian reasoning and asserts that, once corrected while holding the original data and model fixed, the likelihood favors a single introduction over two.

Significance. If the claimed reversal is shown to arise solely from the identified Bayesian correction without new modeling choices, the result would be significant for the interpretation of SARS-CoV-2 origins and for the correct use of Bayesian model comparison in phylogenetic studies of viral emergence. The work draws attention to a potential systematic issue in how posterior probabilities are compared across introduction scenarios.

major comments (2)
  1. The central claim requires an explicit side-by-side derivation or numerical comparison showing that the corrected likelihood ratio favors one introduction. The abstract states the reversal but the manuscript provides no equation, table, or step-by-step recalculation of the relevant posteriors from the Pekar et al. model.
  2. To attribute the reversal cleanly to the Bayesian error alone, the manuscript must verify that the re-analysis uses identical sequence data, tree topologies, priors on introduction times, and model structure as the 2022 work. Any implicit data exclusions or altered sampling would undermine the claim that the result follows solely from the correction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our re-analysis of Pekar et al. (2022). We address each major comment below, agreeing that additional explicit material will improve clarity, and indicate the corresponding revisions.

Point-by-point responses
  1. Referee: The central claim requires an explicit side-by-side derivation or numerical comparison showing that the corrected likelihood ratio favors one introduction. The abstract states the reversal but the manuscript provides no equation, table, or step-by-step recalculation of the relevant posteriors from the Pekar et al. model.

    Authors: We agree that an explicit side-by-side derivation and numerical comparison would strengthen the presentation. The revised manuscript will include a new section with the step-by-step recalculation of the posterior probabilities under the original and corrected Bayesian reasoning, together with a table reporting the likelihood ratios and model probabilities for the single-introduction versus two-introduction scenarios. revision: yes

  2. Referee: To attribute the reversal cleanly to the Bayesian error alone, the manuscript must verify that the re-analysis uses identical sequence data, tree topologies, priors on introduction times, and model structure as the 2022 work. Any implicit data exclusions or altered sampling would undermine the claim that the result follows solely from the correction.

    Authors: The re-analysis uses precisely the sequence data, tree topologies, introduction-time priors, and model structure reported in Pekar et al. (2022), with the sole modification being the correction to the Bayesian model-comparison step. The revised methods section will add explicit cross-references to the original supplementary materials and tables to document this identity and rule out any data or sampling changes. revision: yes

Circularity Check

0 steps flagged

No circularity: re-analysis applies external correction to cited prior work

Full rationale

The manuscript re-uses phylogenetic data, tree topologies, and model structure from the externally cited Pekar et al. 2022 paper and applies an independent correction to a claimed Bayesian reasoning error. The central result (reversed likelihood favoring single introduction) is presented as following from that correction without introducing new fitted parameters, self-defined quantities, or load-bearing self-citations. No derivation step reduces by construction to inputs defined within the present paper; the work is self-contained against the external benchmark of the 2022 analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The re-analysis rests on the phylogenetic topology and sequence data from Pekar et al. together with standard Bayesian updating rules; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • standard math · Standard rules of Bayesian probability apply to the comparison of introduction scenarios
    The paper invokes correct application of Bayes' theorem to reverse the original conclusion.

