
arxiv: 2605.20692 · v1 · pith:U3YCTHUSnew · submitted 2026-05-20 · 📊 stat.ME · q-bio.PE · q-bio.QM · stat.AP

Inferring infectiousness: a joint model of the within-host viral kinetics of SARS-CoV-2

Pith reviewed 2026-05-21 02:55 UTC · model grok-4.3

classification 📊 stat.ME · q-bio.PE · q-bio.QM · stat.AP
keywords SARS-CoV-2 · viral kinetics · infectiousness · Bayesian joint model · within-host dynamics · PCR · viral culture · isolation risk

The pith

A Bayesian joint model, fitted to PCR, viral culture, and symptom data, infers infectious virus shedding trajectories for individuals who contribute only PCR data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a Bayesian joint model using longitudinal data on PCR, viral culture, and symptom onset from five cohorts covering about 2,000 SARS-CoV-2 infections. This model connects the different proxies to a shared underlying process of viral kinetics. By doing so, it infers the full trajectory of infectious virus shedding even for individuals who provide only PCR measurements. A sympathetic reader would care because infectious virus shedding is the most direct correlate of infectiousness, allowing derived estimates of transmission risk, isolation duration, and real-time updates that single proxies cannot provide on their own.

Core claim

Using data from five prospective cohorts with longitudinal measurements on multiple proxies for approximately 2,000 infections, the authors construct a Bayesian joint model of within-host viral kinetics. Modeling the joint distribution permits inference of the trajectory of infectious virus shedding for individuals who contribute only PCR data. It also yields derived quantities: the population-level probability and expected duration of ongoing infectiousness as a function of time since diagnosis, stratified by variant, vaccination status, and infection history; the residual risk of releasing an individual from isolation; and personalized real-time estimates of infectiousness that are updated sequentially as new test results arrive.

What carries the argument

A Bayesian joint model of within-host viral kinetics that links PCR viral load, viral culture positivity, and symptom onset to a common underlying infectious virus shedding trajectory.
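To make the idea of a shared latent process concrete, here is a minimal Python sketch in which one latent trajectory generates both a noisy PCR reading and a culture result. Everything in it is an assumption chosen for illustration: the piecewise-linear shape, the logistic culture link, the function names, and all parameter values. None of it is taken from the paper's model specification.

```python
import math
import random

def log10_viral_load(t, peak_time=0.0, peak=8.0, up_days=3.0, down_days=8.0):
    """Hypothetical piecewise-linear latent log10 load (t in days), floored at 0."""
    if t <= peak_time:
        value = peak - peak * (peak_time - t) / up_days
    else:
        value = peak - peak * (t - peak_time) / down_days
    return max(value, 0.0)

def culture_positive_prob(log10_load, midpoint=6.0, slope=1.5):
    """Hypothetical logistic link from the latent load to culture positivity."""
    return 1.0 / (1.0 + math.exp(-slope * (log10_load - midpoint)))

def simulate_day(t, rng, pcr_noise_sd=0.5):
    """Draw both proxies for day t from the same latent value."""
    latent = log10_viral_load(t)
    pcr = latent + rng.gauss(0.0, pcr_noise_sd)               # noisy PCR reading
    culture = rng.random() < culture_positive_prob(latent)    # binary culture result
    return pcr, culture

rng = random.Random(0)  # fixed seed so the toy draws are reproducible
observations = [(t, *simulate_day(t, rng)) for t in range(-3, 9)]
for t, pcr, culture in observations:
    print(t, round(pcr, 2), culture)
```

Because both proxies depend on the same latent value, observing one carries information about the other. That dependence is what a joint model can exploit when culture data are missing.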

If this is right

  • Enables inference of infectious virus shedding trajectories using only PCR data for new individuals.
  • Yields population-level estimates of the probability and duration of ongoing infectiousness stratified by variant, vaccination status, and infection history.
  • Supports calculation of residual risk when deciding to release an individual from isolation based on time since diagnosis.
  • Allows sequential updating of personalized infectiousness estimates as new test results arrive.
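The sequential-updating consequence can be sketched as a grid-based Bayes update. This is a toy under stated assumptions, not the authors' procedure: the candidate grid `PEAK_TIMES`, the piecewise-linear `trajectory`, the Gaussian PCR noise, and the logistic culture link are all hypothetical choices made for illustration.

```python
import math

PEAK_TIMES = list(range(-4, 5))  # assumed grid of candidate peak days relative to diagnosis

def trajectory(t, peak_time, peak=8.0, up_days=3.0, down_days=8.0):
    """Assumed piecewise-linear log10 viral load, floored at zero."""
    if t <= peak_time:
        value = peak - peak * (peak_time - t) / up_days
    else:
        value = peak - peak * (t - peak_time) / down_days
    return max(value, 0.0)

def update(weights, t, observed, noise_sd=0.5):
    """One Bayes step: reweight each candidate by a Gaussian PCR likelihood."""
    scored = []
    for w, p in zip(weights, PEAK_TIMES):
        z = (observed - trajectory(t, p)) / noise_sd
        scored.append(w * math.exp(-0.5 * z * z))
    total = sum(scored)
    return [s / total for s in scored]

def prob_culture_positive(weights, t, midpoint=6.0, slope=1.5):
    """Posterior-weighted culture positivity under an assumed logistic link."""
    return sum(
        w / (1.0 + math.exp(-slope * (trajectory(t, p) - midpoint)))
        for w, p in zip(weights, PEAK_TIMES)
    )

weights = [1.0 / len(PEAK_TIMES)] * len(PEAK_TIMES)  # flat prior over the grid
# Two synthetic PCR results generated from a "true" peak on day 1 (toy input, not data).
for day, log10_load in [(0, trajectory(0, 1)), (2, trajectory(2, 1))]:
    weights = update(weights, day, log10_load)
    print(day, round(prob_culture_positive(weights, day), 3))
```

Each new test result narrows the set of plausible trajectories, and the infectiousness estimate is recomputed from the updated weights. A full implementation would update all kinetic parameters rather than a single peak time.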

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same joint-modeling strategy could be applied to other pathogens where direct infectiousness measures are hard to obtain but several proxies are routinely collected.
  • Public health guidelines for isolation periods might shift from fixed rules toward model-based, test-updated risk estimates.
  • Cohort studies would gain value by routinely collecting multiple proxies rather than relying on one measure alone.

Load-bearing premise

The relationships among PCR, viral culture, and symptom data capture the underlying infectious virus shedding trajectory accurately enough without major bias from the sampled cohorts or unmeasured factors.

What would settle it

Apply the fitted model to predict infectious shedding in a new validation cohort that has both PCR and viral culture data, then check whether the inferred shedding trajectories match the observed culture results.
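One way such a check could be scored is with a Brier score and a coarse calibration table that compare predicted culture-positivity probabilities against observed culture results. The sketch below is illustrative only: the function names are hypothetical and the six prediction and outcome pairs are placeholders, not results from the paper or any cohort.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)

def calibration_table(predicted, observed, n_bins=5):
    """Group predictions into equal-width bins and compare the mean prediction
    with the observed positive fraction in each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, observed):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, y))
    rows = []
    for index, members in enumerate(bins):
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        rows.append((index, len(members), mean_pred, frac_pos))
    return rows

# Placeholder values for illustration only.
predicted = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # model-implied P(culture positive)
observed = [1, 1, 0, 0, 0, 0]                # withheld culture results

score = brier_score(predicted, observed)
rows = calibration_table(predicted, observed)
print(round(score, 4))
for row in rows:
    print(row)
```

A well-calibrated model would show mean predictions close to the observed positive fractions in every bin. A low Brier score alone does not guarantee this, so the two summaries are best read together.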

Figures

Figures reproduced from arXiv: 2605.20692 by Ajit Lalvani, Christopher B. Boyer, Jakob Jonnerby, Marc Lipsitch, Seran Hakki, Stephen M. Kissler.

Figure 1: Overview of the joint model for within-host viral kinetics.
Figure 3: Population-level viral kinetics. (A) Population-mean trajectories for viral RNA (blue) and infectious virus (red) with 50% (dark shading) and 80% (light shading) credible intervals, computed from 200 posterior draws. Trajectories are clamped at the assay-specific limits of detection (dashed horizontal lines). LFD positivity probability and daily symptom onset hazard are shown as shaded tiles above the traj…
Figure 7: Posterior predictive probability of remaining culture-positive, LFD-positive, and …
Figure 8: Policy-relevant quantities from the joint posterior predictive distribution.
Original abstract

During an infectious disease outbreak, providing accurate answers to policy questions about transmission requires a detailed model of the natural history of infectiousness. Unfortunately, direct measures of infectiousness are generally unavailable. Instead, we often rely on indirect proxies, such as viral load measured by PCR or antigen tests, viral culture to detect replication-competent virus, or symptom onset, each of which reflects different aspects of viral dynamics or host response. However, these proxies vary in terms of the ease of collection, scalability, and their relationship to viral shedding and therefore underlying infectiousness. Here, we use data from five prospective, densely sampled cohorts with longitudinal data on multiple proxies of viral shedding for approximately 2,000 infections to develop a Bayesian joint model for the within-host viral kinetics of SARS-CoV-2 infection. Modeling the joint distribution allows us to infer the trajectory of infectious virus shedding -- the most direct correlate of infectiousness -- for individuals who contribute only PCR data, and to compute derived quantities that are inaccessible from any single proxy alone. These include the population-level probability and expected duration of ongoing infectiousness as a function of time since diagnosis, stratified by variant, vaccination status, and infection history; the residual risk of releasing an individual from isolation; and personalized, real-time estimates of infectiousness that are sequentially updated as new test results become available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript develops a Bayesian joint model for within-host viral kinetics of SARS-CoV-2 using longitudinal data on multiple proxies (PCR, viral culture, symptoms) from five prospective cohorts with approximately 2,000 infections. The central claim is that modeling the joint distribution enables inference of infectious virus shedding trajectories—the most direct correlate of infectiousness—for individuals providing only PCR data, along with derived quantities such as population-level probability and expected duration of ongoing infectiousness (stratified by variant, vaccination status, and infection history), residual isolation risk, and real-time personalized infectiousness estimates updated with new test results.

Significance. If the joint model recovers the conditional distribution of infectious shedding given PCR with low bias, the work would provide a practical advance for estimating transmission risk from the most scalable proxy (PCR), supporting better-informed isolation policies and real-time risk assessment during outbreaks. The use of multiple densely sampled cohorts and a Bayesian framework to handle heterogeneous data types represents a strength in scale and flexibility.

major comments (2)
  1. [Methods / Results] The central claim that the joint model can reliably infer infectious shedding trajectories from PCR data alone (Abstract and §3) requires that the learned correlations among proxies generalize without major bias. However, the manuscript provides no external validation set in which culture data are observed but withheld during fitting to directly test the accuracy of the inferred shedding conditional on PCR.
  2. [Results] §4 (or equivalent results section on derived quantities): The reported population-level probabilities of ongoing infectiousness and residual isolation risk rest on the assumption that the five cohorts are representative; no sensitivity analyses are shown that vary cohort weights, correlation priors, or account for potential unmeasured factors such as variant-specific immune effects or differences in sampling frequency.
minor comments (2)
  1. [Abstract] The abstract states 'approximately 2,000 infections' but the methods should report the exact sample size, breakdown by cohort, and proportion of observations with culture versus PCR-only data for transparency.
  2. [Methods] Notation for the latent shedding process and its link to observed proxies should be clarified with an explicit diagram or equation reference early in the methods to aid readers in following the joint distribution specification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive review of our manuscript. We address each of the major comments below and outline the revisions we will make to address them.

Point-by-point responses
  1. Referee: [Methods / Results] The central claim that the joint model can reliably infer infectious shedding trajectories from PCR data alone (Abstract and §3) requires that the learned correlations among proxies generalize without major bias. However, the manuscript provides no external validation set in which culture data are observed but withheld during fitting to directly test the accuracy of the inferred shedding conditional on PCR.

    Authors: We agree that an explicit external validation would strengthen confidence in the model's ability to infer infectious shedding from PCR alone. Although the current manuscript relies on the joint modeling framework and posterior predictive checks for validation, we will add a hold-out validation analysis in the revised version. Specifically, we will withhold culture data for a random subset of infections during model fitting and evaluate the accuracy of the predicted infectious virus trajectories conditional on the remaining proxies.

    Revision: yes

  2. Referee: [Results] §4 (or equivalent results section on derived quantities): The reported population-level probabilities of ongoing infectiousness and residual isolation risk rest on the assumption that the five cohorts are representative; no sensitivity analyses are shown that vary cohort weights, correlation priors, or account for potential unmeasured factors such as variant-specific immune effects or differences in sampling frequency.

    Authors: We acknowledge the importance of demonstrating robustness to cohort composition and modeling assumptions. In the revised manuscript, we will include sensitivity analyses that (i) reweight the contributions of the five cohorts, (ii) vary the priors on the correlation parameters between proxies, and (iii) incorporate additional random effects or stratifications to explore potential unmeasured factors such as variant-specific immune responses and sampling frequency differences. We will report how these affect the key derived quantities.

    Revision: yes
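The cohort-reweighting idea in item (i) can be illustrated with a very simple summary-level sketch. It assumes, purely for illustration, that each cohort yields its own estimate of a derived quantity, and it recomputes a weighted pool while dropping one cohort at a time. The cohort labels, estimates, and function names are hypothetical placeholders. A real sensitivity analysis would refit the joint model rather than reweight summaries.

```python
def pooled_estimate(cohort_estimates, weights):
    """Weighted average of per-cohort estimates over the cohorts present in `weights`."""
    total = sum(weights.values())
    return sum(cohort_estimates[name] * w for name, w in weights.items()) / total

def leave_one_out(cohort_estimates, weights):
    """Pooled estimate recomputed with each cohort dropped in turn."""
    results = {}
    for dropped in cohort_estimates:
        kept = {name: w for name, w in weights.items() if name != dropped}
        results[dropped] = pooled_estimate(cohort_estimates, kept)
    return results

# Placeholder per-cohort estimates of, say, P(still infectious on a given day).
cohort_estimates = {"A": 0.30, "B": 0.25, "C": 0.40, "D": 0.35, "E": 0.20}
equal_weights = {name: 1.0 for name in cohort_estimates}

baseline = pooled_estimate(cohort_estimates, equal_weights)
sensitivity = leave_one_out(cohort_estimates, equal_weights)
print(round(baseline, 3))
for name, value in sensitivity.items():
    print(name, round(value, 3), round(value - baseline, 3))
```

Large shifts when a single cohort is dropped would suggest that the pooled quantity leans heavily on that cohort.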

Circularity Check

0 steps flagged

Bayesian joint model fitted to external cohort data shows no circular derivation

full rationale

The paper constructs a Bayesian joint model for within-host viral kinetics by fitting to longitudinal multi-proxy data from five independent prospective cohorts totaling ~2000 infections. The central step is learning the joint distribution of observed proxies (PCR, culture, symptoms) to impute the latent infectious shedding trajectory for PCR-only individuals. This is a standard conditional inference procedure whose outputs are determined by the fitted parameters and the data, not by any self-definitional loop, fitted-input-as-prediction, or self-citation chain that reduces the result to its own inputs. No equations are presented that equate a claimed prediction to a quantity defined from the same fit; the model assumptions and cohort representativeness are external benchmarks that can be checked independently. The derivation is therefore self-contained against the supplied data.
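The conditional inference described here can be written schematically as a hierarchical posterior predictive. The notation is an assumption introduced for illustration and does not come from the paper: $v_i(t)$ is the infectious virus trajectory of individual $i$, $y_i^{\mathrm{PCR}}$ that individual's PCR measurements, $\theta_i$ individual-level kinetic parameters, $\phi$ population-level parameters, and $\mathcal{D}$ the full multi-proxy dataset (including $y_i^{\mathrm{PCR}}$).

```latex
\begin{align}
p\bigl(v_i(t) \mid \mathcal{D}\bigr)
  &= \iint p\bigl(v_i(t) \mid \theta_i\bigr)\,
           p\bigl(\theta_i \mid y_i^{\mathrm{PCR}}, \phi\bigr)\,
           p\bigl(\phi \mid \mathcal{D}\bigr)\,
           \mathrm{d}\theta_i \,\mathrm{d}\phi, \\
p\bigl(\theta_i \mid y_i^{\mathrm{PCR}}, \phi\bigr)
  &\propto p\bigl(y_i^{\mathrm{PCR}} \mid \theta_i\bigr)\,
           p\bigl(\theta_i \mid \phi\bigr).
\end{align}
```

Under this reading, the culture and symptom data from other individuals enter only through $p(\phi \mid \mathcal{D})$. This is why the imputed trajectory is an output of the fit rather than a restatement of its inputs.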

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Model depends on fitting parameters to observational cohort data and domain assumptions about how proxies relate to underlying infectiousness; no invented entities apparent from abstract.

free parameters (1)
  • parameters of within-host viral kinetics trajectories
    Bayesian model requires parameters to describe viral load dynamics and relationships among proxies, fitted to the longitudinal data.
axioms (1)
  • domain assumption: Each proxy reflects different aspects of viral dynamics or host response
    Explicitly stated in abstract as basis for using multiple proxies in joint model.



