Modelling multi-cancer screening data to infer on natural history of disease: when can valid, identifiable and precise inference be obtained?
Pith reviewed 2026-06-28 13:13 UTC · model grok-4.3
The pith
Disaggregating multistate models to individual cancer stages increases reliance on external assumptions when fitting multicancer screening data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Based only on MCED trial data, the 5-state model is as robust as the 3-state model, showing slight improvements to convergence and identifiability while maintaining precision for overall MST; in contrast, the 9-state model shows worsened convergence and identifiability, and a significant reduction in the precision of overall MST estimates.
What carries the argument
Multistate models of cancer progression applied to longitudinal screening data, with increasing numbers of states (3, 5, or 9), estimated via Bayesian Markov chain Monte Carlo and assessed via chain convergence, profile likelihood, and estimate precision.
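To make the estimated quantity concrete, the following is a minimal sketch of a progressive 3-state model (no cancer, preclinical detectable cancer, clinical diagnosis). It assumes exponentially distributed sojourn times, so the MST is the reciprocal of the progression rate. The rates, the screen time, and the function name `simulate_three_state` are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

def simulate_three_state(n, onset_rate, progression_rate, seed=0):
    """Simulate onset, sojourn and clinical-diagnosis times for n individuals.

    Assumes a progressive Markov model with exponential waiting times
    (an illustrative assumption, not the paper's data-generating model).
    """
    rng = np.random.default_rng(seed)
    onset = rng.exponential(1.0 / onset_rate, size=n)          # time to preclinical state
    sojourn = rng.exponential(1.0 / progression_rate, size=n)  # time spent preclinical
    return onset, sojourn, onset + sojourn

ONSET_RATE = 0.01        # hypothetical onset rate per year
PROGRESSION_RATE = 0.5   # hypothetical preclinical-to-clinical rate per year

onset, sojourn, clinical = simulate_three_state(200_000, ONSET_RATE, PROGRESSION_RATE)

true_mst = 1.0 / PROGRESSION_RATE   # MST under the exponential assumption
empirical_mst = sojourn.mean()      # only available because this is a simulation

# A screen at a fixed time can only find people who are preclinical at that moment.
screen_time = 5.0
preclinical_at_screen = (onset <= screen_time) & (clinical > screen_time)

print(f"true MST: {true_mst:.2f} years, simulated mean sojourn: {empirical_mst:.2f} years")
print(f"share preclinical at the screen: {preclinical_at_screen.mean():.4f}")
```

In real screening data neither `onset` nor `sojourn` is observed; only screen results and clinical diagnosis times are, which is why the MST has to be inferred through the model.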
If this is right
- Disaggregating natural history models by individual cancer stages increases reliance on external data or assumptions.
- Hierarchical models improve convergence, identifiability, and precision across the disaggregated models.
- Informative prior models can restore performance but shift estimates toward the chosen prior values.
- Explicit identifiability assessments and checks on the influence of external assumptions are needed to support inference for MCED screening evaluations.
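The pull of an informative prior towards its own value can be seen in a toy conjugate calculation. The sketch below assumes, unrealistically, that sojourn times are observed directly and are exponential with rate λ, with a Gamma(shape, rate) prior on λ. Under those assumptions the posterior mean of the MST (1/λ) is (prior_rate + sum of times) / (prior_shape + n - 1), which is a weighted average of the prior mean MST and the sample mean. The data and prior settings are hypothetical.

```python
def posterior_mean_mst(sojourn_times, prior_shape, prior_rate):
    """Posterior mean of 1/lambda for exponential data with a Gamma(shape, rate) prior.

    Toy setting only: real screening data do not provide observed sojourn times.
    Requires prior_shape + n > 1 for the mean to exist.
    """
    n = len(sojourn_times)
    total = sum(sojourn_times)
    return (prior_rate + total) / (prior_shape + n - 1)

data = [1.2, 0.8, 2.5, 1.9, 1.6]     # hypothetical sojourn times in years
data_mean = sum(data) / len(data)    # sample mean MST

# Both priors have prior mean MST = rate / (shape - 1) = 4.0 years.
weak = posterior_mean_mst(data, prior_shape=2.0, prior_rate=4.0)     # weakly informative
strong = posterior_mean_mst(data, prior_shape=11.0, prior_rate=40.0) # strongly informative

print(f"sample mean: {data_mean:.2f}, weak prior: {weak:.2f}, strong prior: {strong:.2f}")
```

With the same prior mean, the more concentrated prior moves the estimate further from the data and closer to the prior value, which is the trade-off the third bullet describes.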
Where Pith is reading between the lines
- Real MCED trial analyses may need to supplement screening data with independent sources such as population incidence rates to support stage-specific natural history estimates.
- Policy questions that do not require stage detail could be answered adequately with the simpler 3-state or 5-state models.
- Applying the same analysis design to published MCED trial data would test directly whether the identifiability patterns observed here appear in practice.
Load-bearing premise
The simulated longitudinal MCED screening datasets accurately capture the key unobserved onset and progression features of real trial data so that the observed differences in convergence, identifiability, and precision generalize beyond the simulations.
What would settle it
If actual multicancer screening trial data allow 9-state models to converge reliably and produce precise overall MST estimates without external priors or assumptions, while 5-state models fail to do so, the simulation results would not hold.
Original abstract
Background: Multistate models (MSMs) applied to screening data can characterise the natural history of cancer and predict "stage-shifts" from screening. However, inferring parameters such as mean sojourn time (MST) is challenging because disease onset is inherently unobserved in these data. This is even more challenging when characterising heterogeneity between cancer types in multicancer early detection (MCED) trial data.
Methods: We used simulated longitudinal MCED screening datasets to evaluate the inferential bounds of MSMs under increasing clinical disaggregation: a 3-state (overall MST), 5-state (early/late stage), and 9-state (stages I-IV) model. Bayesian estimation was performed via Markov chain Monte Carlo. Robustness was assessed through chain convergence, parameter identifiability (via profile likelihood), and precision of estimates. We also explored hierarchical models and the use of informative priors to improve identifiability.
Results: Based only on MCED trial data, many cancer types exhibited inferential challenges. Generally, the 5-state model was as robust as the 3-state model, showing slight improvements to convergence and identifiability while maintaining precision for overall MST. In contrast, the 9-state model showed worsened convergence and identifiability, and a significant reduction in the precision of overall MST estimates. Hierarchical models improved performance, as did informative prior models, but the latter introduced bias towards the prior values.
Conclusions: While disaggregating natural history models by individual cancer stages is desirable for policy, these higher-dimensional models show a greater reliance on external data/assumptions. We recommend explicit identifiability assessments and assessments of the influence of external data/assumptions to support inference for MCED screening evaluations.
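Chain convergence is the first of the three robustness criteria named in the abstract. As a rough illustration of how such a check works, the sketch below computes a classic split R-hat from several chains. This is the simplified, non-rank-normalized version and is an assumption for illustration; the abstract does not state which diagnostic or threshold the authors used. The synthetic chains are hypothetical.

```python
import numpy as np

def split_rhat(chains):
    """Classic split R-hat for an array of shape (n_chains, n_draws).

    Simplified version without rank normalization; illustrative only.
    """
    chains = np.asarray(chains, dtype=float)
    half = chains.shape[1] // 2
    # Split every chain in half so that within-chain drift also inflates R-hat.
    splits = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = splits.shape[1]
    within = splits.var(axis=1, ddof=1).mean()      # W: mean within-sequence variance
    between = n * splits.mean(axis=1).var(ddof=1)   # B: between-sequence variance
    var_plus = (n - 1) / n * within + between / n   # pooled variance estimate
    return float(np.sqrt(var_plus / within))

rng = np.random.default_rng(1)
mixed = rng.normal(0.0, 1.0, size=(4, 1000))                 # chains sampling the same target
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])       # two chains stuck in another mode

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
print(f"well-mixed chains: {rhat_mixed:.3f}, stuck chains: {rhat_stuck:.3f}")
```

Values close to 1 indicate that the chains agree; values well above 1, as in the second case, are the kind of signal a non-identifiable higher-dimensional model tends to produce.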
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper conducts a simulation study of multistate models fitted via MCMC to longitudinal MCED screening data, comparing inferential performance across a 3-state model (overall MST), 5-state model (early/late), and 9-state model (stages I-IV). It reports that the 9-state model exhibits poorer chain convergence, weaker profile-likelihood identifiability, and reduced precision for overall MST relative to the lower-dimensional models; hierarchical random-effects structures improve convergence and identifiability while informative priors improve precision at the cost of bias. The central conclusion is that greater clinical disaggregation increases reliance on external data or assumptions.
Significance. If the simulation design reproduces the key unobserved-onset and stage-progression bottlenecks present in real MCED trials, the results supply concrete, quantitative guidance on the identifiability–complexity trade-off that is directly relevant to ongoing MCED evaluations. The explicit use of profile likelihood alongside MCMC diagnostics and the demonstration that hierarchical models can partially restore performance are methodological strengths that could inform future trial analyses.
major comments (2)
- [Methods] Methods (simulation design): The data-generating process is not reported as having been calibrated or cross-validated against empirical observables from existing screening studies (e.g., stage-shift distributions or published sojourn-time bounds). Because the central claim, that 9-state models inherently exhibit greater external-data dependence, rests on the simulations reproducing the identifiability bottlenecks of real MCED data, this omission is load-bearing for the generalization.
- [Results] Results (profile-likelihood and precision comparisons): The manuscript does not report the exact number of profile points evaluated, the parameter ranges explored, or the quantitative threshold used to declare “non-identifiability” for the 9-state model; without these details it is difficult to judge whether the reported degradation in the 9-state model is robust to reasonable changes in the simulation design.
minor comments (1)
- [Abstract] The abstract states that “many cancer types exhibited inferential challenges” but does not indicate how many cancer types were simulated or whether results are pooled or reported per type; adding this information would improve clarity.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which highlight important aspects of simulation design and reporting. We respond to each major comment below and indicate planned revisions.
- Referee: [Methods] Methods (simulation design): The data-generating process is not reported as having been calibrated or cross-validated against empirical observables from existing screening studies (e.g., stage-shift distributions or published sojourn-time bounds). Because the central claim, that 9-state models inherently exhibit greater external-data dependence, rests on the simulations reproducing the identifiability bottlenecks of real MCED data, this omission is load-bearing for the generalization.
  Authors: We agree that formal calibration against empirical observables from screening studies would strengthen claims about generalization to real MCED trials. Our simulation parameters were selected from ranges reported in the published literature on sojourn times and stage distributions to capture the key structural features of unobserved onset and stage progression; however, no explicit cross-validation step was performed. In revision we will expand the Methods section to document the literature sources and parameter ranges used, and add a dedicated limitations paragraph discussing the implications for external validity. We believe the qualitative finding that higher-dimensional models increase reliance on external information remains robust under the simulated conditions that reproduce the core identifiability bottlenecks, but we accept the referee’s point that calibration would improve the manuscript.
  Revision: partial
- Referee: [Results] Results (profile-likelihood and precision comparisons): The manuscript does not report the exact number of profile points evaluated, the parameter ranges explored, or the quantitative threshold used to declare “non-identifiability” for the 9-state model; without these details it is difficult to judge whether the reported degradation in the 9-state model is robust to reasonable changes in the simulation design.
  Authors: We thank the referee for noting this reporting gap. The profile-likelihood analyses evaluated the likelihood on a grid while optimizing remaining parameters, but the precise grid size, explored ranges, and numerical criterion for declaring non-identifiability (e.g., flatness within a tolerance) were not stated. In the revised manuscript we will add these details to the Results section and Supplementary Material, including the number of profile points, the parameter ranges, and the threshold applied, together with a brief sensitivity note on grid density.
  Revision: yes
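The reporting items at issue (grid size, parameter range, threshold) can be illustrated with a toy profile-likelihood check. The sketch below is not the paper's multistate model: it assumes a single screen in which the detection probability is test sensitivity times preclinical prevalence, so only the product is informed by the data. The threshold of 1.92 log-likelihood units is the usual 95% chi-square cut-off for one parameter (half of 3.84); the manuscript's own criterion is not stated. All counts, grids, and function names are hypothetical.

```python
import numpy as np

def log_lik(sens, prev, k, n):
    """Binomial log-likelihood (constant term omitted) with p = sens * prev."""
    p = sens * prev
    return k * np.log(p) + (n - k) * np.log1p(-p)

def profile_over_sens(sens_grid, prev_grid, k, n):
    """For each fixed sensitivity, maximise the log-likelihood over prevalence."""
    ll = log_lik(sens_grid[:, None], prev_grid[None, :], k, n)
    return ll.max(axis=1)

def is_identifiable(profile, threshold=1.92):
    """True if the profile drops by more than the threshold on both sides of its peak."""
    drop = profile.max() - profile
    i = int(np.argmax(profile))
    left = drop[:i].max() if i > 0 else 0.0
    right = drop[i + 1:].max() if i < len(profile) - 1 else 0.0
    return bool(left > threshold and right > threshold)

k, n = 200, 10_000                          # hypothetical screen-detected cancers / screened
sens_grid = np.linspace(0.30, 0.95, 14)     # 14 profile points over the explored range

# Case 1: prevalence is a free nuisance parameter, so the profile is flat.
free_prev = np.linspace(1e-4, 1.0, 20_000)
flat_profile = profile_over_sens(sens_grid, free_prev, k, n)

# Case 2: prevalence is fixed from an external source (assumed value 0.04).
fixed_prev = np.array([0.04])
peaked_profile = profile_over_sens(sens_grid, fixed_prev, k, n)

print("free prevalence: identifiable =", is_identifiable(flat_profile))
print("fixed prevalence: identifiable =", is_identifiable(peaked_profile))
```

The contrast between the two cases mirrors the review's main point: the parameter becomes identifiable only once external information fixes the nuisance parameter. Rerunning with a coarser or narrower `sens_grid` is one way to produce the sensitivity note on grid density that the authors promise.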
Circularity Check
No circularity: simulation-based identifiability assessment is self-contained
The paper generates simulated longitudinal MCED datasets and applies Bayesian MCMC estimation to 3/5/9-state multistate models, then evaluates convergence, profile likelihood identifiability, and precision directly on those generated data. No fitted parameter is renamed as a prediction, no equation reduces to another by construction, and no load-bearing premise rests on self-citation chains or imported uniqueness theorems. The conclusion that 9-state models exhibit greater external-data reliance follows from the external simulation metrics rather than from any internal redefinition of inputs as outputs. This is the standard non-circular structure of a simulation study assessing inferential limits.
Axiom & Free-Parameter Ledger
free parameters (1)
- mean sojourn time
axioms (1)
- domain assumption: Multistate model structure accurately represents unobserved cancer onset and progression in screening data