Continuous-Time Bayesian Networks with Structured Shrinkage Priors for Modelling Multimorbidity Trajectories in Large-Scale Electronic Health Records
Pith reviewed 2026-07-03 01:11 UTC · model grok-4.3
The pith
A continuous-time Bayesian network with a spike-and-slab prior recovers two main multimorbidity disease modules from longitudinal EHR data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The spike-and-slab structured shrinkage prior within the continuous-time Bayesian network framework achieves superior network recovery and false-discovery control compared to other sparsity-inducing priors, and when applied to UK Biobank primary care records it identifies two dominant disease modules: a cardiometabolic cluster centred on diabetes and an inflammatory cluster linking respiratory and atopic conditions.
What carries the argument
Order-dependent shrinkage priors on the interaction parameters of a continuous-time Bayesian network that models disease transition intensities as functions of existing conditions and covariates.
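To make the idea of order-dependent shrinkage concrete, the following is a minimal sketch of a point-mass spike-and-slab prior whose slab width and inclusion probability both shrink as the interaction order grows. The geometric decay rule and the names `slab_scale`, `inclusion_prob`, `base_scale`, `base_prob`, and `decay` are illustrative assumptions; the paper's actual hyperparameterisation is not given on this page.

```python
import random

def slab_scale(order, base_scale=1.0, decay=0.5):
    """Prior standard deviation for an effect of the given interaction order.

    order=1 is a main effect, order=2 a pairwise interaction, and so on.
    The geometric decay is an assumed form, not the paper's stated rule.
    """
    if order < 1:
        raise ValueError("order must be >= 1")
    return base_scale * decay ** (order - 1)

def inclusion_prob(order, base_prob=0.5, decay=0.5):
    """Prior probability that an effect of this order is non-zero (assumed form)."""
    if order < 1:
        raise ValueError("order must be >= 1")
    return base_prob * decay ** (order - 1)

def draw_effect(order, rng):
    """Draw one coefficient from a point-mass spike-and-slab prior."""
    if rng.random() < inclusion_prob(order):
        # Slab: a zero-mean Gaussian that narrows with interaction order.
        return rng.gauss(0.0, slab_scale(order))
    return 0.0  # Spike: the effect is excluded exactly.

rng = random.Random(0)
draws = {k: [draw_effect(k, rng) for _ in range(2000)] for k in (1, 2, 3)}
nonzero_share = {k: sum(v != 0.0 for v in d) / len(d) for k, d in draws.items()}
```

Under these assumed settings, main effects are included about half the time, while third-order terms are included rarely and, when included, are drawn from a much narrower slab.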
If this is right
- The spike-and-slab prior outperforms continuous shrinkage priors for hard variable selection in network recovery.
- The selected model reveals clinically interpretable modules that can guide understanding of multimorbidity progression.
- Main effects remain interpretable while higher-order interactions are penalised.
- The continuous-time Markov assumption allows the model to handle irregular observation times in EHR data.
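The last point can be illustrated with a deliberately reduced example: a single condition with a constant onset rate, observed only at irregularly spaced visits. This is a sketch under those simplifying assumptions, not the paper's likelihood; the function names and the example visit record are hypothetical.

```python
import math

def onset_prob(rate, gap):
    """P(condition appears within `gap` years | absent at start), constant rate."""
    return 1.0 - math.exp(-rate * gap)

def panel_loglik(rate, visits):
    """Log-likelihood of one patient's visit record for a single condition.

    `visits` is a list of (time_in_years, has_condition) pairs sorted by time,
    starting from a condition-free visit. Onset is treated as absorbing.
    """
    loglik = 0.0
    for (t0, s0), (t1, s1) in zip(visits, visits[1:]):
        if s0:
            break  # Already diagnosed: later visits add no information.
        p = onset_prob(rate, t1 - t0)
        # Each gap contributes according to its own length, however irregular.
        loglik += math.log(p) if s1 else math.log1p(-p)
    return loglik

# Hypothetical record: visits at ages 40, 40.5, 43.2, and 44, onset seen at 44.
visits = [(40.0, False), (40.5, False), (43.2, False), (44.0, True)]
ll = panel_loglik(0.1, visits)
```

Because the transition probability depends only on the elapsed time between visits, unequal gaps need no imputation or resampling onto a regular grid.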
Where Pith is reading between the lines
- Similar structured priors could be tested in other longitudinal settings with high-dimensional interactions.
- The two identified clusters suggest that interventions targeting diabetes might have broader effects on cardiometabolic conditions.
- Future work could extend the model to include more conditions or genetic covariates.
- Validation in independent cohorts would strengthen the generalizability of the disease modules.
Load-bearing premise
The longitudinal EHR data can be adequately represented by the continuous-time Markov assumption and the chosen set of ten conditions without substantial bias from missing records, irregular times, or unmeasured confounders.
What would settle it
Applying the same model to an independent cohort with the same ten conditions and checking whether the same two disease modules are recovered with comparable accuracy.
Original abstract
Multiple long-term conditions (MLTCs) arise through complex, time-dependent interactions among diseases, yet existing methods often struggle to jointly model disease progression, multimorbidity networks, and high-dimensional risk factors. We propose a structured Bayesian continuous-time Bayesian network (CTBN) framework for learning directed disease-dependency networks from longitudinal electronic health records. The model allows disease transition intensities to depend on existing conditions, pairwise disease interactions, and exogenous covariates. To control the combinatorial growth of interaction parameters, we introduce order-dependent shrinkage priors that increasingly penalise higher-order effects while preserving clinically interpretable main effects. We compare four sparsity-inducing priors, spike-and-slab, structured normal, Bayesian LASSO, and regularised horseshoe through extensive simulation studies. Across multiple data-generating scenarios, the spike-and-slab prior achieved the best network recovery, variable-selection accuracy, and false-discovery control, while continuous shrinkage priors were less effective for hard variable selection. The proposed framework was applied to UK Biobank primary care records, focusing on data from 33,558 participants who were free of the ten selected most prevalent conditions at age 40 and who subsequently developed at least one of these conditions during the follow-up period. The selected spike-and-slab model identified two dominant disease modules: a cardiometabolic cluster centred on diabetes and an inflammatory cluster linking respiratory and atopic conditions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a continuous-time Bayesian network (CTBN) model equipped with order-dependent structured shrinkage priors to infer directed disease dependency networks from longitudinal electronic health records (EHR). Through simulations, the spike-and-slab prior is shown to outperform structured normal, Bayesian LASSO, and regularised horseshoe priors in network recovery, variable selection, and false discovery control. The method is applied to a UK Biobank cohort of 33,558 participants, identifying a cardiometabolic module centered on diabetes and an inflammatory module involving respiratory and atopic conditions.
Significance. If the modeling assumptions hold, this provides a scalable Bayesian approach to jointly modeling disease progression and high-dimensional multimorbidity networks in EHR data. The extensive simulation comparisons across multiple data-generating scenarios and the large-scale real-data application are strengths that could advance methodology for longitudinal health records analysis.
major comments (3)
- [§4 (Simulation Studies)] The data-generating processes do not replicate the irregular observation times, right-censoring, or cohort selection rule (participants disease-free at age 40 who later develop at least one condition) used in the UK Biobank application; this limits the relevance of the reported superiority of spike-and-slab for network recovery to the real-data setting.
- [§5 (Real-Data Application)] No sensitivity analyses are presented for the continuous-time Markov assumption, the effects of missing records, or unmeasured confounders on the recovered network and the two identified modules (cardiometabolic and inflammatory).
- [Abstract and §3 (Model Specification)] The parameterization of transition intensities (dependence on existing conditions, pairwise interactions, and covariates) is not detailed with explicit equations or identifiability conditions, which is load-bearing for interpreting the order-dependent shrinkage and the selected modules.
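The first major comment asks for simulations that mimic right-censoring and the cohort selection rule. A minimal sketch of what such a data-generating process could look like is given below, using a Gillespie-style simulation of condition onsets. The two-condition setup, the rate values, and the 30-year follow-up window are hypothetical choices for illustration, not the paper's simulation design.

```python
import math
import random

def simulate_patient(baseline, beta, censor_time, rng):
    """Simulate onset times for each condition up to `censor_time`.

    baseline[j] is the log baseline intensity of condition j;
    beta[i][j] is the log multiplicative effect of having i on the onset of j.
    Returns {condition index: onset time}; conditions missing from the dict
    are right-censored at `censor_time`.
    """
    n = len(baseline)
    present = [False] * n
    onsets = {}
    t = 0.0
    while not all(present):
        # Intensity of each absent condition given the current disease state.
        rates = [
            0.0 if present[j] else math.exp(
                baseline[j] + sum(beta[i][j] for i in range(n) if present[i]))
            for j in range(n)
        ]
        t += rng.expovariate(sum(rates))  # Waiting time to the next onset.
        if t >= censor_time:
            break  # Follow-up ends before the next onset: right-censoring.
        j = rng.choices(range(n), weights=rates)[0]
        present[j] = True
        onsets[j] = t
    return onsets

rng = random.Random(1)
baseline = [math.log(0.05), math.log(0.02)]  # Hypothetical yearly rates.
beta = [[0.0, 1.0], [0.0, 0.0]]              # Condition 0 accelerates condition 1.
cohort = [simulate_patient(baseline, beta, 30.0, rng) for _ in range(500)]
# Mimic the selection rule: keep patients who develop at least one condition.
selected = [p for p in cohort if p]
```

Fitting the competing priors to `selected` rather than `cohort` would show whether conditioning on at least one onset distorts the recovered network.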
minor comments (2)
- [Abstract] The abstract would benefit from reporting quantitative simulation metrics (e.g., AUC or F1 for network recovery) rather than qualitative statements of superiority.
- [§3 (Model Specification)] Clarify the exact form of the order-dependent shrinkage prior (hyperparameters and how higher-order terms are penalised) with an early equation reference.
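As an illustration of the quantitative metrics requested in the first minor comment, the sketch below computes precision, recall, false discovery rate, and F1 for a recovered directed edge set against a known truth. The function name `recovery_metrics` and the toy edge sets are hypothetical; the paper's own metric definitions may differ.

```python
def recovery_metrics(true_edges, selected_edges):
    """Edge-level recovery metrics for a directed network.

    Both arguments are iterables of (parent, child) pairs.
    """
    true_edges, selected_edges = set(true_edges), set(selected_edges)
    tp = len(true_edges & selected_edges)   # Correctly selected edges.
    fp = len(selected_edges - true_edges)   # Selected but not real.
    fn = len(true_edges - selected_edges)   # Real but missed.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    fdr = fp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "fdr": fdr, "f1": f1}

# Toy example with made-up node indices, not results from the paper.
truth = {(0, 1), (1, 2), (2, 3)}
recovered = {(0, 1), (1, 2), (3, 0)}
metrics = recovery_metrics(truth, recovered)
```

Here two of the three selected edges are real and one true edge is missed, so precision, recall, and F1 all equal 2/3 and the false discovery rate is 1/3.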
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We address each major comment point by point below.
- Referee: The data-generating processes do not replicate the irregular observation times, right-censoring, or cohort selection rule (participants disease-free at age 40 who later develop at least one condition) used in the UK Biobank application; this limits the relevance of the reported superiority of spike-and-slab for network recovery to the real-data setting.
  Authors: The simulation studies are designed to isolate and compare the performance of the four sparsity-inducing priors under controlled data-generating processes that vary in network structure, interaction order, and sample size. This design enables clear assessment of network recovery, variable selection, and false-discovery control without confounding by real-data complexities. The UK Biobank application separately employs the actual irregular observation times, right-censoring, and cohort selection rule described in §5. We will revise §4 to explicitly state the purpose of the simulations and note their limitations in replicating the real-data process.
  Revision: yes
- Referee: No sensitivity analyses are presented for the continuous-time Markov assumption, the effects of missing records, or unmeasured confounders on the recovered network and the two identified modules (cardiometabolic and inflammatory).
  Authors: We agree that sensitivity analyses would be valuable. The continuous-time Markov assumption is foundational to the CTBN framework, and the large scale of the UK Biobank application (33,558 participants) limits the feasibility of extensive re-analyses. We will add a new subsection in §5 providing a qualitative discussion of potential impacts from assumption violations, missing records, and unmeasured confounders, including any feasible robustness checks that can be performed without prohibitive computation.
  Revision: partial
- Referee: The parameterization of transition intensities (dependence on existing conditions, pairwise interactions, and covariates) is not detailed with explicit equations or identifiability conditions, which is load-bearing for interpreting the order-dependent shrinkage and the selected modules.
  Authors: We thank the referee for this observation. Section 3 specifies that transition intensities depend on existing conditions, pairwise interactions, and covariates, with order-dependent shrinkage applied to penalize higher-order terms. We will expand §3 to include the explicit intensity parameterization λ_{i→j}(t) = exp(baseline + main effects + interaction terms + covariate effects) together with the identifiability constraints used for the shrinkage priors and module interpretation.
  Revision: yes
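The schematic intensity in the last response, exp(baseline + main effects + interaction terms + covariate effects), can be written out as a small function. This is a sketch of one plausible log-linear reading; the argument names (`main`, `pairwise`, `gamma`), the indexing conventions, and the numeric values are all assumptions, since the page does not reproduce the paper's equations.

```python
import math

def log_intensity(j, state, covariates, baseline, main, pairwise, gamma):
    """Log transition intensity for the onset of condition j.

    state:      list of 0/1 indicators of currently present conditions
    baseline:   baseline[j] is the log baseline intensity of condition j
    main:       main[i][j] is the effect of existing condition i on onset of j
    pairwise:   dict mapping (i, k, j) with i < k to the extra effect on j
                when conditions i and k are both present
    gamma:      gamma[j] is the list of covariate coefficients for condition j
    """
    present = [i for i, s in enumerate(state) if s and i != j]
    eta = baseline[j]
    eta += sum(main[i][j] for i in present)                     # Main effects.
    eta += sum(pairwise.get((i, k, j), 0.0)                     # Interactions.
               for a, i in enumerate(present) for k in present[a + 1:])
    eta += sum(g * x for g, x in zip(gamma[j], covariates))     # Covariates.
    return eta

# Hypothetical three-condition example: onset of condition 2 given 0 and 1.
baseline = [-3.0, -3.0, -3.0]
main = [[0.0, 0.0, 0.4], [0.0, 0.0, 0.3], [0.0, 0.0, 0.0]]
pairwise = {(0, 1, 2): 0.2}
gamma = [[0.0], [0.0], [0.1]]
eta = log_intensity(2, [1, 1, 0], [1.0], baseline, main, pairwise, gamma)
intensity = math.exp(eta)
```

Writing the linear predictor this way makes it clear which coefficients an order-dependent prior would shrink hardest: the entries of `pairwise`, and any higher-order analogues, rather than the entries of `main`.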
Circularity Check
No significant circularity in derivation chain
The paper proposes a structured Bayesian CTBN framework with order-dependent shrinkage priors, compares four priors via simulation studies under controlled data-generating processes, and applies the selected model to an external UK Biobank cohort. No equations, model definitions, or reported results (network recovery, variable selection, or identified disease modules) reduce by construction to quantities defined solely by fitted parameters or self-citations within the paper. The simulation benchmarks are independent of the real-data application, and the central empirical claims are outputs from fitting the model to held-out longitudinal records rather than tautological restatements of inputs. This is a standard modeling proposal with external validation; the derivation chain is self-contained against the stated assumptions and benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- prior hyperparameters for spike-and-slab, structured normal, Bayesian LASSO, regularised horseshoe
axioms (1)
- domain assumption: Disease progression follows a continuous-time Markov process whose intensities depend on current state, pairwise interactions, and covariates.