Dual Model Deep Learning for Alzheimer Prognostication
Pith reviewed 2026-05-16 20:22 UTC · model grok-4.3
The pith
A dual deep learning model turns one baseline CSF biomarker reading into individualized Alzheimer's prognosis with calibrated uncertainty.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PROGRESS uses a probabilistic trajectory network to forecast individualized cognitive decline trajectories with calibrated uncertainty and pairs it with a deep survival model that estimates time to MCI-to-dementia conversion. Trained on NACC data from over 3000 participants, the combined system outperforms Cox proportional hazards, random survival forests, and gradient boosting while remaining robust under leave-one-center-out validation across heterogeneous sites and assay technologies spanning four decades.
What carries the argument
The dual-model PROGRESS framework: one probabilistic trajectory network for decline paths with uncertainty and one deep survival model for time-to-conversion, both driven by a single baseline CSF signature.
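This review does not reproduce the paper's loss functions, so the following is only a sketch of how two such heads are commonly trained, not the PROGRESS implementation. It assumes a Gaussian negative log-likelihood for the trajectory head and a Cox-style negative partial log-likelihood (Breslow form, no tied event times) for the survival head. All function names are hypothetical.

```python
import math


def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of observation y under N(mu, sigma^2).

    Assumption: the trajectory head outputs a mean and a standard deviation.
    """
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2 * sigma ** 2)


def cox_neg_partial_loglik(times, events, risk_scores):
    """Negative Cox partial log-likelihood (assumes no tied event times).

    times: follow-up time per patient
    events: 1 if conversion was observed, 0 if censored
    risk_scores: model output per patient (log relative hazard)
    """
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        denom = sum(math.exp(r) for t, r in zip(times, risk_scores) if t >= t_i)
        total += risk_scores[i] - math.log(denom)
    return -total
```

Both heads would take the same baseline CSF features as input; in this sketch they simply differ in what they output and in which loss they minimize.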
If this is right
- Clinicians could prioritize patients for disease-modifying therapies at the first visit using risk strata that differ by a factor of seven in conversion rates.
- Probabilistic trajectory outputs allow honest communication of uncertainty rather than point estimates that overstate precision.
- Leave-one-center-out robustness implies the approach can be deployed across sites with varying measurement protocols and historical assay changes.
- Removing the need for repeated visits before prognosis shortens the delay between biomarker measurement and treatment decision.
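As a rough illustration of what a seven-fold difference between risk strata means, the sketch below splits patients into tertiles by predicted risk and compares crude conversion proportions. The tertile split and the function name are assumptions made for illustration, and the calculation ignores censoring, which a real analysis would have to handle.

```python
def conversion_rate_ratio(risk_scores, converted):
    """Compare crude conversion proportions in the top and bottom risk tertiles.

    risk_scores: predicted risk per patient (higher means higher risk)
    converted: 1 if the patient converted to dementia, else 0
    Returns (low_rate, high_rate, ratio).
    """
    if len(risk_scores) < 3:
        raise ValueError("need at least three patients to form tertiles")
    order = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i])
    k = len(order) // 3
    low, high = order[:k], order[-k:]
    low_rate = sum(converted[i] for i in low) / k
    high_rate = sum(converted[i] for i in high) / k
    ratio = high_rate / low_rate if low_rate > 0 else float("inf")
    return low_rate, high_rate, ratio
```

With one conversion among ten low-risk patients and seven among ten high-risk patients, the ratio is approximately 7.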
Where Pith is reading between the lines
- The same single-signature approach might be tested on other neurodegenerative diseases where baseline fluid markers are available but longitudinal follow-up is costly.
- Integration into electronic health records could flag high-risk individuals for earlier specialist referral even when full clinical history is incomplete.
- Future work could examine whether adding one additional low-cost variable (e.g., age or APOE status) further tightens the uncertainty bounds without requiring new longitudinal data.
Load-bearing premise
A single baseline cerebrospinal fluid biomarker assessment contains enough information to produce accurate long-term prognostic estimates without prior clinical history or any longitudinal observations.
What would settle it
The generalizability claim would be falsified if the model's discrimination or calibration fell below the reported levels in a new external cohort, either one collected under different assay conditions or one drawn from a demographically distinct population.
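Discrimination for survival models is commonly summarized with a concordance index. The sketch below is a plain implementation of Harrell's C-index, offered as one way such an external check could be scored; the function name is hypothetical and the paper's exact metric definition is not given in this review.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was still under observation after that time. The pair is
    concordant when the earlier-converting patient has the higher risk score.
    Ties in risk score count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a drop on the external cohort relative to the reported value would be the signal of interest.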
Original abstract
Disease modifying therapies for Alzheimer's disease demand precise timing decisions, yet current predictive models require longitudinal observations and provide no uncertainty quantification, rendering them impractical at the critical first visit when treatment decisions must be made. We developed PROGRESS (PRognostic Generalization from REsting Static Signatures), a dual-model deep learning framework that transforms a single baseline cerebrospinal fluid biomarker assessment into actionable prognostic estimates without requiring prior clinical history. The framework addresses two complementary clinical questions: a probabilistic trajectory network predicts individualized cognitive decline with calibrated uncertainty bounds achieving near-nominal coverage, enabling honest prognostic communication; and a deep survival model estimates time to conversion from mild cognitive impairment to dementia. Using data from over 3,000 participants across 43 Alzheimer's Disease Research Centers in the National Alzheimer's Coordinating Center database, PROGRESS substantially outperforms Cox proportional hazards, Random Survival Forests, and gradient boosting methods for survival prediction. Risk stratification identifies patient groups with seven-fold differences in conversion rates, enabling clinically meaningful treatment prioritization. Leave-one-center-out validation demonstrates robust generalizability, with survival discrimination remaining strong across held-out sites despite heterogeneous measurement conditions spanning four decades of assay technologies. By combining superior survival prediction with trustworthy trajectory uncertainty quantification, PROGRESS bridges the gap between biomarker measurement and personalized clinical decision-making.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces PROGRESS, a dual-model deep learning framework for Alzheimer's prognostication. It consists of a probabilistic trajectory network that predicts individualized cognitive decline trajectories with calibrated uncertainty from a single baseline CSF biomarker assessment, and a deep survival model that estimates time to conversion from mild cognitive impairment to dementia. Using data from over 3,000 participants across 43 NACC centers, the framework claims superior survival prediction performance over Cox proportional hazards, random survival forests, and gradient boosting baselines, with leave-one-center-out validation demonstrating generalizability and risk stratification yielding seven-fold differences in conversion rates.
Significance. If the reported empirical gains and calibration hold under the described validation, the work could meaningfully advance early-stage AD clinical decision support by enabling uncertainty-aware predictions at the first visit without longitudinal observations. The multi-center scale, explicit handling of site heterogeneity over four decades of assay changes, and provision of per-center performance tables represent concrete strengths for reproducibility and generalizability claims.
major comments (2)
- [Section 4.2] Section 4.2 (survival model results): the seven-fold risk separation claim requires explicit reporting of the hazard ratios or cumulative incidence curves for the stratified groups (e.g., high- vs. low-risk tertiles) together with confidence intervals; without these, the clinical actionability statement remains difficult to evaluate against the baseline methods.
- [Section 3.3] Section 3.3 (trajectory network calibration): the assertion of 'near-nominal coverage' must be supported by tabulated empirical coverage rates at the 68%, 80%, and 95% levels across the held-out centers; a single aggregate figure is insufficient to confirm honest uncertainty quantification under site heterogeneity.
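The coverage table requested in the Section 3.3 comment can be computed along the following lines. This is a sketch that assumes each prediction is a Gaussian with mean `mu` and standard deviation `sigma`; the paper's actual predictive distribution is not described in this review, and the function name is hypothetical.

```python
from statistics import NormalDist


def empirical_coverage(y_true, mu, sigma, levels=(0.68, 0.80, 0.95)):
    """Fraction of observations falling inside each central prediction interval.

    Assumption: Gaussian predictive distributions, so the interval at a given
    level is mu +/- z * sigma with z taken from the standard normal quantile.
    """
    result = {}
    for level in levels:
        z = NormalDist().inv_cdf(0.5 + level / 2)
        hits = sum(abs(y - m) <= z * s for y, m, s in zip(y_true, mu, sigma))
        result[level] = hits / len(y_true)
    return result
```

Running this once per held-out center, rather than once on the pooled data, gives the per-center breakdown the referee asks for; well-calibrated intervals would show empirical coverage close to each nominal level.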
minor comments (3)
- [Abstract] Abstract: numerical performance values (C-index, coverage percentages) are omitted despite being present in the full text; adding one or two key metrics would improve immediate readability.
- [Figure 3] Figure 3 (risk stratification): the Kaplan-Meier curves for the seven-fold groups should include the number at risk at each time point and log-rank p-values to allow direct comparison with the Cox and RSF baselines.
- [Section 2.1] Notation in Section 2.1: the symbol for the uncertainty bound (e.g., σ_t) is introduced without an explicit definition linking it to the probabilistic output layer; a short equation clarifying the coverage construction would remove ambiguity.
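For the Figure 3 comment above, the number at risk comes directly out of the Kaplan-Meier calculation. The following is a minimal pure-Python sketch of the estimator, with a hypothetical function name, that reports the number at risk alongside each survival estimate.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate with number at risk.

    times: follow-up time per patient
    events: 1 if conversion was observed, 0 if censored
    Returns a list of (time, number_at_risk, events_at_time, survival) rows,
    one per distinct time at which at least one event occurred.
    """
    rows = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        if d > 0:
            survival *= 1 - d / at_risk
            rows.append((t, at_risk, d, survival))
    return rows
```

Computing these rows separately for each risk group yields the at-risk counts that would sit beneath the plotted curves.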
Simulated Author's Rebuttal
We thank the referee for the positive evaluation and constructive suggestions. We address each major comment below and will revise the manuscript accordingly.
- Referee: [Section 4.2] Section 4.2 (survival model results): the seven-fold risk separation claim requires explicit reporting of the hazard ratios or cumulative incidence curves for the stratified groups (e.g., high- vs. low-risk tertiles) together with confidence intervals; without these, the clinical actionability statement remains difficult to evaluate against the baseline methods.
  Authors: We agree that explicit hazard ratios and cumulative incidence curves with confidence intervals for the risk-stratified groups would improve interpretability. In the revised manuscript we will add these quantities (computed for high- versus low-risk tertiles) together with 95% confidence intervals and direct comparisons against the Cox, random survival forest, and gradient-boosting baselines.
  Revision: yes
- Referee: [Section 3.3] Section 3.3 (trajectory network calibration): the assertion of 'near-nominal coverage' must be supported by tabulated empirical coverage rates at the 68%, 80%, and 95% levels across the held-out centers; a single aggregate figure is insufficient to confirm honest uncertainty quantification under site heterogeneity.
  Authors: We accept the request for granular calibration diagnostics. The revised Section 3.3 will include a table reporting empirical coverage at the 68%, 80%, and 95% levels for every held-out center in the leave-one-center-out evaluation, thereby demonstrating calibration stability across sites.
  Revision: yes
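The first response promises ratios between risk groups with 95% confidence intervals. As a simplified stand-in for a Cox hazard ratio, the sketch below computes a crude incidence rate ratio from event counts and person-time, with a Wald interval on the log scale. The function name is hypothetical, and this is not the analysis the authors describe.

```python
import math


def rate_ratio_ci(events_high, person_years_high, events_low, person_years_low, z=1.96):
    """Incidence rate ratio (high vs. low risk group) with a Wald confidence interval.

    The standard error of the log rate ratio is sqrt(1/events_high + 1/events_low),
    which assumes Poisson-distributed event counts in each group.
    Returns (ratio, lower, upper).
    """
    ratio = (events_high / person_years_high) / (events_low / person_years_low)
    se_log = math.sqrt(1 / events_high + 1 / events_low)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper
```

The interval width depends mainly on the smaller event count, so a seven-fold point estimate built on few low-risk conversions can still carry a wide interval. That is the concern behind the referee's request.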
Circularity Check
No significant circularity in derivation chain
The paper's central claims rest on empirical training and validation of a dual-model framework (trajectory network + deep survival model) using the external NACC database (>3000 participants, 43 centers). Leave-one-center-out validation, explicit architectures, loss functions, calibration procedures, and per-center performance tables are provided to support the C-index gains and risk stratification. No load-bearing step reduces a prediction to parameters fitted in the authors' prior work, to self-citations, or to an ansatz imported by citation. The single-baseline input restriction and the uncertainty quantification are implemented directly from the data, with no self-definitional equivalence. This is a standard empirical ML validation setup in which outputs are not reduced to inputs by construction.
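Leave-one-center-out validation, as referenced above, holds out every participant from one center at a time. The sketch below shows only the splitting logic; the function name and the use of a plain list of center identifiers are assumptions for illustration.

```python
def leave_one_center_out(center_ids):
    """Yield (held_out_center, train_indices, test_indices) for each center.

    center_ids: one center identifier per participant, in dataset order.
    Every participant from the held-out center goes to the test split, so the
    model is never evaluated on a site it saw during training.
    """
    for center in sorted(set(center_ids)):
        train = [i for i, c in enumerate(center_ids) if c != center]
        test = [i for i, c in enumerate(center_ids) if c == center]
        yield center, train, test
```

With 43 centers this produces 43 folds, and the per-fold metrics are what a per-center performance table would report.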