
arxiv: 1907.00943 · v1 · pith:GHXFQJYNnew · submitted 2019-07-01 · 💻 cs.CV · eess.IV · q-bio.QM

Estimating brain age based on a healthy population with deep learning and structural MRI

Pith reviewed 2026-05-25 11:52 UTC · model grok-4.3

classification 💻 cs.CV · eess.IV · q-bio.QM
keywords brain age estimation · deep learning · structural MRI · healthy population · age prediction · neuroimaging · cognitive function · frontal lobe

The pith

A deep learning model trained on 10,158 healthy brain MRIs estimates age with mean absolute error of 4.06 years.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper assembles a large heterogeneous collection of structural brain MRIs from healthy adults spanning ages 18 to 97 and trains a deep learning model to predict brain age from the images. The resulting estimates reach mean absolute errors of 4.06 years on a held-out portion of the data and 4.21 years on an independent lifespan dataset, both with correlations above 0.96, while earlier methods performed worse on the same external set. The work further maps which brain regions drive the predictions and shows that the numerical gap between estimated and actual age correlates with scores on cognitive tests. A sympathetic reader would care because an accurate, generalizable brain-age marker could serve as an early indicator of cognitive change if the performance holds on new populations.

Core claim

By training a convolutional network on a curated set of 10,158 structural MRIs drawn from multiple public sources and chosen for uniform adult age coverage, the authors obtain brain-age predictions whose mean absolute error is 4.06 years and whose Pearson correlation with chronological age is 0.970 on an internal test split; the same model yields 4.21 years MAE and 0.960 correlation on an independent evaluation set previously used by other groups. Attribution techniques identify the frontal lobe as a dominant contributor, with contribution patterns that shift across the lifespan, and the absolute difference between predicted and chronological age is shown to associate with neuropsychological measures.
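The two headline metrics can be made concrete with a minimal sketch. It assumes only the standard definitions of mean absolute error and the Pearson correlation coefficient; the function names and the five ages below are illustrative and are not taken from the paper.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all subjects, in years."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ages for illustration only; not data from the paper.
chronological = [20.0, 35.0, 50.0, 65.0, 80.0]
estimated = [24.0, 33.0, 55.0, 61.0, 83.0]

mae = mean_absolute_error(estimated, chronological)
r = pearson_r(estimated, chronological)
print(f"MAE = {mae:.2f} years, r = {r:.3f}")
```

A high r alongside a nonzero MAE is expected over a wide age range: correlation rewards getting the ordering right across decades, while MAE measures the per-subject error in years.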

What carries the argument

A convolutional neural network trained for regression on whole-brain T1-weighted MRI volumes, paired with feature-attribution maps that localize predictive tissue.

If this is right

  • Brain-age estimates can be used to examine how cognitive performance changes across the adult lifespan.
  • The frontal lobe supplies the largest share of predictive information, with the spatial pattern of contributions varying by age group.
  • Larger absolute differences between estimated and chronological age correspond to lower scores on standard neuropsychological tests.
  • The approach demonstrates that both dataset size and age uniformity improve estimation accuracy over prior smaller or skewed collections.
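The association between the age gap and cognitive scores can be sketched as follows. This is a simplified illustration, assuming a plain Pearson correlation between the absolute gap and a single test score; the paper's actual statistical procedure (for example, any adjustment for chronological age) is not described on this page, and all names and values here are hypothetical.

```python
import math


def absolute_age_gaps(estimated, chronological):
    """Per-subject |estimated age - chronological age|, in years."""
    return [abs(e - c) for e, c in zip(estimated, chronological)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical subjects; the scores stand in for any neuropsychological test.
chronological = [30.0, 40.0, 50.0, 60.0, 70.0]
estimated = [31.0, 38.0, 53.0, 56.0, 75.0]
test_scores = [29.0, 28.0, 26.0, 25.0, 22.0]

gaps = absolute_age_gaps(estimated, chronological)
association = pearson_r(gaps, test_scores)
print(f"gaps = {gaps}, r(gap, score) = {association:.3f}")
```

A negative coefficient in this toy data corresponds to the pattern described above: larger gaps go with lower scores. In this toy data the gap also grows with age, so a real analysis would need to separate the two effects.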

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the age-gap measure proves stable, clinicians could one day compare an individual's brain age against population norms to flag accelerated aging.
  • The same training strategy could be applied to longitudinal scans of the same people to estimate personal rates of brain change.
  • Extending the model to include disease cohorts might reveal whether specific conditions produce characteristic regional deviations from healthy aging trajectories.

Load-bearing premise

The assembled multi-source dataset represents an unbiased sample of healthy brains with uniform age coverage, and the independent test set measures generalization without unaccounted distribution differences.
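One way to picture "uniform age coverage" is a subsampling step that equalizes the number of scans per age bin. The sketch below is an assumption made for illustration: the page does not describe how the authors balanced their data, and the function name, the 10-year bin width, and the cap-at-smallest-bin rule are all hypothetical choices.

```python
import random
from collections import defaultdict


def balance_by_age(ages, bin_width=10, seed=0):
    """Return sorted indices of a subsample with equal counts per age bin.

    Every non-empty bin is capped at the size of the smallest non-empty bin,
    so the result is uniform across the bins that actually contain data.
    """
    if not ages:
        raise ValueError("ages must be non-empty")
    bins = defaultdict(list)
    for index, age in enumerate(ages):
        bins[int(age // bin_width)].append(index)
    cap = min(len(members) for members in bins.values())
    rng = random.Random(seed)  # fixed seed keeps the subsample reproducible
    selected = []
    for key in sorted(bins):
        selected.extend(rng.sample(bins[key], cap))
    return sorted(selected)


# Hypothetical ages, deliberately skewed toward the twenties.
ages = [22, 25, 27, 29, 34, 36, 41, 45, 48, 71, 75]
kept = balance_by_age(ages)
print([ages[i] for i in kept])
```

Capping at the smallest bin discards data from well-populated decades, which is why a large pooled collection matters: the bigger the pool, the more scans survive balancing.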

What would settle it

Re-evaluating the trained model on a new healthy cohort of several hundred adults drawn from a different scanner vendor or geographic population and obtaining a mean absolute error above 6 years would falsify the generalization claim.
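The check described above can be sketched as a small script. The 6-year threshold comes from the text; the bootstrap interval is an added assumption, included because a cohort of a few hundred subjects gives a noisy MAE, and requiring the whole interval to exceed the threshold is a stricter convention than the point comparison stated above. All names and data are hypothetical.

```python
import random


def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all subjects, in years."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def bootstrap_mae_interval(predicted, actual, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the MAE (resampling subjects)."""
    rng = random.Random(seed)
    n = len(predicted)
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    stats = []
    for _ in range(n_resamples):
        resample = [errors[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(resample) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


THRESHOLD_YEARS = 6.0  # falsification threshold quoted in the text

# Hypothetical new cohort: errors alternate between +3 and -5 years.
actual = [float(age) for age in range(20, 80)]
predicted = [a + (3.0 if i % 2 == 0 else -5.0) for i, a in enumerate(actual)]

point = mean_absolute_error(predicted, actual)
lower, upper = bootstrap_mae_interval(predicted, actual)
falsified = lower > THRESHOLD_YEARS  # stricter than comparing the point estimate
print(f"MAE = {point:.2f} [{lower:.2f}, {upper:.2f}], falsified = {falsified}")
```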

Figures

Figures reproduced from arXiv: 1907.00943 by Frank A. Provenzano, Jie Yang, Scott A. Small, Xinyang Feng, Zachary C. Lipton.

Figure 1: The age distribution of the study population. A) The age distribution of the raw
Figure 2: The convolution neural network architecture. The inputs are 3D brain volumes.
Figure 3: The estimated age versus chronological age in an independent test set of adult
Figure 4: The distribution of predicted ages in test-retest scans.
Figure 5: Distribution of MAE across life-span. (A) Age estimated using the balanced dataset.
Figure 6: The partial correlation coefficients of age
Figure 7: The 3D iso-surfaces (0.8) of the age activation maps at different age groups.
Figure 8: The age activation maps at different age groups overlaid on the (Left) MNI152
Figure 9: MRI 2D slice based age estimation. (Top row) The mean absolute error (MAE) of
Original abstract

Numerous studies have established that estimated brain age, as derived from statistical models trained on healthy populations, constitutes a valuable biomarker that is predictive of cognitive decline and various neurological diseases. In this work, we curate a large-scale heterogeneous dataset (N = 10,158, age range 18 - 97) of structural brain MRIs in a healthy population from multiple publicly-available sources, upon which we train a deep learning model for brain age estimation. The availability of the large-scale dataset enables a more uniform age distribution across adult life-span for effective age estimation with no bias toward certain age groups. We demonstrate that the age estimation accuracy, evaluated with mean absolute error (MAE) and correlation coefficient (r), outperforms previously reported methods in both a hold-out test set reflective of the custom population (MAE = 4.06 years, r = 0.970) and an independent life-span evaluation dataset (MAE = 4.21 years, r = 0.960) on which a previous study has evaluated. We further demonstrate the utility of the estimated age in life-span aging analysis of cognitive functions. Furthermore, we conduct extensive ablation tests and employ feature-attribution techniques to analyze which regions contribute the most predictive value, demonstrating the prominence of the frontal lobe as well as pattern shift across life-span. In summary, we achieve superior age estimation performance confirming the efficacy of deep learning and the added utility of training with data both in larger number and more uniformly distributed than in previous studies. We demonstrate the regional contribution to our brain age predictions through multiple routes and confirm the association of divergence between estimated and chronological brain age with neuropsychological measures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Circularity Check

0 steps flagged

No circularity; performance metrics obtained from independent held-out and external test sets

full rationale

The paper's central results consist of MAE and correlation values computed on a held-out portion of the curated multi-source dataset plus a separate independent life-span evaluation set. No equations, loss functions, or procedures are defined such that the reported test metrics reduce by construction to training-set quantities or to parameters fitted on the same data. No self-citation chains are invoked to justify uniqueness or to substitute for empirical validation, and the derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central contribution is empirical: curation of a large balanced dataset and supervised training of a standard deep network. The ledger therefore contains typical supervised-learning assumptions rather than novel mathematical axioms or invented physical entities.

free parameters (1)
  • neural network architecture and training hyperparameters
    Architecture choices, learning rate, regularization, and other training settings are selected to optimize performance on the curated data.
axioms (2)
  • domain assumption: Structural features in T1-weighted MRI are sufficient to predict chronological age in healthy adults.
    The modeling pipeline presupposes that age-related structural changes are reliably encoded in the images.
  • domain assumption: The multi-source public dataset is free of major selection biases and provides uniform coverage across the adult lifespan.
    Uniform age distribution is presented as the key enabler of unbiased training.

pith-pipeline@v0.9.0 · 5850 in / 1401 out tokens · 51814 ms · 2026-05-25T11:52:06.129713+00:00 · methodology


