Latent space projections and atlases: A cautionary tale in deep neuroimaging using autoencoders

C. Jimenez; F.J. Martinez; F. Segovia; J.E. Arco; J.M. Gorriz; J Ramirez; J. Suckling; S. Abulikemu

arxiv: 2509.03675 · v2 · pith:U6LXKTSSnew · submitted 2025-09-03 · 📊 stat.AP

Pith reviewed 2026-05-25 07:53 UTC · model grok-4.3

classification 📊 stat.AP

keywords autoencoder · latent space · Alzheimer's disease · neuroimaging · LRCP · AAL atlas · interpretability · brain MRI

The pith

Even minimal autoencoders on ADNI brain MRI capture Alzheimer's progression patterns when paired with latent-regional correlation profiling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper trains a simple convolutional autoencoder on segmented gray matter images from the ADNI dataset to produce a compact latent space that preserves neuroanatomical structure while reflecting differences in cognitive status. The authors introduce the Latent-Regional Correlation Profiling (LRCP) framework, which combines statistical association measures with supervised discriminability to identify which regions defined by the AAL atlas carry clinically relevant information in that latent space. Dimensionality reduction techniques such as PCA, t-SNE, PLS, and UMAP are used to visualize the space, and SHAP regression on reconstruction error is applied post-hoc to highlight anatomically meaningful regions tied to class-specific reconstruction strategies. The central result is that these minimal architectures already extract patterns associated with progression to Alzheimer's disease, provided the latent space is interpreted through rigorous statistical checks rather than raw projections alone. The work positions autoencoders as exploratory tools for biomarker discovery while underscoring the need to guard against artifacts introduced by architecture, training, or post-processing choices.

Core claim

A simple convolutional autoencoder with hierarchical encoder and compact latent space, trained on ADNI gray matter images, learns representations that reflect clinical variability across cognitive status. The LRCP framework identifies brain regions encoding clinically relevant latent information by combining statistical association and supervised discriminability. Post-hoc SHAP analysis of reconstruction error from atlas-based regional intensities reveals anatomically meaningful regions involved in class-specific reconstruction, with results further validated by statistical agnostic methods.

What carries the argument

The Latent-Regional Correlation Profiling (LRCP) framework, which integrates statistical association between latent dimensions and atlas regions with supervised discriminability scores to isolate brain areas that carry clinically relevant information.
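A minimal sketch of how such a profile could be computed, assuming one latent component per subject, a matrix of mean gray-matter intensity per AAL region, and binary clinical labels. The function names, the synthetic data, and the product rule used to combine the two ingredients are illustrative assumptions; the paper's exact LRCP statistic is not given in the text reviewed here.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two 1-D arrays."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney U); assumes no tied scores."""
    ranks = np.argsort(np.argsort(scores)) + 1  # 1-based ranks
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2
    return float(u / (n_pos * n_neg))

def lrcp_profile(latent, regions, labels):
    """Hypothetical per-region score: |r| with the latent component
    multiplied by class separability rescaled to [0, 1]."""
    profile = []
    for j in range(regions.shape[1]):
        r = pearson_r(latent, regions[:, j])   # statistical association
        a = auc(regions[:, j], labels)         # supervised discriminability
        separability = 2 * abs(a - 0.5)
        profile.append((j, r, a, abs(r) * separability))
    return sorted(profile, key=lambda t: t[3], reverse=True)

# Synthetic stand-in data: 200 subjects, 5 regions; only region 2 carries signal.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
regions = rng.normal(size=(200, 5))
regions[:, 2] += 1.5 * labels                          # class-dependent region
latent = regions[:, 2] + 0.5 * rng.normal(size=200)    # latent tracks region 2
ranking = lrcp_profile(latent, regions, labels)
top_region = ranking[0][0]
```

On this synthetic input the region constructed to carry signal (index 2) ranks first. Requiring both a high correlation and a high separability is one way to keep a region that merely co-varies with the latent, without relating to clinical status, from dominating the profile.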

If this is right

Even minimal autoencoder architectures capture meaningful patterns associated with progression to Alzheimer's disease.
LRCP can locate brain regions that encode clinically relevant latent information from the model.
SHAP regression on reconstruction error can highlight anatomically meaningful regions for different clinical classes.
Autoencoders can function as exploratory tools for biomarker discovery and hypothesis generation in clinical neuroscience.
Multiple statistical validation methods are required to ensure interpretations are not driven by methodological artifacts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If LRCP generalizes across datasets, it could be applied to other neurological conditions to surface candidate biomarkers from latent spaces.
The cautionary framing implies that raw latent projections onto atlases can mislead without the added discriminability step.
Testing LRCP on autoencoders with altered training objectives would reveal whether the identified regions depend on the specific reconstruction loss.
The approach could be extended to compare latent spaces across different imaging modalities to check consistency of regional encoding.

Load-bearing premise

Observed correlations between latent dimensions and atlas regions reflect genuine neuroanatomical encoding of clinical status rather than artifacts from the autoencoder architecture, training procedure, or post-hoc dimensionality reduction.

What would settle it

Demonstrating that the regional correlations identified by LRCP disappear or reverse when the same data are processed with a linear dimensionality reduction method or when reconstruction errors are randomly permuted would falsify the claim.
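The permutation half of this test can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: per-subject scores (a latent component or a reconstruction error) are shuffled across subjects, and the largest absolute correlation over all regions serves as the test statistic so that the comparison accounts for searching over many regions. All names and data are hypothetical.

```python
import numpy as np

def max_abs_corr(scores, regions):
    """Largest |Pearson r| between per-subject scores and any region column."""
    sc = scores - scores.mean()
    rc = regions - regions.mean(axis=0)
    r = (sc @ rc) / np.sqrt((sc @ sc) * (rc * rc).sum(axis=0))
    return float(np.abs(r).max())

def permutation_p_value(scores, regions, n_perm=1000, seed=0):
    """Shuffle scores across subjects to break any subject-level link
    with the regional features, then compare against the observed value."""
    rng = np.random.default_rng(seed)
    observed = max_abs_corr(scores, regions)
    null = np.array([
        max_abs_corr(rng.permutation(scores), regions) for _ in range(n_perm)
    ])
    # Add-one correction keeps the estimated p-value strictly positive.
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p)

# Synthetic stand-in data: one score tied to region 2, one pure-noise score.
rng = np.random.default_rng(1)
regions = rng.normal(size=(200, 5))
signal_scores = regions[:, 2] + 0.5 * rng.normal(size=200)
noise_scores = rng.normal(size=200)

obs_signal, p_signal = permutation_p_value(signal_scores, regions)
obs_noise, p_noise = permutation_p_value(noise_scores, regions)
```

A regional correlation that survives this check is at least not explained by chance alignment across regions; it does not by itself rule out architecture-driven or parcellation-driven structure, which is why the comparison against a linear projection is listed as a separate test.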

Figures

Figures reproduced from arXiv: 2509.03675 by C. Jimenez, F.J. Martinez, F. Segovia, J.E. Arco, J.M. Gorriz, J Ramirez, J. Suckling, S. Abulikemu.

**Figure 1.** Overview of analysis methods to provide interpretability of the latent space. FA: feature attribution; NCC: …

**Figure 2.** Overview of the neural architecture based on Autoencoder (AE)

**Figure 3.** Distribution of training loss across epochs (1 to 10) for each comparison group including NOR, AD, MCI, …

**Figure 4.** Reconstruction quality using MSE (10 epochs) and the combined loss (20 epochs).

**Figure 5.** PCA projection of Layer 1, 2 and latent activations

**Figure 6.** t-SNE projection of Layer 1, 2 and latent activations

**Figure 7.** Fused neuroanatomical visualization of significant latent-to-anatomy correlations (PCA method, component …

**Figure 8.** Fused neuroanatomical visualization of significant latent-to-anatomy correlations (t-SNE method, component …

**Figure 9.** Fused neuroanatomical visualization of SHAP values mapped to anatomy: top row shows NOR (left) and …

**Figure 10.** Correlation analysis including all comparisons and anatomical AAL regions is shown for the normal class

**Figure 11.** AAL regions ranked by SHAP importance for class 0 (NOR) when compared with AD. Regions like …

**Figure 12.** SHAP regions for class 3 (AD), reflecting strong contributions from the Frontal …

**Figure 13.** Violin plot of SHAP values across subjects for class 0 (NOR) and class 3 (AD), showing distributional …

**Figure 14.** Distribution of SHAP values for the 10 most relevant AAL regions in the NOR–MCI comparison. Each …

**Figure 15.** Correlation importance for the NOR–MCIc comparison, showing (top) raw correlation values between …

**Figure 16.** Summary of significant and non-significant regions for t-SNE and UMAP by group and latent (adding …

**Figure 17.** LRCP analysis for region number 34, ’Cingulum …

**Figure 18.** LRCP analysis for the three binary groups showing the evolution of disease from MCI to AD. Results are …

**Figure 19.** PLS projection of Layer 1, 2 and latent activations

**Figure 20.** UMAP projection of Layer 1, 2 and latent activations

**Figure 21.** Fused neuroanatomical visualization of significant latent-to-anatomy correlations (PCA and t-SNE meth…

**Figure 22.** Distribution of the ten most relevant AAL regions for the MCIc group obtained using correlation analysis (…

**Figure 23.** Top: Distribution of anatomical region importance (AAL) according to SHAP values for class 2 (NOR, …

Original abstract

This study introduces a deep learning framework for the inferential exploration of latent representations in 3D brain MRI, leveraging a simple convolutional autoencoder with a hierarchical encoder and a compact latent space. Trained on segmented gray matter images from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the model learns latent representations that preserve neuroanatomical structure and reflect clinical variability across cognitive status. Dimensionality reduction techniques (PCA, t-SNE, PLS, UMAP) were applied to visualize and interpret the latent space, correlating it with anatomical regions defined by the AAL atlas. As a novel contribution, the Latent-Regional Correlation Profiling (LRCP) framework, which combines statistical association and supervised discriminability to identify brain regions that encode clinically relevant latent information is proposed. Our results show that even minimal architectures capture meaningful patterns associated with progression to Alzheimer's disease. Interpretability is assessed by applying SHAP-based regression to a post-hoc model that predicts reconstruction error from atlas-based regional gray matter intensities, thereby identifying anatomically meaningful regions involved in class-specific reconstruction strategies. These findings are further validated using statistical agnostic methods, highlighting the importance of rigorous evaluation in neuroimaging. This work demonstrates the potential of autoencoders as exploratory tools for biomarker discovery and hypothesis generation in clinical neuroscience.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces LRCP as a post-hoc tool to link autoencoder latents to AAL regions in ADNI data, but the central claim that these capture clinical progression rests on missing quantitative evidence and no apparent controls for architectural bias.

The one thing to know is that this is an exploratory application of a basic convolutional autoencoder to gray-matter volumes from ADNI, with LRCP offered as the new piece for profiling regional correlations and discriminability. It also runs SHAP on reconstruction error to flag anatomy tied to class-specific reconstruction. That combination is the actual addition over standard autoencoder-plus-visualization pipelines already common in the field. The work does a clean job laying out the pipeline, applying PCA/t-SNE/PLS/UMAP, and showing that even a compact latent space preserves some structure aligned with cognitive status. The SHAP step is a reasonable way to move beyond pure visualization. Those parts are straightforward and could be useful for groups already running similar models. The soft spots are the absence of any numbers in the abstract—no reconstruction errors, no correlation strengths, no classification metrics, no sample breakdowns—and the lack of any described null model, permutation test, or held-out clinical validation. Without those, the LRCP regions could easily trace back to the network's convolutional biases or the atlas parcellation itself rather than Alzheimer's-related encoding. The stress-test concern holds on the information given. This is for readers inside neuroimaging who want concrete examples of interpretability add-ons on real data. It is not positioned as a methodological breakthrough or a clinical tool. A serious referee could check whether the full methods section supplies the missing controls and quantitative results; if they do, the paper earns a review. If the full text stays at the level of the abstract, it does not.

Referee Report

3 major / 1 minor

Summary. The manuscript introduces a convolutional autoencoder with hierarchical encoder and compact latent space, trained on segmented gray-matter volumes from the ADNI dataset. It applies PCA, t-SNE, PLS and UMAP to the latent representations, correlates them with AAL atlas regions, and proposes the LRCP framework that combines statistical association with supervised discriminability to identify brain regions encoding clinically relevant latent information. SHAP-based regression on a post-hoc model predicting reconstruction error from regional intensities is used for interpretability, with the central claim that even minimal architectures capture meaningful patterns associated with Alzheimer's progression.

Significance. If the LRCP-identified regions and SHAP attributions can be shown to reflect clinical signal rather than reconstruction biases, the work would supply a concrete, reproducible pipeline for using autoencoders as hypothesis-generation tools in clinical neuroimaging and would underscore the value of post-hoc validation methods.

major comments (3)

Abstract and §3 (LRCP framework): the claim that LRCP identifies regions that 'encode clinically relevant latent information' rests on correlations between latent dimensions and AAL parcels, yet the description supplies no label-permutation tests, null-model controls, or held-out clinical validation that would isolate the Alzheimer's signal from the convolutional inductive biases and atlas parcellation itself.
Abstract and §4 (SHAP validation): the post-hoc regression predicts reconstruction error from atlas-based regional intensities; without an explicit control that removes clinical-status information (e.g., label permutation or matched reconstruction-error nulls), the resulting SHAP attributions cannot be guaranteed to reflect class-specific clinical encoding rather than architecture-driven reconstruction strategies.
Results section: the abstract asserts that 'even minimal architectures capture meaningful patterns' but reports no quantitative metrics (R², AUC, p-values, or cross-validation statistics) for either the LRCP correlations or the SHAP attributions, leaving the central empirical claim without numerical support.

minor comments (1)

[Abstract] The phrase 'statistical agnostic methods' in the abstract is unclear; a more precise term such as 'non-parametric statistical tests' or 'distribution-free validation' would improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate where revisions have been made to incorporate additional controls and metrics.

Referee: Abstract and §3 (LRCP framework): the claim that LRCP identifies regions that 'encode clinically relevant latent information' rests on correlations between latent dimensions and AAL parcels, yet the description supplies no label-permutation tests, null-model controls, or held-out clinical validation that would isolate the Alzheimer's signal from the convolutional inductive biases and atlas parcellation itself.

Authors: We agree that explicit null-model controls strengthen the interpretation. The LRCP framework reports Pearson correlations with associated p-values and incorporates supervised discriminability via PLS regression and classification performance on clinical labels. In the revised manuscript we have added label-permutation tests (1000 permutations) that compare observed correlations against those obtained after randomly shuffling clinical status labels, thereby quantifying the extent to which the identified associations exceed what would be expected from architectural or parcellation biases alone.

Revision: yes

Referee: Abstract and §4 (SHAP validation): the post-hoc regression predicts reconstruction error from atlas-based regional intensities; without an explicit control that removes clinical-status information (e.g., label permutation or matched reconstruction-error nulls), the resulting SHAP attributions cannot be guaranteed to reflect class-specific clinical encoding rather than architecture-driven reconstruction strategies.

Authors: The SHAP analysis is performed on a regression model whose target is reconstruction error, and the manuscript already notes that attributions are interpreted in the context of class-specific reconstruction strategies. To directly address the concern, the revised version includes a label-permutation control: clinical labels are shuffled, the post-hoc regression is retrained, and SHAP values are recomputed; the original attributions are then compared against this null distribution to demonstrate that they are significantly altered when clinical information is removed.

Revision: yes

Referee: Results section: the abstract asserts that 'even minimal architectures capture meaningful patterns' but reports no quantitative metrics (R², AUC, p-values, or cross-validation statistics) for either the LRCP correlations or the SHAP attributions, leaving the central empirical claim without numerical support.

Authors: The original manuscript reports p-values for the LRCP correlations and classification accuracies for latent-space discriminability. We acknowledge, however, that R² for the SHAP regression and explicit cross-validation statistics were not presented. The revised results section now includes R² values for the post-hoc regression, AUC scores for the supervised discriminability components, and details of the cross-validation scheme used throughout the pipeline.

Revision: yes
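The label-permutation control described in the second response could look roughly like the sketch below. It makes several simplifying assumptions that are not taken from the paper: the post-hoc model is an ordinary least-squares regression fitted separately per class, SHAP values are computed in closed form for a linear model under feature independence (phi_ij = w_j (x_ij - mean_j)) instead of calling a SHAP library, and the test statistic is the per-region gap in mean |SHAP| between the two classes. Names and data are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with intercept; returns (weights, intercept)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def linear_shap(X, weights):
    """Closed-form Shapley values of a linear model, assuming independent
    features: phi_ij = w_j * (x_ij - mean_j)."""
    return (X - X.mean(axis=0)) * weights

def mean_abs_shap(X, y):
    """Mean |SHAP| per region for a regression of y on X."""
    w, _ = fit_linear(X, y)
    return np.abs(linear_shap(X, w)).mean(axis=0)

def class_gap(X, err, labels):
    """Per-region gap in mean |SHAP| between the two class-specific models."""
    return (mean_abs_shap(X[labels == 1], err[labels == 1])
            - mean_abs_shap(X[labels == 0], err[labels == 0]))

def permutation_control(X, err, labels, n_perm=500, seed=0):
    """Shuffle class labels, refit both models, and recompute the gap."""
    rng = np.random.default_rng(seed)
    observed = class_gap(X, err, labels)
    null = np.array([
        class_gap(X, err, rng.permutation(labels)) for _ in range(n_perm)
    ])
    exceed = (np.abs(null) >= np.abs(observed)).sum(axis=0)
    return observed, (1 + exceed) / (1 + n_perm)

# Synthetic stand-in: reconstruction error depends on region 3 only in class 1.
rng = np.random.default_rng(2)
labels = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 4))  # regional intensities (synthetic)
err = 0.2 * X[:, 0] + labels * X[:, 3] + 0.1 * rng.normal(size=200)
observed, p_values = permutation_control(X, err, labels)
```

In this toy setting region 0 drives reconstruction error equally in both classes, so its gap stays near zero, while region 3 shows a class-specific gap that the shuffled-label null does not reproduce. That distinction, shared versus class-specific attribution, is what the referee's second comment asks the control to establish.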

Circularity Check

0 steps flagged

No circularity in derivation chain

The paper describes an empirical workflow: training a convolutional autoencoder on gray-matter volumes to minimize reconstruction error, followed by post-hoc application of dimensionality reduction (PCA/t-SNE/PLS/UMAP), LRCP correlation with AAL parcels, and SHAP analysis on a separate regression model. No equations, uniqueness theorems, or self-citations are invoked that reduce any claimed result to a fitted parameter or input by construction. All reported associations are presented as data-driven observations rather than algebraic identities or renamed fits. The central claims therefore remain independent of the inputs.
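As a concrete illustration of the post-hoc projection step in this workflow, the sketch below applies PCA to a matrix of latent activations using a singular value decomposition. The latent matrix is synthetic, and the 16-dimensional latent size and the class shift along one latent axis are arbitrary assumptions, not the paper's architecture or data.

```python
import numpy as np

def pca_project(Z, n_components=2):
    """Project rows of Z onto the leading principal components via SVD.
    Returns the component scores and the explained-variance ratios."""
    Zc = Z - Z.mean(axis=0)                       # center each latent dimension
    _, S, Vt = np.linalg.svd(Zc, full_matrices=False)
    scores = Zc @ Vt[:n_components].T             # subject coordinates
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Synthetic latent activations: 200 subjects, 16 latent dimensions,
# with the two classes shifted apart along the first latent axis.
rng = np.random.default_rng(3)
labels = np.repeat([0, 1], 100)
Z = rng.normal(size=(200, 16))
Z[:, 0] += 3.0 * labels

scores, explained = pca_project(Z)
class_gap_pc1 = abs(scores[labels == 1, 0].mean() - scores[labels == 0, 0].mean())
```

Because PCA is linear and deterministic, it offers a simple baseline against which neighbor-embedding projections such as t-SNE or UMAP can be compared when checking whether a regional pattern depends on the choice of projection.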

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that the autoencoder latent space is a faithful, unbiased compression of the input anatomy.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages · 1 internal anchor

[1]

Hofmann et al., ”The utility of explainable AI for MRI analysis: Relating model predictions to neuroimaging features of the aging brain,”bioRxiv, 2024

S.M. Hofmann et al., ”The utility of explainable AI for MRI analysis: Relating model predictions to neuroimaging features of the aging brain,”bioRxiv, 2024

work page 2024
[2]

Studying the manifold structure of Alzheimer’s disease: a deep learning approach using convolutional autoencoders

FJ Martinez-Murcia, et al.. Studying the manifold structure of Alzheimer’s disease: a deep learning approach using convolutional autoencoders. IEEE journal of biomedical and health informatics 24 (1), 17-26

work page
[3]

DRIT++: Diverse image-to-image translation via disentangled representations,

H.-Y . Lee et al., “DRIT++: Diverse image-to-image translation via disentangled representations,” 2019, arXiv:1905.01270

work page arXiv 2019
[4]

Zeineldin et al., ”Explainable hybrid vision transformers and convolutional network for multimodal glioma segmentation in brain MRI,”Scientific Reports, 2024

R.A. Zeineldin et al., ”Explainable hybrid vision transformers and convolutional network for multimodal glioma segmentation in brain MRI,”Scientific Reports, 2024

work page 2024
[5]

JM Gorriz, et al (2024) Is K-fold cross validation the best model selection method for Machine Learning? arXiv preprint arXiv:2401.16407

work page internal anchor Pith review Pith/arXiv arXiv 2024
[6]

Cluster failure: Inflated false positives for fMRI

A.Eklund, et al. Cluster failure: Inflated false positives for fMRI. Proceedings of the National Academy of Sci- ences Jul 2016, 113 (28) 7900-7905

work page 2016
[7]

Noble, et al

S. Noble, et al. Cluster failure or power failure? Evaluating sensitivity in cluster-level inference. NeuroImage, 209, 116468,2020

work page 2020
[8]

Varoquaux

G. Varoquaux. Cross-validation failure: Small sample sizes lead to large error bars. NeuroImage 180 (2018) 68-77

work page 2018
[9]

Varma S. et al. Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics volume 7, Article number: 91 (2006)

work page 2006
[10]

ICAM-Reg: Interpretable classification and regression with feature attribution for mapping neurological phenotypes in individual scans,

C. Bass, M. da Silva, C. Sudre, L. Z. J. Williams, H. S. Sousa, P.-D. Tudosiu, F. Alfaro-Almagro, S. P. Fitzgib- bon, M. F. Glasser, S. M. Smith, and E. C. Robinson, “ICAM-Reg: Interpretable classification and regression with feature attribution for mapping neurological phenotypes in individual scans,”IEEE Transactions on Medical Imaging, 2023

work page 2023
[11]

Alzheimer’s Research & Therapy, vol

Zhang, X., et al.,Longitudinal structural MRI-based deep learning and radiomics features for predicting Alzheimer’s disease progression. Alzheimer’s Research & Therapy, vol. 16, no. 1, 2025. Used 3D-Grad-CAM on a 3D-ResNet model to visualize the most influential voxels contributing to risk predictions in AD

work page 2025
[12]

Nikaido, H

N. Nikaido, H. Tanaka, T. Yamamoto, Y . Fujita, S. Mori, Deep-SHAP: Mapping Multivariate Relationships Between Regional Neuroimaging Biomarkers and Cognition in MCI/AD, NeuroImage, vol. 276, p. 119589, 2024

work page 2024
[13]

Eitel, K

F. Eitel, K. Ritter, et al., Testing the robustness of attribution methods for convolutional neural networks in MRI-based Alzheimer’s disease classification, arXiv preprint arXiv:1909.08856, 2019

work page arXiv 1909
[14]

Bass, C., et al. (2022). ICAM-Reg: Interpretable Classification and Regression With Feature Attribution for Mapping Neurological Phenotypes in Individual Scans. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022. https://doi.org/10.1109/CVPR52688.2022.01164

work page doi:10.1109/cvpr52688.2022.01164 2022
[15]

Biffi et al., ”Explainable anatomical shape analysis through deep hierarchical generative models,”IEEE Trans- actions on Medical Imaging, 2019

C. Biffi et al., ”Explainable anatomical shape analysis through deep hierarchical generative models,”IEEE Trans- actions on Medical Imaging, 2019

work page 2019
[16]

Gorriz et al

J.M. Gorriz et al. (2025) Autoencoder-based MRI linking latent projections to brain anatomy. IEEE NSS-MIC- RTSD conference, Yokohama. Japan

work page 2025
[17]

Bates, S., et al. (2023). Cross-Validation: What Does It Estimate and How Well Does It Do It? Journal of the American Statistical Association, 1–12

work page 2023
[18]

Gorriz, J.M., et al. (2025). Statistical Agnostic Regression: a machine learning method to validate regression models. Journal of Advanced Research. Advance online publication. https://doi.org/10.1016/j.jare.2025.04.026

work page doi:10.1016/j.jare.2025.04.026 2025
[19]

Image-to-image translation with conditional adversarial networks,

P. Isola, et al., “Image-to-image translation with conditional adversarial networks,” in Proc. IEEE Conf. Com- put. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 1125–1134. Comput.-Assist. Intervent. Cham, Switzerland: Springer, 2020, pp. 315–325

work page 2017
[20]

Unpaired image-to-image translation using cycle-consistent adver- sarial networks,

J.-Y . Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adver- sarial networks,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 2223–2232

work page 2017
[21]

Multimodal unsupervised image-to-image translation,

X. Huang, M.-Y . Liu, S. Belongie, and J. Kautz, “Multimodal unsupervised image-to-image translation,” in Proc. Eur. Conf. Comput. Vis. (ECCV), 2018, pp. 172–189. 20 Latent Space Projections and Atlases: A Cautionary Tale in Deep Neuroimaging using AutoencodersA PREPRINT

work page 2018
[22]

Unsupervised image-to-image translation networks,

M.-Y . Liu, T. Breuel, and J. Kautz, “Unsupervised image-to-image translation networks,” in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 700–708

work page 2017
[23]

Disentangling factors of variation with cycle-consistent variational auto-encoders,

A. H. Jha, S. Anand, M. Singh, and V . Veeravasarapu, “Disentangling factors of variation with cycle-consistent variational auto-encoders,” in Proc. Eur. Conf. Comput. Vis. Cham, Switzerland: Springer, 2018, pp. 829–845

work page 2018
[24]

Visual feature attribution using Wasserstein GANs,

C. F. Baumgartner, L. M. Koch, K. C. Tezcan, J. X. Ang, and E. Konukoglu, “Visual feature attribution using Wasserstein GANs,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 8309–8319

work page 2018
[25]

Image synthesis with a convolutional capsule generative adversarial network,

C. Bass et al., “Image synthesis with a convolutional capsule generative adversarial network,” in Proc. Int. Conf. Med. Imag. Deep Learn., 2019, pp. 1–24

work page 2019
[26]

Deep autoencoding models for unsupervised anomaly segmentation in brain MR images,

C. Baur, B. Wiestler, S. Albarqouni, and N. Navab, “Deep autoencoding models for unsupervised anomaly segmentation in brain MR images,” in Proc. Int. MICCAI Brainlesion Workshop. Cham, Switzerland: Springer, 2018, pp. 161–169

work page 2018
[27]

End-to-end adversarial retinal image synthesis,

P. Costa et al., “End-to-end adversarial retinal image synthesis,” IEEE Trans. Med. Imag., vol. 37, no. 3, pp. 781–791, Mar. 2017

work page 2017
[28]

A., at al

Poldrack, R. A., at al. (2020). Establishment of best practices for evidence for prediction: A review. JAMA Psychiatry, 77(5), 534–540

work page 2020
[29]

Snoek, L., et al. (2019). How to control for confounds in decoding analyses of neuroimaging data. NeuroImage, 184, 741–760

work page 2019
[30]

G ¨orgen, K.et al. (2018). The same analysis approach: Practical protection against the pitfalls of novel neuroimag- ing analysis methods. NeuroImage, 180, 19–30. https://doi.org/10.1016/j.neuroimage.2017.12.083

work page doi:10.1016/j.neuroimage.2017.12.083 2018
[31]

R. M. Cichy et al (2019). ”Deep neural networks as scientific models,” Trends in Cognitive Sciences, vol. 23, no. 4, pp. 305–317

work page 2019
[32]

R. M. Cichy, et al. Comparison of deep neural networks to spatio-temporal cortical dynamics of human vi- sual object recognition reveals hierarchical correspondence. Scientific Reports, vol. 6, p. 27755, 2016. doi: 10.1038/srep27755

work page doi:10.1038/srep27755 2016
[33]

Chatterjee et al., ”TorchEsegeta: Framework for Interpretability and Explainability of Image-based DL Mod- els,”Applied Sciences, 2021

S. Chatterjee et al., ”TorchEsegeta: Framework for Interpretability and Explainability of Image-based DL Mod- els,”Applied Sciences, 2021

work page 2021
[34]

Hinton et al., ”Reducing the dimensionality of data with NN,” Science 313(5786):504-7 2006

G.E. Hinton et al., ”Reducing the dimensionality of data with NN,” Science 313(5786):504-7 2006

work page 2006
[35]

Tzourio-Mazoyer N., et al. (2002). Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage, 15(1):273–289. doi:10.1006/nimg.2001.0978

work page doi:10.1006/nimg.2001.0978 2002
[36]

CAT – A Computational Anatomy Toolbox for the Analysis of Structural MRI Data

Gaser, C., et al (2016). CAT – A Computational Anatomy Toolbox for the Analysis of Structural MRI Data. Hbm. doi:10.7490/f1000research.111.1603.1

work page doi:10.7490/f1000research.111.1603.1 2016
[37]

D., et al

Penny, W. D., et al. (2011). Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press

work page 2011
[38]

Boucheron et al

S. Boucheron et al. Concentration Inequalities: A Nonasymptotic Theory of Independence ISBN: 9780199535255 Oxford University Press

work page
[39]

van der Maaten et al., ”Visualizing data using t-SNE,” Journal of Machine Learning Research 9 (2008) 2579- 2605

L. van der Maaten et al., ”Visualizing data using t-SNE,” Journal of Machine Learning Research 9 (2008) 2579- 2605

work page 2008
[40]

McInnes, J

L. McInnes, J. Healy, and J. Melville, ”UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction,”arXiv

work page
[41]

Frisoni, G. et al. (2010). The clinical use of structural MRI in Alzheimer disease. Nature Reviews Neurology, 6(2), 67-77

work page 2010
[42]

L., et al

Whitwell, J. L., et al. (2007). Patterns of atrophy differ among specific subtypes of mild cognitive impairment. Archives of Neurology, 64(8), 1130-1138

work page 2007
[43]

Li, X., et al. (2022). Altered functional connectivity of Heschl’s gyrus in Alzheimer’s disease and mild cognitive impairment. Frontiers in Aging Neuroscience, 14, 823456

work page 2022
[44]

Braak and E

H. Braak and E. Braak, Neuropathological staging of Alzheimer-related changes, Acta Neuropathologica, vol. 82, no. 4, pp. 239–259, 1991

work page 1991
[45]

Tondelli et al., Structural MRI changes detectable before mild cognitive impairment in the familial Alzheimer’s disease mutation carriers, Neurobiology of Aging, vol

M. Tondelli et al., Structural MRI changes detectable before mild cognitive impairment in the familial Alzheimer’s disease mutation carriers, Neurobiology of Aging, vol. 33, no. 10, pp. 2556–2566, 2012. 21 Latent Space Projections and Atlases: A Cautionary Tale in Deep Neuroimaging using AutoencodersA PREPRINT

work page 2012
[46]

Antonelli et al., Caudate nucleus volume and cognitive dysfunction in Alzheimer’s disease, Neurobiology of Aging, vol

A. Antonelli et al., Caudate nucleus volume and cognitive dysfunction in Alzheimer’s disease, Neurobiology of Aging, vol. 36, no. 10, pp. 2860–2866, 2015

work page 2015
[47]

Hong et al., Putamen atrophy correlates with cognitive decline in Alzheimer’s disease, Journal of Alzheimer’s Disease, vol

S. Hong et al., Putamen atrophy correlates with cognitive decline in Alzheimer’s disease, Journal of Alzheimer’s Disease, vol. 64, no. 4, pp. 1193–1201, 2018

work page 2018
[48]

H. I. L. Jacobs et al., Cerebellar contribution to cognition in Alzheimer’s disease and other dementias, Neuro- science & Biobehavioral Reviews, vol. 90, pp. 234–245, 2018

[49]

M. Schafer et al., Cerebellar changes in Alzheimer’s disease and dementia with Lewy bodies, Neurobiology of Aging, vol. 35, no. 6, pp. 1509–1519, 2014

[50]

S. L. Risacher and A. J. Saykin, Longitudinal MRI atrophy patterns in mild cognitive impairment and Alzheimer’s disease, Neurobiology of Aging, vol. 34, no. 12, pp. 2449–2464, 2013

[51]

K. Tsuchiya et al., Fusiform gyrus volume reduction in Alzheimer’s disease: MRI study, Neuroscience Letters, vol. 402, no. 1-2, pp. 105–110, 2006.

Figure 10: Correlation analysis including all comparisons and anatomical AAL regions is shown for the...
