pith. machine review for the scientific record.

arXiv: 2605.13933 · v1 · submitted 2026-05-13 · 💻 cs.LG · cs.AI · stat.ML

Recognition: no theorem link

Unsupervised learning of acquisition variability in structural connectomes via hybrid latent space modeling

Pith reviewed 2026-05-15 04:59 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · stat.ML
keywords unsupervised learning · structural connectomes · dMRI · hybrid latent space · acquisition variability · variational autoencoder · site effects · connectome analysis

The pith

Hybrid latent space modeling with architectural annealing separates acquisition variability from biological signals in structural connectomes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an unsupervised model to represent high-dimensional structural connectomes from dMRI while isolating technical differences due to scanners, sites, and protocols from biological factors such as age and disease. It combines continuous latent variables for gradual changes with discrete ones for categorical effects, using architectural annealing on encoder outputs to let the model adaptively balance the two during training instead of requiring manual adjustments. Evaluation on 7,416 connectomes spanning 13 studies and 25 acquisition combinations shows the approach yields site clusters with an adjusted Rand index of 0.53, outperforming standard VAEs, PCA with k-means, and loss-based hybrid models. If the separation holds, joint modeling of smooth and categorical structure recovers clusters aligned with scanner and protocol differences rather than absorbing them into continuous variance.

Core claim

A hybrid continuous-discrete latent space, with architectural rather than loss-based annealing, provides an unsupervised mechanism for capturing acquisition variability in dMRI-derived structural connectomes by jointly modeling smooth and categorical structure, recovering clusters aligned with scanner and protocol differences.

What carries the argument

Architectural annealing of encoder outputs before decoding in a hybrid continuous-discrete latent space.
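
A minimal sketch of what this annealing could look like, based only on the Figure 1 caption (an annealing factor ramping from 0 to 1 scales the mean and log-variance before sampling). The linear ramp, the `warmup_steps` parameter, and both function names are assumptions made for illustration; the paper's actual schedule and implementation are not given here.

```python
import math
import random

def annealing_factor(step, warmup_steps):
    """Linear ramp from 0 to 1 (hypothetical schedule, assumed for this sketch)."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, max(0.0, step / warmup_steps))

def annealed_continuous_sample(mean, log_var, alpha, rng):
    """Scale the encoder outputs by alpha, then apply the reparameterization trick."""
    z = []
    for m, lv in zip(mean, log_var):
        m_a, lv_a = alpha * m, alpha * lv      # annealing acts on the outputs, not on the loss
        std = math.exp(0.5 * lv_a)
        z.append(m_a + std * rng.gauss(0.0, 1.0))
    return z

rng = random.Random(0)
mean, log_var = [2.0, -1.0], [-4.0, -4.0]      # toy encoder outputs, not real data
for step in (0, 500, 1000):
    alpha = annealing_factor(step, warmup_steps=1000)
    print(step, alpha, annealed_continuous_sample(mean, log_var, alpha, rng))
```

Under this reading, a factor of 0 sets the mean to 0 and the log-variance to 0, so the continuous sample is pure standard-normal noise that carries no information about the input, and the decoder can only use the discrete code. As the factor approaches 1 the continuous pathway is restored in full.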

If this is right

  • The model adaptively balances discrete and continuous components without manual capacity tuning during training.
  • Acquisition effects are recovered as discrete clusters rather than being absorbed into the continuous latent space.
  • Joint modeling of smooth biological variation and categorical technical effects improves unsupervised separation on multi-site data.
  • Stronger site learning (ARI=0.53) than standard VAE, PCA+k-means, or loss-annealed hybrids demonstrates the mechanism on data spanning ages 2-102 and three diagnostic groups.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same annealing approach could be applied to normalize connectome features before downstream biomarker models to reduce scanner-related noise.
  • If the discrete component stays stable on new scanners, the framework might support incremental addition of sites without full retraining.
  • Comparing discrete cluster stability across age ranges could test whether acquisition isolation remains clean when biological heterogeneity increases.

Load-bearing premise

The discrete latent component specifically isolates acquisition variability rather than other unmeasured categorical factors such as age groups or disease status, and the architectural annealing reliably prevents the discrete capacity from collapsing or absorbing continuous variance.

What would settle it

If the discrete latent clusters on the 7,416-connectome dataset show no stronger alignment with the 25 known acquisition-parameter combinations than with disease status groups or random assignment, the claim that architectural annealing isolates acquisition variability would not hold.
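
The check described above can be sketched as an adjusted Rand index computed between the discrete assignments and each candidate labeling. The function below is a plain-Python version of the standard pair-counting ARI formula; the toy labels are invented for illustration and are not data from the paper.

```python
import random
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index computed from pair counts of the contingency table."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)      # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:                  # degenerate case: both partitions trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

# Toy labels, invented for illustration only.
site     = [0, 0, 0, 1, 1, 1, 2, 2, 2]
disease  = [0, 1, 2, 0, 1, 2, 0, 1, 2]
clusters = [5, 5, 5, 7, 7, 7, 9, 9, 9]         # hypothetical discrete assignments

random_assignment = clusters[:]
random.Random(0).shuffle(random_assignment)

print("clusters vs site:   ", adjusted_rand_index(clusters, site))
print("clusters vs disease:", adjusted_rand_index(clusters, disease))
print("random vs site:     ", adjusted_rand_index(random_assignment, site))
```

In this toy case the clusters match site exactly and are balanced across disease groups, so the first score is 1 and the second is negative. On real data the comparison of interest is whether the site score clearly exceeds the disease and random scores.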

Figures

Figures reproduced from arXiv: 2605.13933 by Adam M. Saunders, Aravind R. Krishnan, Bennett A. Landman, Chenyu Gao, Daniel C. Moyer, Derek Archer, Elyssa McMaster, Gaurav Rudravaram, Jongyeon Yoon, Karthik Ramadass, Laura A. Barquero, Laurie B. Cutting, Lianrui Zuo, Lori L. Beason Held, Micah DArchangel, Murat Bilgel, Nancy R. Newlin, Praitayini Kanakaraj, Timothy J. Hohman, Tin Q. Nguyen.

Figure 1: We encode each flattened connectome X into a mean, log-variance, and discrete class probabilities. An annealing factor (ramping from 0 to 1) scales the mean and log-variance during training to suppress the continuous pathway early on and encourage reliance on the discrete space before transitioning to full joint optimization. The continuous latent variable zc is sampled via the VAE reparameterization tric…
Figure 2: Latent space of the Joint-VAE (colored by ground-truth sites and learned discrete classes). As continuous capacity increases, the discrete space collapses, merging distinct site clusters and illustrating the sensitivity of the original Joint-VAE to manual capacity tuning, motivating the need for a principled, automatic mechanism to balance continuous and discrete representations. 3.2. Site learning compar…
Figure 3: A) ARI comparison across methods as the number of discrete classes varies. Joint-VAE with model-based annealing and Joint-VAE with loss-based annealing both outperform PCA+k-means and VAE+k-means as the number of classes approaches the true number of acquisition sites (25), with the model-annealed Joint-VAE showing the most stable and consistently high ARI (p < 0.05; 1000 bootstrap resamples). B) Sensitiv…
Figure 4: Left: Data points colored by true acquisition site and model-learned discrete assignments, showing strong correspondence. Right: the same space colored by acquisition parameters using the LAB scientific color space (L = TE, A = TR, B = #directions). Marker size indicates shell value for single-shell data, and symbol type indicates shell combinations for multi-shell data. The latent structure captures mul…
Original abstract

Acquisition differences across sites, scanners, and protocols in dMRI introduce variability that complicates structural connectome analysis. This motivates deep learning models that can represent high-dimensional connectomes in a low-dimensional space while explicitly separating acquisition-related effects from biological variation. Conventional dimensionality reduction methods model all variance as continuous, so acquisition effects often get absorbed into a continuous latent space. Recent hybrid latent-space models combine discrete and continuous components to address this, but typically require manual capacity tuning to ensure the discrete component captures the intended variability. We introduce an unsupervised framework that removes this manual tuning by architecturally annealing encoder outputs before decoding, allowing the model to adaptively balance discrete and continuous latent variables during training. To evaluate it, we curated a dataset of N=7,416 structural connectomes derived from dMRI, spanning ages 2 to 102 and 13 studies with 25 unique acquisition-parameter combinations. Of these, 5,900 are cognitively unimpaired, 877 have mild cognitive impairment (MCI), and 639 have Alzheimer's disease (AD). We compare against a standard VAE, PCA with k-means clustering, and hybrid models that anneal only through the loss function. Our architectural annealing produces stronger site learning (ARI=0.53, p<0.05) than these baselines. Results show that a hybrid continuous-discrete latent space, with architectural rather than loss-based annealing, provides a useful unsupervised mechanism for capturing acquisition variability in dMRI: by jointly modeling smooth and categorical structure, the Joint-VAE recovers clusters aligned with scanner and protocol differences.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces a hybrid VAE variant with architectural annealing of encoder outputs to adaptively balance discrete and continuous latent variables, claiming this unsupervised approach isolates acquisition variability (scanner/protocol effects) in structural connectomes better than standard VAEs, PCA+k-means, or loss-based annealing hybrids, with an ARI of 0.53 (p<0.05) against site labels on a cohort of 7,416 connectomes spanning 13 studies and 25 acquisition combinations.

Significance. If the discrete component can be shown to specifically capture acquisition effects independently of biological covariates, the method would provide a useful tool for unsupervised harmonization of multi-site dMRI data without requiring manual capacity tuning or supervised labels.

major comments (2)
  1. [Abstract] Abstract and results: the reported ARI=0.53 alignment with site labels does not include any analysis (e.g., conditional mutual information or post-hoc correlation) demonstrating that the discrete assignments remain orthogonal to disease status (AD/MCI/unimpaired) and age (2–102 years), both of which are strong categorical factors in the cohort that could be captured by the same discrete latent.
  2. [Methods] Methods/results: no information is given on validation splits, sensitivity of the ARI to the architectural annealing schedule hyperparameters, or controls for potential confounds between acquisition parameters and clinical labels, which are required to support the claim that the improvement is due to specific isolation of acquisition variability.
minor comments (1)
  1. [Abstract] Abstract: the description of the 25 acquisition-parameter combinations would be clearer if the exact parameters (e.g., b-values, directions) were summarized in a table.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and have revised the manuscript accordingly to strengthen the validation of our claims.

  1. Referee: [Abstract] Abstract and results: the reported ARI=0.53 alignment with site labels does not include any analysis (e.g., conditional mutual information or post-hoc correlation) demonstrating that the discrete assignments remain orthogonal to disease status (AD/MCI/unimpaired) and age (2–102 years), both of which are strong categorical factors in the cohort that could be captured by the same discrete latent.

    Authors: We agree that demonstrating the discrete assignments are largely orthogonal to biological covariates is necessary to support the interpretation that they isolate acquisition variability. The original submission did not include these checks. In the revised manuscript we add conditional mutual information analysis between the discrete cluster assignments and disease status (conditioned on site) as well as within-cluster age correlations. These new results are reported in an expanded Results section and referenced in the abstract. revision: yes

  2. Referee: [Methods] Methods/results: no information is given on validation splits, sensitivity of the ARI to the architectural annealing schedule hyperparameters, or controls for potential confounds between acquisition parameters and clinical labels, which are required to support the claim that the improvement is due to specific isolation of acquisition variability.

    Authors: We acknowledge these omissions in the original Methods and Results. The revised manuscript now specifies the train/validation/test splits (stratified 80/10/10 by study), reports a sensitivity analysis over annealing schedule hyperparameters (ARI remains stable across the tested range), and includes confound controls by recomputing ARI within each clinical subgroup. These additions are placed in the Methods and Results sections to substantiate that performance gains arise from acquisition isolation. revision: yes
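
The conditional mutual information check mentioned in the first response can be sketched with a plug-in estimator over discrete labels. The rebuttal does not say which estimator the authors use, so the function below, its name, and the toy labels are assumptions made for illustration. It computes I(cluster; disease | site) in nats from empirical counts.

```python
from collections import Counter
from math import log

def conditional_mutual_information(clusters, disease, site):
    """Plug-in estimate of I(clusters; disease | site) in nats from label counts."""
    n = len(clusters)
    if n == 0 or not (n == len(disease) == len(site)):
        raise ValueError("need three non-empty label lists of equal length")
    n_scd = Counter(zip(site, clusters, disease))
    n_sc = Counter(zip(site, clusters))
    n_sd = Counter(zip(site, disease))
    n_s = Counter(site)
    cmi = 0.0
    for (s, c, d), count in n_scd.items():
        # p(s,c,d) * log[ p(s) p(s,c,d) / (p(s,c) p(s,d)) ], with the 1/n factors cancelled
        cmi += (count / n) * log(count * n_s[s] / (n_sc[(s, c)] * n_sd[(s, d)]))
    return cmi

# Toy labels, invented for illustration only.
site    = [0, 0, 0, 0, 1, 1, 1, 1]
disease = [0, 1, 0, 1, 0, 1, 0, 1]

clusters_follow_site = list(site)        # clusters carry only site information
clusters_follow_disease = list(disease)  # clusters track disease within every site

print(conditional_mutual_information(clusters_follow_site, disease, site))
print(conditional_mutual_information(clusters_follow_disease, disease, site))
```

A value near zero would be consistent with the discrete assignments adding no disease information beyond what site already explains. The plug-in estimate is biased upward on small samples, so in practice it would need a permutation or bootstrap baseline for comparison.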

Circularity Check

0 steps flagged

No significant circularity; derivation and evaluation are self-contained

The paper introduces a hybrid VAE variant with architectural annealing of encoder outputs to balance discrete and continuous latents. This architecture is defined directly in the methods without reducing to a self-definition or renaming of prior results. The central evaluation computes ARI against external site labels from the curated N=7416 dataset, which is an independent post-training metric and does not reduce by construction to any fitted parameter or loss term in the model's equations. Comparisons to baselines (standard VAE, PCA+k-means, loss-based annealing) are performed on the same external labels without circular reduction. No load-bearing self-citation chains, uniqueness theorems imported from authors, or ansatz smuggling appear in the derivation. The result is a standard empirical modeling paper whose claims rest on external validation rather than internal equivalence.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on standard VAE assumptions plus the domain assumption that acquisition effects are primarily categorical while biological variation is continuous. Beyond typical VAE hyperparameters and the annealing schedule listed below, no new free parameters or invented entities are introduced.

free parameters (1)
  • annealing schedule hyperparameters
    The architectural annealing uses parameters that control the rate and form of encoder output modification; these are not detailed in the abstract.
axioms (1)
  • domain assumption Acquisition variability in dMRI connectomes can be captured by discrete latent variables while biological signals remain in continuous variables.
    Invoked in the motivation and evaluation sections of the abstract.

pith-pipeline@v0.9.0 · 5688 in / 1303 out tokens · 25005 ms · 2026-05-15T04:59:43.303927+00:00 · methodology
