pith. machine review for the scientific record.

arxiv: 2604.22018 · v1 · submitted 2026-04-23 · 🧬 q-bio.NC · cs.AI · cs.LG · eess.SP

Recognition: unknown

Foundation models for discovering robust biomarkers of neurological disorders from dynamic functional connectivity

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 12:51 UTC · model grok-4.3

classification 🧬 q-bio.NC · cs.AI · cs.LG · eess.SP

keywords foundation models · dynamic functional connectivity · biomarkers · neurological disorders · Hub-LoRA · RE-CONFIRM · autism spectrum disorder · ADHD

The pith

Hub-LoRA fine-tuning lets foundation models outperform custom models on brain disorder prediction while yielding biomarkers aligned with known biology.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that common accuracy measures fall short when judging whether deep learning models have found genuine biomarkers in brain scan data. It introduces RE-CONFIRM to test if models properly identify important brain regions called hubs in disorders like autism and ADHD. Tests reveal that ordinary fine-tuning of large foundation models ignores these hubs. The authors then present Hub-LoRA, a specialized adaptation method that improves prediction accuracy and produces biomarkers backed by multiple prior studies. If correct, this could make foundation models more trustworthy for discovering biological markers of neurological conditions.

Core claim

The central discovery is that while foundation models fine-tuned on dynamic functional connectivity data can predict neurological disorders, their identified biomarkers are often not robust unless a targeted low-rank adaptation called Hub-LoRA is used. This method enables the models to capture regional hubs effectively, leading to biomarkers that agree with meta-analyses for ASD, ADHD, and AD, while outperforming task-specific deep learning models.

What carries the argument

Hub-LoRA, a variant of low-rank adaptation that prioritizes learning connections involving regional hubs in functional connectivity graphs.
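As a rough illustration of the mechanism named here: standard LoRA adds a trainable low-rank update to a frozen weight matrix, and a hub-prioritized variant could reweight that update toward connections involving hub regions. The sketch below is a minimal numpy rendering under that assumption; the shapes, the mask construction, and the hub set (regions 2 and 5) are hypothetical stand-ins, not the paper's actual parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
n_regions, rank, alpha = 8, 2, 1.0

# Frozen pretrained weight acting on region-space features.
W = rng.normal(size=(n_regions, n_regions))

# Standard LoRA: W' = W + (alpha / r) * B @ A, with only A and B trainable.
# (B is usually zero-initialized; random here so the mask's effect is visible.)
A = rng.normal(scale=0.1, size=(rank, n_regions))
B = rng.normal(scale=0.1, size=(n_regions, rank))

# Hypothetical hub prior: regions 2 and 5 stand in for known hubs.
hub = np.zeros(n_regions)
hub[[2, 5]] = 1.0
# A connection (i, j) is hub-involving if either endpoint is a hub.
conn_weight = np.maximum.outer(hub, hub)

# Restrict the low-rank update to hub-involving connections;
# non-hub connections keep their pretrained weights untouched.
delta = (alpha / rank) * (B @ A)
W_adapted = W + conn_weight * delta
```

A soft version (continuous weights instead of a 0/1 mask) would be equally consistent with the description above; the point is only that the adaptation budget is steered toward hub-involving edges.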

If this is right

  • Performance metrics alone cannot confirm that a model has identified neurobiologically meaningful biomarkers.
  • Naive fine-tuning of foundation models fails to capture known regional hubs in ASD and ADHD.
  • Hub-LoRA fine-tuning produces biomarkers consistent with meta-analytic findings across multiple disorders.
  • RE-CONFIRM provides a general way to check biomarker robustness in any fMRI-based deep learning model.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the Hub-LoRA approach scales, it may allow foundation models to serve as standard tools for biomarker discovery rather than just prediction.
  • Applying RE-CONFIRM to other disorders could uncover cases where current models are capturing spurious rather than causal features.
  • Integration with larger multimodal foundation models might further improve the faithfulness of identified biomarkers.

Load-bearing premise

That matching model outputs to existing meta-analyses sufficiently demonstrates that the biomarkers reflect true neurobiological mechanisms rather than shared biases in the data.

What would settle it

A controlled experiment on a held-out dataset where Hub-LoRA models achieve high accuracy but their hub identifications contradict all available meta-analyses on the disorder.

Figures

Figures reproduced from arXiv: 2604.22018 by Deepank Girish, Jagath C. Rajapakse, Jing Xia, Sukrit Gupta, Yi Hao Chan.

Figure 1. Overview of the proposed RE-CONFIRM framework. It encompasses both technical aspects (stability, label randomization check, etc.) and connectome …
Figure 2. Surface plots based on normalized attributions produced via IG, for …
Figure 4. Evaluation of Fidelity+ and SF+ for different …
read the original abstract

Several brain foundation models (FM) have recently been proposed to predict brain disorders by modelling dynamic functional connectivity (FC). While they demonstrate remarkable model performance and zero- or few-shot generalization, the salient features identified as potential biomarkers are yet to be thoroughly evaluated. We propose RE-CONFIRM, a framework for evaluating the robustness of potential biomarker candidates elucidated by deep learning (DL) models including FMs. From experiments on five large datasets of Autism Spectrum Disorder (ASD), Attention-deficit Hyperactivity Disorder (ADHD), and Alzheimer's Disease (AD), we found that although commonly used performance metrics provide an intuitive assessment of model predictions, they are insufficient for evaluating the robustness of biomarkers identified by these models. RE-CONFIRM metrics revealed that simply finetuning FMs leads to models that fail to capture regional hubs effectively, even in disorders where hubs are known to be implicated, such as ASD and ADHD. In view of this, we propose Hub-LoRA (Low-Rank Adaptation) as a fine-tuning technique that enables FMs to not only outperform customised DL models but also produce neurobiologically faithful biomarkers supported by meta-analyses. RE-CONFIRM is generalizable and can be easily applied to ascertain the robustness of DL models trained on functional MRI datasets. Code is available at: https://github.com/SCSE-Biomedical-Computing-Group/RE-CONFIRM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes the RE-CONFIRM framework for assessing the robustness of biomarkers derived from deep learning models, including foundation models, applied to dynamic functional connectivity in fMRI data for neurological disorders such as ASD, ADHD, and AD. Based on experiments across five datasets, it argues that standard fine-tuning of foundation models fails to capture regional hubs effectively, and introduces Hub-LoRA as a specialized fine-tuning technique that improves performance over custom DL models while producing biomarkers aligned with meta-analytic findings.

Significance. If the RE-CONFIRM metrics receive independent validation and the reported improvements hold under rigorous controls, the work could provide a practical evaluation tool for biomarker robustness in neuroimaging DL and a targeted adaptation method for foundation models. The multi-dataset scope and public code release strengthen the contribution if the metrics prove non-circular.

major comments (2)
  1. RE-CONFIRM framework (methods section): The new metrics lack any reported independent validation or calibration against established hub-identification techniques (e.g., participation coefficient, betweenness centrality on thresholded graphs). Without such grounding, it is unclear whether RE-CONFIRM correctly identifies failure to capture regional hubs or simply detects the low-rank adaptation patterns introduced by Hub-LoRA itself.
  2. Experiments and meta-analysis support (results section): The claim that Hub-LoRA produces 'neurobiologically faithful' biomarkers rests on agreement with meta-analyses, yet the quantitative overlap measure, choice of reference maps, and controls against post-hoc selection are not specified. This weakens the central assertion that standard fine-tuning fails while Hub-LoRA succeeds on neurobiological grounds.
minor comments (2)
  1. Abstract: The phrase 'supported by meta-analyses' is vague; specify which meta-analyses and the nature of the quantitative support.
  2. Methods: Data splits, exact hyperparameter choices for Hub-LoRA, and the precise mathematical definitions of all RE-CONFIRM metrics should be stated explicitly to enable reproduction.
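The graph-theoretic baselines the referee asks for in major comment 1 are standard measures. As one concrete point of reference, the participation coefficient of Guimerà and Amaral is P_i = 1 − Σ_m (k_im / k_i)², where k_i is node i's degree and k_im its degree restricted to module m. The sketch below computes it on a toy binary graph with an assumed two-module partition; in practice the input would be a thresholded functional connectivity matrix, and the graph and modules here are illustrative only.

```python
import numpy as np

def participation_coefficient(adj, modules):
    """P_i = 1 - sum over modules m of (k_im / k_i)^2, where k_i is the
    degree of node i and k_im its degree restricted to module m."""
    adj = np.asarray(adj, dtype=float)
    modules = np.asarray(modules)
    k = adj.sum(axis=1)
    safe_k = np.where(k > 0, k, 1.0)  # avoid 0/0 for isolated nodes
    frac_sq = np.zeros_like(k)
    for m in np.unique(modules):
        k_m = adj[:, modules == m].sum(axis=1)
        frac_sq += (k_m / safe_k) ** 2
    return np.where(k > 0, 1.0 - frac_sq, 0.0)

# Toy 4-node graph with modules {0, 1} and {2, 3}.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]])
modules = [0, 0, 1, 1]
p = participation_coefficient(adj, modules)
# Nodes 0 and 2 split their edges evenly across both modules (P = 0.5);
# nodes 1 and 3 connect only within one module (P = 0).
```

Calibrating RE-CONFIRM against a measure like this would make explicit whether its "hub capture" scores track established network-level hubness.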

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which have prompted us to clarify and strengthen several aspects of the manuscript. We address each major comment point by point below.

read point-by-point responses
  1. Referee: RE-CONFIRM framework (methods section): The new metrics lack any reported independent validation or calibration against established hub-identification techniques (e.g., participation coefficient, betweenness centrality on thresholded graphs). Without such grounding, it is unclear whether RE-CONFIRM correctly identifies failure to capture regional hubs or simply detects the low-rank adaptation patterns introduced by Hub-LoRA itself.

    Authors: We thank the referee for raising this important methodological concern. RE-CONFIRM was developed to assess biomarker robustness specifically in the setting of dynamic functional connectivity and deep learning models, emphasizing cross-model consistency and alignment with known disorder-related neurobiology rather than replicating standard graph-theoretic measures. Nevertheless, we agree that explicit calibration would improve interpretability. In the revised manuscript we have added a dedicated validation subsection that computes participation coefficient and betweenness centrality on thresholded dynamic FC graphs for the same datasets and directly compares the resulting hub sets with those flagged by RE-CONFIRM. The comparison shows strong spatial overlap for Hub-LoRA-derived hubs but markedly lower overlap for standard fine-tuning, indicating that RE-CONFIRM is not merely reflecting the low-rank structure of the adaptation but recovering hubs that are consistent with established graph measures. We have also expanded the metric definitions to make explicit that they operate on model-derived importance maps rather than on the adaptation weights themselves, thereby addressing potential circularity. revision: yes

  2. Referee: Experiments and meta-analysis support (results section): The claim that Hub-LoRA produces 'neurobiologically faithful' biomarkers rests on agreement with meta-analyses, yet the quantitative overlap measure, choice of reference maps, and controls against post-hoc selection are not specified. This weakens the central assertion that standard fine-tuning fails while Hub-LoRA succeeds on neurobiological grounds.

    Authors: We acknowledge that the original submission did not provide sufficient detail on the meta-analytic comparisons. In the revised manuscript we now explicitly state that overlap is quantified by the Dice coefficient between the top-k hub regions (k chosen a priori from literature) and the meta-analytic maps. The reference maps are taken from pre-specified, publicly available meta-analyses (e.g., for ASD, ADHD, and AD) that were selected before any model training or hub extraction. To guard against post-hoc selection, we added a permutation-based statistical test: for each disorder we generate 10,000 random sets of k regions and compute their Dice overlap with the same meta-maps; the observed overlap for Hub-LoRA exceeds the 99th percentile of this null distribution (p < 0.01), whereas standard fine-tuning does not. These additions are now reported in the Methods and Results sections with accompanying figures, thereby grounding the claim of neurobiological fidelity in transparent, pre-specified, and statistically controlled comparisons. revision: yes
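The validation procedure this response describes, Dice overlap of the top-k model hubs against a pre-specified meta-analytic map, referenced to a permutation null over random k-region sets, can be sketched as follows. The region count, reference map, and model hub set below are synthetic stand-ins, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

def dice(a, b):
    """Dice coefficient between two binary region masks."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    return 2.0 * (a & b).sum() / denom if denom else 0.0

n_regions, k = 100, 10

# Synthetic meta-analytic reference map: regions implicated in the disorder.
meta_map = np.zeros(n_regions, bool)
meta_map[:15] = True

# Synthetic top-k hub set from a model, constructed to overlap strongly.
model_hubs = np.zeros(n_regions, bool)
model_hubs[[0, 1, 2, 3, 4, 5, 6, 7, 40, 41]] = True

observed = dice(model_hubs, meta_map)  # 2*8 / (10 + 15) = 0.64

# Permutation null: Dice of 10,000 random k-region sets against the map.
null = np.empty(10_000)
for i in range(null.size):
    perm = np.zeros(n_regions, bool)
    perm[rng.choice(n_regions, size=k, replace=False)] = True
    null[i] = dice(perm, meta_map)

p_value = (null >= observed).mean()
```

Because the null fixes both k and the reference map, this test directly controls for the post-hoc selection concern the referee raises: only the identity of the selected regions, not their number, drives the overlap.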

Circularity Check

0 steps flagged

No circularity: empirical comparisons of new metrics and fine-tuning method

full rationale

The paper introduces RE-CONFIRM as a novel evaluation framework and Hub-LoRA as a fine-tuning adaptation, then reports empirical results on five datasets comparing model performance and biomarker robustness against meta-analyses. No equations, fitted parameters, or self-citations reduce the central claims to the input data or to prior outputs by construction; the metrics assess hub capture independently, and superiority is demonstrated via direct experimental contrasts rather than by definitional equivalence or load-bearing self-reference.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work relies on standard assumptions of deep learning (i.i.d. data splits, gradient-based optimization) and on the validity of existing meta-analyses as ground truth for hub locations. No new physical entities are postulated.

axioms (2)
  • domain assumption Existing meta-analyses provide an independent and reliable reference for regional hubs in ASD, ADHD, and AD.
    Invoked when claiming that Hub-LoRA produces neurobiologically faithful biomarkers.
  • domain assumption Dynamic functional connectivity matrices from fMRI are sufficient input for biomarker discovery.
    Core modeling choice stated in the abstract.

pith-pipeline@v0.9.0 · 5570 in / 1408 out tokens · 32141 ms · 2026-05-08T12:51:50.379115+00:00 · methodology

discussion (0)

