Graph Neural Network for Interpreting Task-fMRI Biomarkers

James S. Duncan; Juntang Zhuang; Nicha C. Dvornek; Pamela Ventola; Xiaoxiao Li; Yuan Zhou

arXiv: 1907.01661 · v2 · pith:SVDABMF3new · submitted 2019-07-02 · 💻 cs.LG · cs.CV · eess.IV · stat.ML


Xiaoxiao Li, Nicha C. Dvornek, Yuan Zhou, Juntang Zhuang, Pamela Ventola, James S. Duncan

Pith reviewed 2026-05-25 10:43 UTC · model grok-4.3

classification 💻 cs.LG · cs.CV · eess.IV · stat.ML

keywords graph neural networks · autism spectrum disorder · task-fMRI · biomarker interpretation · brain networks · feature importance · inductive learning · ASD classification


The pith

An inductive graph neural network embeds task-fMRI brain graphs to classify autism spectrum disorder and identifies the brain regions driving the decisions without replacing features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out a two-stage pipeline that first trains an inductive graph neural network on brain graphs built from task-based fMRI scans to separate individuals with autism spectrum disorder from controls. The same trained model then supplies feature importance scores that point to the specific brain regions or sub-graphs used as evidence for each classification. This design removes the need to occlude or swap feature values, which would otherwise shift the data distribution and risk misleading interpretations. The approach produces high classification accuracy, yields interpretations that line up with known links to social behavior, and remains stable when the underlying brain atlas or model parameters change.

Core claim

The central claim is that an inductive GNN trained on task-fMRI graphs can both achieve high accuracy at identifying ASD and, through post-training feature importance scores, reliably surface the brain regions and sub-graphs that serve as evidence for the classifier, thereby avoiding the distribution-shift errors that arise when features must be replaced or occluded.

What carries the argument

The two-stage pipeline: an inductive GNN that embeds graphs built from different task-fMRI properties for classification, followed by feature importance scoring to extract the supporting brain regions or sub-graphs.

If this is right

The GNN reaches high accuracy on ASD identification from the constructed graphs.
GNN-based feature importance produces interpretations at least as useful as those from Random Forest.
The detected biomarkers remain consistent across different brain atlases and parameter choices.
The highlighted regions show associations with social behaviors.
The pipeline can surface previously unknown informative biomarkers.
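The consistency claim in the list above could be quantified with a set-overlap score. The following is a minimal sketch, not the paper's metric: it assumes each run (one atlas or parameter setting) yields a set of selected indices in a shared space, for example voxel indices after mapping atlas regions to a common grid. The function names and the choice of the Jaccard index are illustrative assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b|; defined as 1.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(runs):
    """Average Jaccard index over every unordered pair of runs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical biomarker sets (indices in a shared space) from three runs
# that differ in atlas or parameter choice.
runs = [{1, 2, 3, 4}, {2, 3, 4, 5}, {1, 2, 3, 4}]
consistency = mean_pairwise_jaccard(runs)
print(f"mean pairwise Jaccard: {consistency:.3f}")
```

A value near 1 would indicate that the same regions are flagged regardless of the run; a value near 0 would indicate that the selection depends heavily on the atlas or parameters.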

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same two-stage structure could be applied to graph-structured data from other neurological conditions.
Because the GNN is inductive, the trained model could classify new subjects whose graphs were never seen during training.
Independent validation against non-fMRI measures of the same brain regions would strengthen the biomarker claims.
The method offers a template for any graph classification task where replacing features would distort the input distribution.
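The point about inductive models in the list above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the architecture used in the paper: it uses one round of mean-aggregation message passing, a shared linear map with ReLU, and a mean-pool readout. All names (`mean_aggregate`, `embed_graph`, `predict_proba`) and all numbers are hypothetical. Because the weights are shared across nodes and graphs, the same parameters apply to a graph of any size, including one never seen in training.

```python
import math


def mean_aggregate(features, adjacency):
    """One message-passing round: each node averages itself and its neighbours."""
    n, dim = len(features), len(features[0])
    out = []
    for i in range(n):
        group = [i] + [j for j in range(n) if adjacency[i][j] and j != i]
        out.append([sum(features[j][d] for j in group) / len(group) for d in range(dim)])
    return out


def embed_graph(features, adjacency, weights):
    """Message passing, shared linear map with ReLU, then mean-pool readout."""
    hidden = mean_aggregate(features, adjacency)
    projected = [
        [max(0.0, sum(w * x for w, x in zip(row, h))) for row in weights]
        for h in hidden
    ]
    n = len(projected)
    return [sum(node[k] for node in projected) / n for k in range(len(weights))]


def predict_proba(features, adjacency, weights, readout, bias=0.0):
    """Logistic output computed from the pooled graph embedding."""
    z = embed_graph(features, adjacency, weights)
    score = sum(r * v for r, v in zip(readout, z)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical parameters, shared by every graph regardless of its size.
weights = [[1.0, -1.0], [0.5, 0.5]]
readout = [1.0, -1.0]

# A 3-node path graph and a 5-node ring graph with 2 features per node.
small_x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
small_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
large_x = [[0.2, 0.4], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.3, 0.9]]
large_adj = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0],
]

p_small = predict_proba(small_x, small_adj, weights, readout)
p_large = predict_proba(large_x, large_adj, weights, readout)
```

The mean-pool readout also makes the embedding independent of node ordering, which matters when subjects' graphs are built separately.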

Load-bearing premise

The feature importance scores produced by the trained GNN actually mark true ASD-linked brain regions rather than model artifacts or the particular way the raw fMRI data were turned into graphs.

What would settle it

Generate synthetic task-fMRI graphs that contain a small number of planted regions known to drive the label, train the pipeline, and test whether the recovered importance scores exactly match the planted regions.
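The planted-signal check described above can be sketched on toy data. This is an illustration under strong assumptions, not the paper's pipeline: per-node scalar features replace full graphs, and a simple difference of class means stands in for the GNN-derived importance score. The planted indices, sample sizes, and function names are all hypothetical.

```python
import random


def make_dataset(n_subjects, n_nodes, planted, shift, rng):
    """Per-node Gaussian features; only planted nodes are shifted in class 1."""
    data, labels = [], []
    for subject in range(n_subjects):
        label = subject % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_nodes)]
        if label == 1:
            for node in planted:
                row[node] += shift
        data.append(row)
        labels.append(label)
    return data, labels


def mean_difference_importance(data, labels):
    """Stand-in importance score: |mean(class 1) - mean(class 0)| per node."""
    n_nodes = len(data[0])
    sums = {0: [0.0] * n_nodes, 1: [0.0] * n_nodes}
    counts = {0: 0, 1: 0}
    for row, label in zip(data, labels):
        counts[label] += 1
        for node, value in enumerate(row):
            sums[label][node] += value
    return [
        abs(sums[1][i] / counts[1] - sums[0][i] / counts[0])
        for i in range(n_nodes)
    ]


def top_k(scores, k):
    """Indices of the k highest-scoring nodes, as a set."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


planted = {3, 11, 20}      # hypothetical planted regions
rng = random.Random(0)     # fixed seed for repeatability
data, labels = make_dataset(200, 30, planted, shift=3.0, rng=rng)
scores = mean_difference_importance(data, labels)
recovered = top_k(scores, len(planted))
exact_match = recovered == planted
```

In a real test, the scoring function would be replaced by the importance scores of the trained model, and the shift would be reduced until recovery starts to fail, which gives a sense of the method's sensitivity.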

Figures

Figures reproduced from arXiv: 1907.01661 by James S. Duncan, Juntang Zhuang, Nicha C. Dvornek, Pamela Ventola, Xiaoxiao Li, Yuan Zhou.

**Figure 2.** The architecture of the GNN classifier

**Figure 3.** (a) Top 30 important ROIs (colored in yellow) selected by RF; (b) Top

**Figure 4.** (a) (c) Top scoring sub-graph and corresponding functional decoding key

**Figure 5.** (a) The biomarkers (red) interpreted on A1 with 20 clusters; (b)-(d) The biomarkers interpreted with different R and atlases, overlaid on (a) in different colors; (e) The correlation between overlapped ROIs and functional keywords.

Original abstract

Finding the biomarkers associated with ASD is helpful for understanding the underlying roots of the disorder and can lead to earlier diagnosis and more targeted treatment. A promising approach to identify biomarkers is using Graph Neural Networks (GNNs), which can be used to analyze graph structured data, i.e. brain networks constructed by fMRI. One way to interpret important features is through looking at how the classification probability changes if the features are occluded or replaced. The major limitation of this approach is that replacing values may change the distribution of the data and lead to serious errors. Therefore, we develop a 2-stage pipeline to eliminate the need to replace features for reliable biomarker interpretation. Specifically, we propose an inductive GNN to embed the graphs containing different properties of task-fMRI for identifying ASD and then discover the brain regions/sub-graphs used as evidence for the GNN classifier. We first show GNN can achieve high accuracy in identifying ASD. Next, we calculate the feature importance scores using GNN and compare the interpretation ability with Random Forest. Finally, we run with different atlases and parameters, proving the robustness of the proposed method. The detected biomarkers reveal their association with social behaviors. We also show the potential of discovering new informative biomarkers. Our pipeline can be generalized to other graph feature importance interpretation problems.
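The occlusion approach that the abstract argues against can be made concrete with a toy model. This is a hedged sketch: a hand-set logistic classifier stands in for any trained model, and the data, weights, and function names are hypothetical. It shows both the occlusion score and a simple check of whether the fill value lies outside what the model saw for that feature, which is the distribution-shift concern the abstract raises.

```python
import math


def predict_proba(x, weights, bias):
    """Toy logistic classifier standing in for any trained model."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def occlusion_importance(x, weights, bias, fill_value=0.0):
    """Drop in predicted probability when each feature is replaced by fill_value."""
    base = predict_proba(x, weights, bias)
    drops = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = fill_value
        drops.append(base - predict_proba(occluded, weights, bias))
    return drops


def out_of_range(feature_values, fill_value):
    """True when fill_value lies outside the values observed for that feature."""
    return not (min(feature_values) <= fill_value <= max(feature_values))


# Hypothetical data: feature 0 always lies between 4 and 6, so filling it
# with 0 produces an input unlike anything in the observed samples.
observed = [[4.0, 0.1], [5.0, -0.2], [6.0, 0.3]]
weights, bias = [1.0, 0.5], -5.0

drops = occlusion_importance(observed[1], weights, bias)
shifted = [out_of_range([row[i] for row in observed], 0.0) for i in range(2)]
```

Here the large drop for feature 0 is measured at an input the model was never fitted on, so it is unclear whether the score reflects the feature's relevance or the model's behavior off-distribution.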

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a concrete 2-stage inductive GNN pipeline for ASD classification on task-fMRI graphs plus post-hoc node importance without feature replacement, but the abstract supplies no numbers or controls to check whether the importance scores recover real biomarkers or just model artifacts.


The main takeaway is that this work puts forward an inductive GNN trained on task-fMRI brain graphs to classify ASD, followed by a feature-importance stage that avoids the usual occlusion or replacement step. The authors argue this yields more reliable biomarkers and report that the detected regions are associated with social behaviors. They also run the pipeline on multiple atlases and compare the interpretation output to a Random Forest baseline.

The combination of an inductive GNN with non-replacement scoring is a reasonable incremental move on top of existing GNN and fMRI work, and the robustness checks across atlases are a positive step that many similar papers skip. The abstract frames the approach as addressing a known distribution-shift problem in occlusion methods, which is a fair motivation.

The soft spot is that none of the accuracy claims, robustness results, or interpretation comparisons are quantified in the abstract: no accuracies, no dataset sizes, no error bars, no p-values. Without those, it is impossible to judge whether the importance scores actually track ASD-related regions or simply reflect how the graphs were built or how the GNN was trained. The concern about missing null models or external validation against independent ASD findings is well founded, because the abstract does not describe any such checks.

This paper is aimed at neuroimaging groups already using GNNs on connectivity data for disorder classification. Someone in that niche could pick up the pipeline idea and the atlas-robustness test, but the lack of visible results limits how much a general reader can take away. If the full manuscript contains the missing quantitative tables, proper validation splits, and at least one control that separates faithful attribution from graph-construction artifacts, it is worth sending to referees; otherwise the central interpretation claim stays unverified.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes a 2-stage inductive GNN pipeline to classify ASD from task-fMRI brain graphs (first stage) and then extract biomarkers via post-hoc feature importance scores on nodes/sub-graphs (second stage), avoiding feature occlusion/replacement. It claims the GNN achieves high accuracy, yields superior interpretation compared to Random Forest, is robust across atlases and parameters, detects biomarkers linked to social behaviors, and can discover new informative ones. The approach is positioned as generalizable to other graph interpretation tasks.

Significance. If the feature importance scores are shown to faithfully recover true ASD biomarkers (rather than GNN or graph-construction artifacts), the work would address a practical need for distribution-preserving interpretation in neuroimaging GNNs and could support more reliable biomarker discovery. The emphasis on inductive GNNs and atlas robustness is a constructive direction for fMRI analysis.

major comments (3)

[Abstract] Abstract: the central claim that the pipeline enables 'reliable biomarker interpretation' rests on unshown quantitative evidence; no accuracy values, dataset sizes, cross-validation scheme, error bars, or statistical comparisons to Random Forest are provided, making it impossible to evaluate whether the GNN stage or the importance scores perform as asserted.
[Abstract] Abstract (2-stage pipeline description): the assertion that post-training feature importance scores identify ASD-associated regions/sub-graphs is load-bearing for the main contribution, yet no validation against null models, synthetic graphs with planted signals, or independent ASD literature is described to distinguish faithful recovery from artifacts induced by atlas choice, connectivity definition, or GNN inductive biases.
[Abstract] Abstract (robustness claim): the statement that results hold 'with different atlases and parameters' is presented without any tabulated metrics, variance across runs, or statistical tests, which directly undermines the robustness component of the central claim.
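One of the null-model checks requested in the major comments could take the form of a label-permutation test. This is a minimal sketch, not something described in the paper: an absolute difference of class means stands in for the GNN-derived importance of a single region, and all data and names are hypothetical. The p-value uses the common add-one correction so it is never exactly zero.

```python
import random


def importance(values, labels):
    """Stand-in score for one region: absolute difference of class means."""
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    return abs(sum(group1) / len(group1) - sum(group0) / len(group0))


def permutation_p_value(values, labels, n_permutations, rng):
    """Share of label shuffles whose score is at least the observed score."""
    observed = importance(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # class counts are preserved by shuffling
        if importance(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical region values for 10 controls (label 0) and 10 ASD subjects (label 1).
labels = [0] * 10 + [1] * 10
signal = [0.1 * i for i in range(10)] + [5.0 + 0.1 * i for i in range(10)]
flat = [float(i % 2) for i in range(20)]  # identical class means by construction

rng = random.Random(0)
p_signal = permutation_p_value(signal, labels, 999, rng)
p_flat = permutation_p_value(flat, labels, 999, rng)
```

With a real model, each permutation would require retraining and rescoring, so the number of permutations would be limited by compute; the logic of comparing observed scores to a shuffled-label distribution stays the same.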

minor comments (1)

[Abstract] The abstract would be strengthened by including at least one key quantitative result (e.g., accuracy or AUC) to allow readers to gauge the scale of the reported performance.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract. We will revise the abstract to include the requested quantitative details, validation approaches, and robustness metrics from the full manuscript. Point-by-point responses follow.


Referee: [Abstract] Abstract: the central claim that the pipeline enables 'reliable biomarker interpretation' rests on unshown quantitative evidence; no accuracy values, dataset sizes, cross-validation scheme, error bars, or statistical comparisons to Random Forest are provided, making it impossible to evaluate whether the GNN stage or the importance scores perform as asserted.

Authors: We agree that the abstract should contain these supporting details. The full manuscript reports GNN classification accuracies, the dataset size (ASD and control subjects), the cross-validation procedure, error bars across folds, and statistical comparisons to Random Forest. We will update the abstract to include key numerical results and comparisons.
revision: yes

Referee: [Abstract] Abstract (2-stage pipeline description): the assertion that post-training feature importance scores identify ASD-associated regions/sub-graphs is load-bearing for the main contribution, yet no validation against null models, synthetic graphs with planted signals, or independent ASD literature is described to distinguish faithful recovery from artifacts induced by atlas choice, connectivity definition, or GNN inductive biases.

Authors: The manuscript validates via direct comparison of GNN feature importance scores against Random Forest and by mapping identified regions to established associations with social behaviors in the ASD literature. Cross-atlas experiments further address potential artifacts from atlas or connectivity choices. We will revise the abstract to explicitly reference these empirical validations and literature alignment. Synthetic graphs with planted signals are not included in the study.
revision: partial

Referee: [Abstract] Abstract (robustness claim): the statement that results hold 'with different atlases and parameters' is presented without any tabulated metrics, variance across runs, or statistical tests, which directly undermines the robustness component of the central claim.

Authors: We agree the abstract should summarize the supporting evidence. The full manuscript presents results across multiple atlases and parameter settings with consistency metrics. We will add a concise summary of these tabulated findings and consistency observations to the abstract.
revision: yes

Circularity Check

0 steps flagged

No circularity: empirical training plus post-hoc scoring with independent checks


The paper presents a standard two-stage empirical pipeline: train an inductive GNN on task-fMRI graphs for ASD classification, then apply post-hoc feature importance scoring and compare against Random Forest while testing robustness across atlases. No equations, self-definitional steps, or fitted-input-as-prediction reductions appear in the abstract or described method; importance scores are computed after training rather than being forced by construction. No load-bearing self-citations or uniqueness theorems imported from prior author work are invoked to justify the pipeline. The derivation chain is therefore self-contained against external benchmarks and receives a score of 0.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The pipeline rests on standard neuroimaging assumptions about graph construction from fMRI and on the validity of the chosen importance scoring method; no new physical entities are introduced.

free parameters (2)

GNN architecture and training hyperparameters
Chosen to achieve reported high accuracy; tested across parameters for robustness but still fitted to the data.
Brain atlas choice for node definition
Multiple atlases evaluated to demonstrate stability, implying the choice affects results.

axioms (2)

domain assumption: Task-fMRI time series can be meaningfully converted into graphs whose nodes are atlas regions and whose edges capture functional connectivity.
Invoked when constructing the input graphs for the GNN.
ad hoc to paper: Feature importance scores derived from the trained GNN correspond to biologically meaningful biomarkers rather than model-specific artifacts.
This is the central premise that allows the second stage to produce interpretable biomarkers.
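The first axiom above, turning regional time series into a graph, can be sketched as follows. This is an assumption-laden illustration rather than the paper's construction: Pearson correlation as the connectivity measure, the 0.5 threshold, and the names `pearson` and `connectivity_graph` are all choices made here for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def connectivity_graph(roi_series, threshold):
    """Weighted edges {(i, j): r} for ROI pairs with |r| >= threshold, i < j."""
    edges = {}
    for i in range(len(roi_series)):
        for j in range(i + 1, len(roi_series)):
            r = pearson(roi_series[i], roi_series[j])
            if abs(r) >= threshold:
                edges[(i, j)] = r
    return edges


# Hypothetical mean time series for three atlas regions (one list per ROI).
roi_series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],   # perfectly correlated with ROI 0
    [1.0, -1.0, 1.0, -1.0, 1.0],  # uncorrelated with ROI 0 and ROI 1
]
edges = connectivity_graph(roi_series, threshold=0.5)
```

Both the connectivity measure and the threshold are free choices in this sketch, which is one reason the ledger lists graph construction as an assumption that can shape the downstream importance scores.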



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: We propose an inductive GNN to embed the graphs containing different properties of task-fMRI for identifying ASD and then discover the brain regions/sub-graphs used as evidence for the GNN classifier.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: The detected biomarkers reveal their association with social behaviors.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages · 3 internal anchors

[1] Adebayo, J., et al.: Sanity checks for saliency maps. In: Advances in Neural Information Processing Systems. pp. 9505–9515 (2018)
[2] Cangea, C., et al.: Towards sparse hierarchical graph classifiers. arXiv preprint arXiv:1811.01287 (2018)
[3] Carroll, J.D., Chang, J.J.: Analysis of individual differences in multidimensional scaling via an n-way generalization of Eckart-Young decomposition. Psychometrika 35(3), 283–319 (1970)
[4] Desikan, R.S., et al.: An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31(3), 968–980 (2006)
[5] Destrieux, C., et al.: Automatic parcellation of human cortical gyri and sulci using standard anatomical nomenclature. Neuroimage 53(1), 1–15 (2010)
[6] Fey, M., Lenssen, J.E.: Fast graph representation learning with PyTorch Geometric. CoRR abs/1903.02428 (2019)
[7] Gilmer, J., et al.: Neural message passing for quantum chemistry. In: ICML 2017. pp. 1263–1272. JMLR.org (2017)
[8] Goldani, A.A., et al.: Biomarkers in autism. Frontiers in psychiatry 5 (2014)
[9] Kaiser, M.D., et al.: Neural signatures of autism. PNAS (2010)
[10] Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
[11] Ktena, S.I., et al.: Distance metric learning using graph convolutional networks: Application to functional brain networks. In: MICCAI (2017)
[12] Loe, C.W., Jensen, H.J.: Comparison of communities detection algorithms for multiplex. Physica A: Statistical Mechanics and its Applications 431, 29–45 (2015)
[13] Nishii, R.: Box-Cox transformation. Encyclopedia of Mathematics (2001)
[14] Yang, D., et al.: Brain responses to biological motion predict treatment outcome in young children with autism. Translational psychiatry 6(11), e948 (2016)
[15] Yarkoni, T., et al.: Large-scale automated synthesis of human functional neuroimaging data. Nature methods 8(8), 665 (2011)
