AIMing for Standardised Explainability Evaluation in GNNs: A Framework and Case Study on Graph Kernel Networks
Pith reviewed 2026-05-19 17:22 UTC · model grok-4.3
The pith
The AIM framework evaluates GNN explainability by measuring accuracy together with instance-level and model-level explanations, enabling targeted improvements such as the xGKN model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AIM measures Accuracy, Instance-level explanations, and Model-level explanations to evaluate explainability in inherently interpretable GNNs. Applied to Graph Kernel Networks, the measures expose specific limitations in how those networks generate explanations. The resulting insights support construction of an updated model, xGKN, that preserves high predictive accuracy while showing clearer instance-level and model-level explanations.
What carries the argument
The AIM framework, which evaluates a model by combining its predictive accuracy with separate assessments of the explanations it produces for individual instances and for its global behavior.
If this is right
- Graph Kernel Networks can be examined for concrete explanation weaknesses using the three AIM scores.
- An updated model xGKN can be produced that retains the original accuracy level.
- The new model xGKN registers higher scores on both instance-level and model-level explanation measures.
- The same AIM pipeline can be applied to other inherently interpretable networks such as prototype networks.
Where Pith is reading between the lines
- AIM-style scoring could later be tested on post-hoc explanation methods for standard GNNs to see whether it produces consistent rankings across model types.
- If the three measures prove stable, they could serve as a shared benchmark when teams compare new interpretable graph models in safety-critical settings.
- One could check whether the xGKN changes also improve performance on downstream tasks that reward human-understandable outputs.
Load-bearing premise
That accuracy, instance-level explanation scores, and model-level explanation scores together give a complete enough picture of explainability to let direct changes improve the model without extra domain rules.
What would settle it
Running AIM on the original Graph Kernel Networks and finding that the derived xGKN shows no gain in instance-level or model-level scores while accuracy stays the same, or that the three measures cannot separate models with visibly different explanation qualities.
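The settling test above can be written down as a simple decision rule. The following is a minimal sketch, not the paper's procedure: it assumes each AIM axis is summarised as a single scalar where higher is better, which this summary does not confirm. The names `AIMScores` and `supports_claim`, and the tolerance parameters `acc_tol` and `min_gain`, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIMScores:
    """Hypothetical container for the three AIM axes (assumed: higher is better)."""
    accuracy: float
    instance_level: float
    model_level: float


def supports_claim(base: AIMScores, revised: AIMScores,
                   acc_tol: float = 0.01, min_gain: float = 0.0) -> bool:
    """Return True if accuracy is retained and both explanation scores improve.

    acc_tol and min_gain are illustrative thresholds, not values from the paper.
    """
    accuracy_retained = revised.accuracy >= base.accuracy - acc_tol
    instance_gain = revised.instance_level - base.instance_level > min_gain
    model_gain = revised.model_level - base.model_level > min_gain
    return accuracy_retained and instance_gain and model_gain
```

Under these assumptions, a revised model whose accuracy holds but whose instance-level or model-level score does not rise would return `False`, which corresponds to the outcome that would count against the central claim.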
Original abstract
Graph Neural Networks (GNNs) have advanced significantly in handling graph-structured data, but a comprehensive framework for evaluating explainability remains lacking. Existing evaluation frameworks primarily involve post-hoc explanations, and operate in the setting where multiple methods generate a suite of explanations for a single model. This makes comparison of explanations across models difficult. Evaluation of inherently interpretable models often targets a specific aspect of interpretability relevant to the model, but remains underdeveloped in terms of generating insight across a suite of measures. We introduce AIM, a comprehensive framework that addresses these limitations by measuring Accuracy, Instance-level explanations, and Model-level explanations. AIM is formulated with minimal constraints to enhance flexibility and facilitate broad applicability. Here, we use AIM in a pipeline, extracting explanations from inherently interpretable GNNs such as graph kernel networks (GKNs) and prototype networks (PNs), evaluating these explanations with AIM, identifying their limitations and obtaining insights to their characteristics. Taking GKNs as a case study, we show how the insights obtained from AIM can be used to develop an updated model, xGKN, that maintains high accuracy while demonstrating improved explainability. Our approach aims to advance the field of Explainable AI (XAI) for GNNs, providing more robust and practical solutions for understanding and improving complex models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the AIM framework for standardized explainability evaluation of Graph Neural Networks, particularly inherently interpretable models. AIM assesses Accuracy, Instance-level explanations, and Model-level explanations with minimal constraints for flexibility. The authors apply AIM to Graph Kernel Networks (GKNs) and Prototype Networks (PNs) as case studies, extract explanations, identify limitations, and use the resulting insights to propose an updated xGKN model that preserves high accuracy while improving explainability.
Significance. If the empirical results hold and the three-axis evaluation proves actionable and generalizable, the work could help standardize explainability assessment for GNNs beyond post-hoc methods and provide a concrete pipeline from diagnosis to model revision. The emphasis on inherently interpretable architectures and the xGKN case study are strengths if supported by quantitative validation, baselines, and reproducibility details.
major comments (2)
- [Abstract and §4] Abstract and §4 (case study): the claim that AIM yields an xGKN with 'improved explainability' while 'maintaining high accuracy' is presented without quantitative results, error bars, baseline comparisons against prior GKN variants, or statistical validation, making it impossible to assess whether the data support the central improvement claim.
- [§3] §3 (AIM formulation): the assertion that the three measures together provide a sufficiently complete and flexible evaluation without additional constraints or domain-specific adjustments is not accompanied by a systematic argument or ablation showing that key aspects (e.g., faithfulness under distribution shift, stability across graph sizes, or alignment with human-understandable substructures) are covered; if any fall outside the three axes the pipeline to xGKN becomes under-specified.
minor comments (2)
- [§3.1] Clarify the precise operational definitions and scoring procedures for 'instance-level' and 'model-level' explanations so that readers can reproduce the AIM scores on new GKN or PN architectures.
- [§4] Add a table or figure summarizing the AIM scores for the original GKN versus xGKN (and versus PNs) with explicit metrics and confidence intervals.
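The summary table requested in the last comment could be assembled from per-run scores roughly as follows. This is a sketch under stated assumptions: the helper names `mean_ci` and `format_row` are hypothetical, the interval uses a normal approximation (z = 1.96), and the referee report does not prescribe any particular interval method.

```python
import statistics
from math import sqrt


def mean_ci(values, z=1.96):
    """Return (mean, half-width) of a normal-approximation 95% interval."""
    if len(values) < 2:
        raise ValueError("need at least two runs")
    mean = statistics.fmean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / sqrt(len(values))
    return mean, z * sem


def format_row(model, metric, values):
    """Format one table row: model, metric, mean ± half-width, run count."""
    mean, half = mean_ci(values)
    return f"{model:<6} {metric:<15} {mean:.3f} ± {half:.3f} (n={len(values)})"
```

With only a handful of runs, a Student-t multiplier in place of 1.96 would give a more conservative interval.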
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating planned revisions where appropriate to strengthen the manuscript.
- Referee: [Abstract and §4] Abstract and §4 (case study): the claim that AIM yields an xGKN with 'improved explainability' while 'maintaining high accuracy' is presented without quantitative results, error bars, baseline comparisons against prior GKN variants, or statistical validation, making it impossible to assess whether the data support the central improvement claim.
  Authors: We acknowledge that the presentation of results for xGKN would benefit from greater rigor. The manuscript reports accuracy values and AIM scores comparing xGKN to the original GKN and other baselines, but we agree these could be augmented. In the revision we will include error bars from multiple independent runs, explicit tabular comparisons against prior GKN variants using identical metrics, and statistical significance tests to support the claims of maintained accuracy and improved explainability.
  revision: yes
- Referee: [§3] §3 (AIM formulation): the assertion that the three measures together provide a sufficiently complete and flexible evaluation without additional constraints or domain-specific adjustments is not accompanied by a systematic argument or ablation showing that key aspects (e.g., faithfulness under distribution shift, stability across graph sizes, or alignment with human-understandable substructures) are covered; if any fall outside the three axes the pipeline to xGKN becomes under-specified.
  Authors: AIM is intentionally formulated around the three axes to balance predictive fidelity with local and global interpretability while imposing minimal constraints. We will expand the discussion in §3 to provide a clearer argument that faithfulness is captured through accuracy and instance-level fidelity measures, stability through model-level consistency checks, and alignment with human-interpretable substructures via the kernel and prototype mechanisms. Although the original submission does not contain a dedicated ablation, we will add a concise justification (with supporting references) showing how these aspects fall within the existing axes; a full ablation can be included if the editor deems it necessary.
  revision: partial
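The first response promises statistical significance tests over multiple independent runs but does not name a test. One possible choice, shown here purely as an illustration, is an exact paired sign-flip permutation test on per-run score differences. The function name `paired_sign_flip_pvalue` is hypothetical, and the pairing of runs (for example by shared seed or data split) is an assumption.

```python
from itertools import product


def paired_sign_flip_pvalue(a, b):
    """Exact one-sided p-value for mean(b - a) > 0 under random sign flips.

    a and b are per-run scores for the baseline and the revised model,
    paired by run. All 2**n sign patterns are enumerated, so this is only
    practical for a small number of runs.
    """
    if len(a) != len(b) or not a:
        raise ValueError("need two equal-length, non-empty score lists")
    diffs = [y - x for x, y in zip(a, b)]
    observed = sum(diffs)
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        # Count sign patterns at least as extreme as the observed sum;
        # the small slack guards against floating-point ties.
        if sum(s * d for s, d in zip(signs, diffs)) >= observed - 1e-12:
            count += 1
    return count / total
```

With five paired runs the smallest attainable p-value is 1/32, so more runs would be needed to reach stricter significance levels.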
Circularity Check
AIM framework introduction and xGKN case study show no circular reductions
The paper introduces AIM as a new evaluation framework measuring Accuracy, Instance-level explanations, and Model-level explanations with minimal constraints, then applies it to inherently interpretable models like GKNs and PNs to extract insights and develop an updated xGKN. No equations, derivations, or self-referential steps are present that reduce any claimed prediction, improvement, or uniqueness to fitted inputs or prior self-citations by construction. The central claims rest on the framework's novelty and its use on external case studies, making the overall derivation self-contained without load-bearing circular elements.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Existing evaluation frameworks primarily involve post-hoc explanations and operate in the setting where multiple methods generate explanations for a single model.
invented entities (2)
- AIM framework: no independent evidence
- xGKN model: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "We introduce AIM, a comprehensive framework that addresses these limitations by measuring Accuracy, Instance-level explanations, and Model-level explanations."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.