Concept Graph Convolutions: Message Passing in the Concept Space
Pith reviewed 2026-05-10 00:12 UTC · model grok-4.3
The pith
Graph neural networks can perform message passing directly on node concepts to reveal how reasoning evolves across layers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that message passing can be redefined to act on a mixture of raw node features and extracted concepts using structural and attention-based edge weights, so that the evolution of concepts across convolutional steps becomes directly observable while task performance stays competitive with standard graph convolutions.
What carries the argument
The Concept Graph Convolution layer, which integrates node-level concepts with raw features for message passing via structural and attention edge weights to enable process-level interpretability.
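To make the mechanism concrete, here is a minimal sketch of what such a hybrid layer could look like, assuming dense adjacency, sigmoid concept activations, and a fixed mixing weight between structural and attention coefficients; every name below (ConceptGraphConvSketch, concept_probe, lam) is illustrative, not the authors' implementation.

```python
# A minimal, hypothetical sketch of a hybrid concept/raw message-passing layer.
# Nothing here is taken from the paper's code; the mixing scheme, names, and
# the dense-adjacency formulation are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptGraphConvSketch(nn.Module):
    def __init__(self, in_dim: int, n_concepts: int, out_dim: int, lam: float = 0.5):
        super().__init__()
        self.concept_probe = nn.Linear(in_dim, n_concepts)  # extracts concept scores from features
        self.lin = nn.Linear(in_dim + n_concepts, out_dim)  # transforms [raw || concept] messages
        self.att = nn.Linear(2 * (in_dim + n_concepts), 1)  # GAT-style pairwise attention
        self.lam = lam                                      # structural vs. attention mixing weight

    def forward(self, x: torch.Tensor, adj: torch.Tensor):
        # x: [N, in_dim] node features; adj: [N, N] dense {0,1} adjacency with self-loops.
        c = torch.sigmoid(self.concept_probe(x))            # [N, n_concepts] concept activations
        h = torch.cat([x, c], dim=-1)                       # hybrid raw + concept representation

        # Structural weights: symmetric degree normalization, as in a standard GCN.
        deg = adj.sum(-1).clamp(min=1)
        s = adj / torch.sqrt(deg[:, None] * deg[None, :])

        # Attention weights: scores over existing edges only, softmax-normalized per node.
        pair = torch.cat([h[:, None, :].expand(-1, h.size(0), -1),
                          h[None, :, :].expand(h.size(0), -1, -1)], dim=-1)
        e = F.leaky_relu(self.att(pair).squeeze(-1)).masked_fill(adj == 0, float("-inf"))
        a = torch.softmax(e, dim=-1)

        w = self.lam * s + (1 - self.lam) * a               # mixed edge weights
        out = self.lin(w @ h)                               # message passing on hybrid features
        return out, c                                       # return concepts for inspection
```

A pure-concept variant of this sketch would feed c alone into self.lin rather than the concatenation; returning c from every layer is what makes the per-step concept trajectory inspectable.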
If this is right
- Task accuracy remains competitive with conventional graph convolutions.
- Users gain direct visibility into how concepts change and interact across each convolutional step.
- The pure concept variant allows the entire convolution to run inside concept space alone.
- Explanations now cover the message-passing process rather than only final latent states.
Where Pith is reading between the lines
- The concept trajectories could serve as a diagnostic tool to locate where a model's reasoning diverges from expected patterns on specific nodes.
- Similar concept-space message passing might be tested on non-graph architectures that rely on iterative feature mixing.
- If concept evolution proves stable, the layer could support training objectives that explicitly reward coherent concept flow.
Load-bearing premise
Concepts extracted from the network's own latent representations can be meaningfully recombined with raw input features to represent the actual reasoning steps inside the message-passing operation.
What would settle it
Train the layer on a standard benchmark such as Cora or MUTAG and check whether accuracy matches a baseline GCN while the extracted concept trajectories visibly align with the model's final predictions; a clear drop in accuracy or mismatch in concept paths would falsify the claim.
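That falsification test could be run with off-the-shelf tooling; a rough sketch follows, assuming PyTorch Geometric, with the hypothetical ConceptGraphConvSketch layer from the earlier snippet standing in for the paper's layer in the second run.

```python
# A rough sketch of the falsification test: train on Cora and compare a plain
# GCN baseline against the concept variant. Assumes PyTorch Geometric; nothing
# here reproduces the authors' exact setup.
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv
from torch_geometric.utils import to_dense_adj

dataset = Planetoid(root="data", name="Cora")
data = dataset[0]
adj = to_dense_adj(data.edge_index, max_num_nodes=data.num_nodes)[0]
adj.fill_diagonal_(1.0)  # self-loops for the dense sketch layer

class BaselineGCN(torch.nn.Module):
    def __init__(self, d_in, d_hid, d_out):
        super().__init__()
        self.c1 = GCNConv(d_in, d_hid)
        self.c2 = GCNConv(d_hid, d_out)

    def forward(self, x, edge_index):
        return self.c2(F.relu(self.c1(x, edge_index)), edge_index)

def train(model, forward):
    # Shared harness so baseline and concept variant differ only in layer type.
    opt = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
    for _ in range(200):
        model.train()
        opt.zero_grad()
        loss = F.cross_entropy(forward()[data.train_mask], data.y[data.train_mask])
        loss.backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        pred = forward().argmax(-1)
    return (pred[data.test_mask] == data.y[data.test_mask]).float().mean().item()

baseline = BaselineGCN(dataset.num_features, 16, dataset.num_classes)
acc_base = train(baseline, lambda: baseline(data.x, data.edge_index))
print(f"baseline GCN test accuracy: {acc_base:.3f}")
```

The falsification criterion would then be whether the concept variant's accuracy stays within noise of acc_base while its per-layer concept activations visibly track the final predictions.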
Original abstract
The trust in the predictions of Graph Neural Networks is limited by their opaque reasoning process. Prior methods have tried to explain graph networks via concept-based explanations extracted from the latent representations obtained after message passing. However, these explanations fall short of explaining the message passing process itself. To this aim, we propose the Concept Graph Convolution, the first graph convolution designed to operate on node-level concepts for improved interpretability. The proposed convolutional layer performs message passing on a combination of raw and concept representations using structural and attention-based edge weights. We also propose a pure variant of the convolution, only operating in the concept space. Our results show that the Concept Graph Convolution allows to obtain competitive task accuracy, while enabling an increased insight into the evolution of concepts across convolutional steps.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Concept Graph Convolutions (CGC), a new graph convolution layer that performs message passing by combining raw node features with node-level concepts extracted from latent representations. The layer employs both structural edge weights and attention-based weights; a pure-concept variant operating solely in concept space is also introduced. The central claims are that CGC achieves competitive task accuracy on graph benchmarks while providing direct insight into the evolution of concepts across successive convolutional steps, addressing the limitation that prior concept-based explanations only interpret post-message-passing latents.
Significance. If the empirical claims hold, the work offers a meaningful step toward intrinsically interpretable GNNs by embedding concept-based reasoning inside the message-passing operation itself rather than relying on post-hoc analysis. The dual hybrid/pure-concept design is a constructive feature that permits controlled examination of the necessity of raw features. The approach is grounded in standard GNN machinery and does not introduce obvious circularity or unstated assumptions that would invalidate the construction on its face.
major comments (2)
- [§3.2] (Concept extraction procedure): the stability of the node-level concepts extracted from latent representations is load-bearing for the interpretability claim; without quantitative evidence (e.g., concept consistency across random seeds or layers) it remains unclear whether observed evolution reflects genuine dynamics or extraction artifacts.
- [§4.2] (Experimental results): the reported competitive accuracy must be supported by direct comparison against the same backbone GNN without the concept layer (i.e., an ablation that isolates the effect of the proposed convolution); the current presentation leaves open whether gains are attributable to the concept mechanism or to other architectural choices.
minor comments (3)
- [§3.1] Notation for the attention-based edge weights and the structural weights should be unified and introduced before the first equation that uses them; one hedged guess at a unified form follows this list.
- [Figure 3] Figure 3 (concept evolution visualization) would benefit from an explicit legend indicating which colors correspond to which extracted concepts and from reporting the number of concepts used.
- [Abstract / Introduction] The abstract states 'competitive task accuracy' without naming the datasets or baselines; this should be expanded in the introduction for immediate context.
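On the notation point, here is one hedged reconstruction of how the hybrid update might be written uniformly; the symbols below are the reviewer's, not the paper's.

```latex
% A hypothetical unified notation for the hybrid update; s_{ij}, \alpha_{ij},
% \lambda, and \mathbf{c}_j are assumed symbols, not taken from the manuscript.
\[
  \mathbf{h}_i^{(\ell+1)}
  = \sigma\!\Big( \sum_{j \in \mathcal{N}(i)\cup\{i\}}
      \big( \lambda\, s_{ij} + (1-\lambda)\, \alpha_{ij}^{(\ell)} \big)\,
      W^{(\ell)} \big[\, \mathbf{x}_j \,\Vert\, \mathbf{c}_j^{(\ell)} \,\big] \Big),
  \qquad
  s_{ij} = \tfrac{1}{\sqrt{d_i d_j}},
\]
```

where $\mathbf{x}_j$ are raw features, $\mathbf{c}_j^{(\ell)}$ the extracted concepts at layer $\ell$, $s_{ij}$ the structural (degree-normalized) weight, $\alpha_{ij}^{(\ell)}$ the attention weight, and the pure variant drops $\mathbf{x}_j$ from the concatenation.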
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which will help improve the clarity and rigor of our work. We address each major comment below.
Point-by-point responses
- Referee: [§3.2] (Concept extraction procedure): the stability of the node-level concepts extracted from latent representations is load-bearing for the interpretability claim; without quantitative evidence (e.g., concept consistency across random seeds or layers) it remains unclear whether observed evolution reflects genuine dynamics or extraction artifacts.
  Authors: We concur that stability analysis is crucial to substantiate the interpretability benefits. Although the current manuscript focuses on demonstrating the evolution of concepts through visualizations, we will augment the revised version with quantitative metrics for concept stability. Specifically, we will report concept consistency scores across different random seeds and layers using measures such as cosine similarity between concept vectors or adjusted Rand index for clustering stability. This will provide evidence that the observed dynamics are robust rather than extraction artifacts. (Revision: yes)
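The metrics named here are standard; below is a minimal sketch of such a stability check, assuming concept activation matrices of shape [n_nodes, n_concepts] collected from two runs with different seeds. The Hungarian-matching step is an illustrative choice, not the authors' stated procedure.

```python
# Hypothetical concept-stability check across seeds: cosine similarity of
# optimally matched concept vectors, plus adjusted Rand index of the hard
# node-to-concept assignments (ARI is permutation-invariant, so it needs
# no matching).
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.pairwise import cosine_similarity

def concept_stability(c_a: np.ndarray, c_b: np.ndarray):
    """c_a, c_b: [n_nodes, n_concepts] activations from two training runs."""
    # Concept indices are arbitrary across runs, so match columns first.
    sim = cosine_similarity(c_a.T, c_b.T)          # [n_concepts, n_concepts]
    rows, cols = linear_sum_assignment(-sim)       # maximize total similarity
    matched_cosine = float(sim[rows, cols].mean())
    ari = adjusted_rand_score(c_a.argmax(1), c_b.argmax(1))
    return matched_cosine, ari
```

High matched cosine similarity and ARI across seed pairs would support genuine concept dynamics over extraction artifacts.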
- Referee: [§4.2] (Experimental results): the reported competitive accuracy must be supported by direct comparison against the same backbone GNN without the concept layer (i.e., an ablation that isolates the effect of the proposed convolution); the current presentation leaves open whether gains are attributable to the concept mechanism or to other architectural choices.
  Authors: We appreciate this point on isolating the contribution of the concept mechanism. The experiments in the manuscript compare CGC against standard GNNs and other concept-based methods, but to directly address this, we will include an ablation study in the revision. This will involve running the exact same backbone network (e.g., the GCN or GAT architecture used) but replacing the CGC layers with standard convolution layers, allowing a direct comparison of task performance with and without the concept-based message passing. (Revision: yes)
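Sketching the promised ablation under the same assumptions as the earlier snippets (it reuses train, dataset, data, adj, acc_base, and ConceptGraphConvSketch from above; the network name is hypothetical):

```python
# Ablation sketch: identical depth and width, only the layer type varies.
# Continues the earlier Cora sketch; not the authors' implementation.
import torch
import torch.nn.functional as F

class CGCNet(torch.nn.Module):
    def __init__(self, d_in, d_hid, d_out, n_concepts=8):
        super().__init__()
        self.l1 = ConceptGraphConvSketch(d_in, n_concepts, d_hid)
        self.l2 = ConceptGraphConvSketch(d_hid, n_concepts, d_out)

    def forward(self, x, adj):
        h, c1 = self.l1(x, adj)
        out, c2 = self.l2(F.relu(h), adj)
        return out, (c1, c2)   # per-layer concepts, kept for trajectory plots

cgc = CGCNet(dataset.num_features, 16, dataset.num_classes)
acc_cgc = train(cgc, lambda: cgc(data.x, adj)[0])
print(f"ablation: baseline {acc_base:.3f} vs concept sketch {acc_cgc:.3f}")
```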
Circularity Check
No significant circularity in derivation chain
full rationale
The manuscript introduces Concept Graph Convolution as an architectural design choice that mixes raw node features with extracted concepts via structural and attention weights during message passing. This construction is defined directly by the proposed layer rather than derived from fitted parameters, self-referential equations, or prior self-citations that would force the outcome. Competitive task accuracy and insight into concept evolution are presented as empirical results of the new layer, not as quantities that reduce to the inputs by construction. No load-bearing step matches the enumerated circularity patterns; the approach remains self-contained as a novel GNN variant.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: node-level concepts can be reliably extracted from latent representations after message passing.
invented entities (1)
- Concept Graph Convolution layer (no independent evidence)