
arxiv: 2604.20577 · v2 · submitted 2026-04-22 · 💻 cs.SE · cs.LG

Evaluating Assurance Cases as Text-Attributed Graphs for Structure and Provenance Analysis

Pith reviewed 2026-05-09 23:48 UTC · model grok-4.3

classification 💻 cs.SE cs.LG
keywords assurance cases · graph neural networks · link prediction · provenance detection · LLM-generated · safety arguments · text-attributed graphs

The pith

Assurance cases modeled as text-attributed graphs let graph neural networks predict argument links and detect LLM authorship.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper converts assurance cases into graphs in which nodes hold the text of claims and evidence and edges encode the relationships between them. Graph neural networks then learn to predict missing links between these elements and to classify entire cases as human-written or generated by a large language model. The approach relies on the graph structure capturing how arguments connect in hierarchies. Experiments report ROC-AUC 0.760 for link prediction on real cases and F1 0.94 for provenance classification, and note that LLM-generated cases follow different hierarchical linking patterns than human-authored ones. The work also supplies a public dataset to support further study of structure and origin in safety documents.

Core claim

Assurance cases are turned into text-attributed graphs so that graph neural networks can perform link prediction at ROC-AUC 0.760 on real cases while generalizing across domains and semi-supervised regimes. The same models classify human-authored versus LLM-generated cases at F1 0.94 and expose distinct hierarchical linking patterns in the LLM outputs. Existing GNN explanation methods align only moderately with the true argument structure.
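As a reading aid for the ROC-AUC figure, the sketch below computes ROC-AUC directly from its rank definition: the probability that a true edge receives a higher score than a non-edge. The scores are hypothetical toy values, not outputs of the paper's models, and the function name `roc_auc` is an illustrative choice.

```python
def roc_auc(pos_scores, neg_scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical link-prediction scores: existing edges vs. sampled non-edges.
true_edge_scores = [0.9, 0.8, 0.4]
non_edge_scores = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(true_edge_scores, non_edge_scores)
print(f"ROC-AUC on toy scores: {auc:.3f}")
```

Under this definition, a value of 0.5 corresponds to random ranking and 1.0 to every true edge outscoring every non-edge.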

What carries the argument

Text-attributed graphs of assurance cases, with nodes carrying text descriptions of argument elements and edges encoding relationships, fed to graph neural networks for link prediction and provenance classification.
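A minimal sketch of what such a text-attributed graph could look like as a data structure. The node types (Goal, Strategy, Evidence, Context) and the relation names `supportedBy` and `inContextOf` are assumptions made for illustration, as are all class and function names; the schema of the paper's released dataset may differ.

```python
from dataclasses import dataclass, field

# Assumed GSN-style vocabulary; not taken from the paper's dataset schema.
NODE_TYPES = {"Goal", "Strategy", "Evidence", "Context"}
EDGE_TYPES = {"supportedBy", "inContextOf"}


@dataclass
class Node:
    node_id: str
    node_type: str
    text: str  # the text attribute carried by the node


@dataclass
class AssuranceGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add_node(self, node_id, node_type, text):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = Node(node_id, node_type, text)

    def add_edge(self, source_id, relation, target_id):
        if relation not in EDGE_TYPES:
            raise ValueError(f"unknown relation: {relation}")
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source_id, relation, target_id))

    def children(self, node_id, relation="supportedBy"):
        """Targets reachable from node_id over one edge of the given relation."""
        return [t for s, r, t in self.edges if s == node_id and r == relation]


def build_toy_case():
    """Build a small, made-up assurance case for demonstration."""
    g = AssuranceGraph()
    g.add_node("G1", "Goal", "The braking system is acceptably safe.")
    g.add_node("C1", "Context", "Operating environment: urban roads.")
    g.add_node("S1", "Strategy", "Argue over each identified hazard.")
    g.add_node("G2", "Goal", "Hazard H1 is mitigated.")
    g.add_node("E1", "Evidence", "Brake test report.")
    g.add_edge("G1", "inContextOf", "C1")
    g.add_edge("G1", "supportedBy", "S1")
    g.add_edge("S1", "supportedBy", "G2")
    g.add_edge("G2", "supportedBy", "E1")
    return g


case = build_toy_case()
print(case.children("G1"))                 # hierarchical children of the root
print(case.children("G1", "inContextOf"))  # contextual cross-reference
```

Keeping the relation name on each edge lets a downstream model treat hierarchical and contextual links differently, or ignore the distinction.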

If this is right

  • Link prediction models can help complete or audit the logical connections inside assurance cases.
  • Provenance classification can flag LLM-generated cases for extra human review in regulated safety work.
  • The observed difference in hierarchical patterns indicates current LLMs produce more uniform argument structures.
  • Moderate faithfulness of GNN explanations shows a remaining gap between model reasoning and actual argument flow.
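One common way to quantify explanation faithfulness is a fidelity-style score: delete the edges an explainer ranks as most important and measure how much the model's output drops. The sketch below illustrates that idea on a toy scoring function. Whether the paper uses this exact metric is not stated on this page, and `toy_model` is a hypothetical stand-in for a trained GNN.

```python
def fidelity_plus(model, edges, importance, k):
    """Drop in model score after deleting the k highest-ranked edges."""
    ranked = sorted(edges, key=lambda e: importance[e], reverse=True)
    removed = set(ranked[:k])
    kept = [e for e in edges if e not in removed]
    return model(edges) - model(kept)


def toy_model(edges):
    # Hypothetical stand-in for a GNN: the score is the fraction of two
    # "decisive" edges that are still present in the graph.
    decisive = {("G1", "S1"), ("S1", "G2")}
    return len(decisive & set(edges)) / len(decisive)


edges = [("G1", "S1"), ("S1", "G2"), ("G2", "E1"), ("G1", "C1")]

# An explanation that ranks the decisive edges highest...
aligned = {("G1", "S1"): 0.9, ("S1", "G2"): 0.8, ("G2", "E1"): 0.1, ("G1", "C1"): 0.05}
# ...and one that ranks irrelevant edges highest.
misaligned = {("G1", "S1"): 0.1, ("S1", "G2"): 0.2, ("G2", "E1"): 0.9, ("G1", "C1"): 0.8}

print(fidelity_plus(toy_model, edges, aligned, k=2))
print(fidelity_plus(toy_model, edges, misaligned, k=2))
```

A larger drop means the explainer pointed at edges the model actually depends on; a drop near zero means the highlighted edges were not driving the prediction.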

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same graph treatment could be tried on other regulated documents such as legal or medical arguments to spot AI assistance.
  • Prompt designers might use the linking-pattern difference to bring future LLM outputs closer to the structural variety of human-authored cases.
  • If adopted at scale, the method could support standards that require visible human oversight of AI-drafted safety cases.

Load-bearing premise

Turning assurance cases into graphs with text on nodes keeps the essential semantic links and origin signals intact.

What would settle it

A new collection of LLM-generated assurance cases deliberately prompted to copy human hierarchical linking statistics, tested to see whether classification F1 drops well below 0.94.
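To run such a test, one would first need to measure the hierarchical linking statistics being matched. The sketch below computes two simple candidates, maximum depth and mean branching factor, for a tree-shaped argument. The choice of these two statistics, the function name, and both toy edge lists are assumptions for illustration; neither graph is taken from the paper's dataset.

```python
def hierarchy_stats(edges, root):
    """Depth and branching statistics for a tree given as (parent, child) pairs.

    Assumes the hierarchy is a tree rooted at `root`; shared children in a
    DAG would make the reported depth depend on traversal order.
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    max_depth = 0
    seen = {root}
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        max_depth = max(max_depth, depth)
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append((child, depth + 1))

    # Internal nodes are the reachable nodes with at least one child.
    internal = [n for n in seen if children.get(n)]
    mean_branching = (
        sum(len(children[n]) for n in internal) / len(internal) if internal else 0.0
    )
    return {"max_depth": max_depth, "mean_branching": mean_branching, "num_nodes": len(seen)}


# Two made-up argument shapes with the same number of nodes.
deep_case = [("G1", "S1"), ("S1", "G2"), ("S1", "G3"),
             ("G2", "E1"), ("G3", "E2"), ("G3", "E3")]
flat_case = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G1", "G5"),
             ("G2", "E1"), ("G3", "E2")]

print(hierarchy_stats(deep_case, "G1"))
print(hierarchy_stats(flat_case, "G1"))
```

If the classifier's F1 stays high on LLM-generated cases whose statistics of this kind match the human distribution, the provenance signal would have to come from something other than coarse tree shape.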

Figures

Figures reproduced from arXiv: 2604.20577 by Dusica Marijan, Fariz Ikhwantri.

Figure 1: Comparison of human-authored and LLM-generated assurance cases. The LLM-generated case (right) shows a different hierarchical linking pattern and node distribution compared to the human-authored ground truth (left). Different colours represent heterogeneous node types (e.g., Goal, Strategy, Evidence) in the GSN notation. These assurance cases typically comprise a hierarchical argument structure, consisti…
Figure 2: Overview of graph evaluation framework on Assur…
Figure 3: Node Importance Distribution by Type. Importance scores are obtained by GNNExplainer attributions from UniGraph…
Figure 4: GNNExplainer output for UniGraph model showing…
Original abstract

An assurance case is a structured argument document that justifies claims about a system's requirements or properties, which are supported by evidence. In regulated domains, these are crucial for meeting compliance and safety requirements to industry standards. We propose a graph diagnostic framework for analysing the structure and provenance of assurance cases. We focus on two main tasks: (1) link prediction, to learn and identify connections between argument elements, and (2) graph classification, to differentiate between assurance cases created by a state-of-the-art large language model and those created by humans, aiming to detect bias. We compiled a publicly available dataset of assurance cases, represented as graphs with nodes and edges, supporting both link prediction and provenance analysis. Experiments show that graph neural networks (GNNs) achieve strong link prediction performance (ROC-AUC 0.760) on real assurance cases and generalise well across domains and semi-supervised settings. For provenance detection, GNNs effectively distinguish human-authored from LLM-generated cases (F1 0.94). We observed that LLM-generated assurance cases have different hierarchical linking patterns compared to human-authored cases. Furthermore, existing GNN explanation methods show only moderate faithfulness, revealing a gap between predicted reasoning and the true argument structure.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes representing assurance cases as text-attributed graphs to support structure and provenance analysis via graph neural networks. It introduces a public dataset and evaluates two tasks: link prediction on real assurance cases (ROC-AUC 0.76, with generalization across domains and semi-supervised settings) and binary graph classification to distinguish human-authored from LLM-generated cases (F1 0.94). The authors report that LLM-generated cases exhibit different hierarchical linking patterns and that existing GNN explanation methods show only moderate faithfulness to the underlying argument structure.

Significance. If the central empirical claims hold after addressing methodological gaps, the work would be significant for regulated software engineering domains where assurance cases are mandatory. It provides an automated, graph-based approach to detect potential provenance issues and structural differences, along with a publicly available dataset that could enable further research. The reported generalization and the observation of linking pattern differences are strengths, though their reliability hinges on unbiased dataset construction.

major comments (3)
  1. [Dataset compilation] Dataset construction (Section on dataset compilation): The process for creating the LLM-generated assurance cases is described at too high a level. Specifics on the LLM model, prompt templates, temperature, sampling strategy, and any post-processing or filtering steps are absent. This is load-bearing for the provenance classification result (F1 0.94), because the GNN could be learning synthesis artifacts (e.g., shallower or more uniform trees) rather than genuine human vs. LLM structural differences.
  2. [Representation as text-attributed graphs] Graph construction details (Section on representation as text-attributed graphs): The conversion of assurance cases into nodes, edges, and text attributes is not specified in sufficient detail (e.g., how implicit links are encoded, what text is attached to nodes/edges, embedding method, or handling of hierarchical vs. cross-reference edges). Without this, it is impossible to assess whether the reported link-prediction ROC-AUC of 0.76 and classification performance truly reflect preserved semantic and provenance signals.
  3. [Experiments] Experimental evaluation (Experiments section): The manuscript reports concrete performance numbers but omits baselines, statistical significance tests, error analysis, dataset size/split details, and ablation studies. These omissions make it difficult to verify the claims of strong performance and cross-domain generalization for both tasks.
minor comments (2)
  1. [Abstract] The abstract states that GNNs 'generalise well across domains' but provides no quantitative breakdown by domain or explicit list of domains represented in the dataset.
  2. Notation for graph components (nodes, edges, attributes) and the precise definition of 'hierarchical linking patterns' could be introduced earlier and used consistently to improve readability for readers outside the GNN community.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive feedback on our manuscript. We believe the suggested clarifications will improve the paper's reproducibility and clarity. We address each major comment below.

Point-by-point responses
  1. Referee: [Dataset compilation] Dataset construction (Section on dataset compilation): The process for creating the LLM-generated assurance cases is described at too high a level. Specifics on the LLM model, prompt templates, temperature, sampling strategy, and any post-processing or filtering steps are absent. This is load-bearing for the provenance classification result (F1 0.94), because the GNN could be learning synthesis artifacts (e.g., shallower or more uniform trees) rather than genuine human vs. LLM structural differences.

    Authors: We concur that greater specificity is required here to substantiate the provenance classification results and mitigate concerns over potential artifacts. Accordingly, the revised manuscript will provide comprehensive details on the LLM generation process, including the model (GPT-4), complete prompt templates (to be included in an appendix), temperature (set to 0.7), sampling method (top-p sampling with p=0.9), and post-processing (automatic filtering for valid JSON structure followed by manual review for argument coherence). We will also include comparative statistics on graph properties such as average depth and branching factor to demonstrate that observed differences reflect substantive variations rather than superficial synthesis traits. revision: yes

  2. Referee: [Representation as text-attributed graphs] Graph construction details (Section on representation as text-attributed graphs): The conversion of assurance cases into nodes, edges, and text attributes is not specified in sufficient detail (e.g., how implicit links are encoded, what text is attached to nodes/edges, embedding method, or handling of hierarchical vs. cross-reference edges). Without this, it is impossible to assess whether the reported link-prediction ROC-AUC of 0.76 and classification performance truly reflect preserved semantic and provenance signals.

    Authors: We appreciate this observation and will enhance the representation section with precise specifications. Nodes correspond to individual argument elements (e.g., claims, strategies, evidence), each attributed with its original textual content. Edges encode 'supportedBy' relations for the hierarchical argument structure and 'inContextOf' for cross-references. Implicit links are derived directly from the assurance case's documented relationships. Text attributes are embedded using the all-MiniLM-L6-v2 sentence transformer model. The revised text will include a step-by-step description and pseudocode for the graph construction pipeline, distinguishing hierarchical from cross-reference edges. revision: yes

  3. Referee: [Experiments] Experimental evaluation (Experiments section): The manuscript reports concrete performance numbers but omits baselines, statistical significance tests, error analysis, dataset size/split details, and ablation studies. These omissions make it difficult to verify the claims of strong performance and cross-domain generalization for both tasks.

    Authors: We agree that these methodological details are important for validating our empirical claims. In the updated Experiments section, we will incorporate: baseline methods such as random guessing, feature-based classifiers without graph structure, and traditional ML models; statistical significance testing using bootstrap resampling or t-tests with reported p-values; qualitative error analysis highlighting representative failure cases for both tasks; explicit dataset statistics including the number of graphs (120 human-authored and 120 LLM-generated for classification, with domain-specific counts), and the train/validation/test splits (70%/15%/15%); and ablation experiments varying the use of text embeddings and edge types. New tables and figures will present these results to support the reported performance and generalization. revision: yes
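The 70%/15%/15% split mentioned in the third response can be sketched as a stratified split, so that each partition keeps the same balance of human-authored and LLM-generated graphs. The graph counts below are the ones quoted in the simulated rebuttal, not verified dataset statistics, and the function name, seed, and identifiers are hypothetical.

```python
import random


def stratified_split(items_by_label, train_pct=70, val_pct=15, seed=0):
    """Split each class separately so every partition keeps the class balance.

    Percentages are integers to avoid floating-point rounding; whatever
    remains after train and validation goes to the test partition.
    """
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, items in items_by_label.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_train = len(shuffled) * train_pct // 100
        n_val = len(shuffled) * val_pct // 100
        splits["train"] += [(x, label) for x in shuffled[:n_train]]
        splits["val"] += [(x, label) for x in shuffled[n_train:n_train + n_val]]
        splits["test"] += [(x, label) for x in shuffled[n_train + n_val:]]
    return splits


# Placeholder graph identifiers standing in for the actual graphs.
graphs = {
    "human": [f"human_{i}" for i in range(120)],
    "llm": [f"llm_{i}" for i in range(120)],
}
splits = stratified_split(graphs)
print({name: len(part) for name, part in splits.items()})
```

With 120 graphs per class this yields 84/18/18 per class, so the held-out test set contains only 36 graphs, which is one reason the requested significance tests matter.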

Circularity Check

0 steps flagged

No circularity: empirical ML evaluation on compiled dataset

Full rationale

The paper describes compiling a dataset of assurance cases represented as text-attributed graphs, then applying GNNs to perform link prediction (ROC-AUC 0.760) and binary classification between human and LLM-generated cases (F1 0.94). All reported results are standard experimental metrics obtained by training models on external data splits and evaluating on held-out examples. No equations, self-definitional relations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided abstract or description. The derivation chain consists of data construction followed by independent model training and evaluation, with no reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on standard assumptions of graph representation learning and the premise that assurance-case text can be faithfully converted to attributed graphs without loss of argument semantics. No new entities are postulated.

axioms (2)
  • domain assumption: Assurance cases can be losslessly represented as text-attributed graphs where nodes are argument elements and edges capture logical connections.
    Invoked in the proposal of the graph diagnostic framework and dataset construction.
  • domain assumption: GNNs trained on these graphs can generalize across domains and in semi-supervised settings.
    Stated as an experimental outcome but treated as a modeling assumption for the framework.


