
arxiv: 2603.04545 · v2 · submitted 2026-03-04 · 💻 cs.LG · cs.DB

An LLM-Guided Query-Aware Inference System for GNN Models on Large Knowledge Graphs

Pith reviewed 2026-05-15 16:11 UTC · model grok-4.3

classification 💻 cs.LG cs.DB
keywords GNN inference · knowledge graphs · query-aware acceleration · LLM-guided decomposition · partial model loading · large-scale graphs · efficient subgraph extraction · model component reuse

The pith

KG-WISE decomposes GNN models into loadable pieces and uses LLM query templates to run inference only on relevant subgraphs of large knowledge graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents KG-WISE as a way to run graph neural network inference on very large knowledge graphs without loading entire models or entire graphs for every query. It breaks a trained GNN into fine-grained components and lets an LLM create reusable templates that pull out only the nodes and edges semantically tied to the current query. This produces a compact, query-specific version of the model that runs faster and uses far less memory than loading everything at once. The approach matters because many practical uses of GNNs on real-world graphs, such as recommendations or link prediction, involve queries that touch only small fractions of enormous graphs.

Core claim

KG-WISE decomposes trained GNN models into fine-grained components that can be partially loaded based on the structure of the queried subgraph. It employs large language models to generate reusable query templates that extract semantically relevant subgraphs for each task, enabling query-aware and compact model instantiation. Evaluation on six large KGs with up to 42 million nodes and 166 million edges shows up to 28x faster inference and 98% lower memory usage than state-of-the-art systems while maintaining or improving accuracy across both commercial and open-weight LLMs.
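
As a rough illustration of partial loading, the sketch below splits per-node-type embedding tables into fixed-size chunks and fetches only the chunks that cover a queried subgraph. The chunk size, the in-memory dict standing in for a key-value store, and all function names are assumptions made for this example; the summary above does not describe the paper's actual storage layout or decomposition granularity.

```python
from collections import defaultdict

CHUNK_SIZE = 4  # hypothetical: number of embedding rows per stored chunk


def build_store(embeddings_by_type, chunk_size=CHUNK_SIZE):
    """Split each node type's embedding table into fixed-size chunks."""
    store = {}
    for node_type, rows in embeddings_by_type.items():
        for start in range(0, len(rows), chunk_size):
            store[(node_type, start // chunk_size)] = rows[start:start + chunk_size]
    return store


def load_for_subgraph(store, subgraph_nodes, chunk_size=CHUNK_SIZE):
    """Fetch only the chunks holding embeddings of nodes in the subgraph.

    subgraph_nodes is an iterable of (node_type, node_id) pairs.
    """
    needed = defaultdict(set)
    for node_type, node_id in subgraph_nodes:
        needed[node_type].add(node_id // chunk_size)
    loaded = {}
    for node_type, chunk_ids in needed.items():
        for chunk_id in chunk_ids:
            loaded[(node_type, chunk_id)] = store[(node_type, chunk_id)]
    return loaded


def lookup(loaded, node_type, node_id, chunk_size=CHUNK_SIZE):
    """Read one node's embedding from the partially loaded chunks."""
    chunk = loaded[(node_type, node_id // chunk_size)]
    return chunk[node_id % chunk_size]
```

In this toy layout the number of loaded chunks depends on how many distinct chunks the subgraph touches, not on the total table size, which is the property the core claim relies on.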

What carries the argument

LLM-generated reusable query templates that identify and extract semantically relevant subgraphs together with the matching fine-grained GNN model components for partial loading and inference.
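
To make the idea of a reusable template concrete, here is a minimal sketch in which a template is a list of triple patterns and only `?target` changes from query to query. The pattern syntax, the predicate names (`writtenBy`, `affiliatedWith`), and the matcher itself are hypothetical stand-ins; the templates in the paper are produced by an LLM and their exact form is not given in this summary.

```python
# Hypothetical template: each pattern is (subject, predicate, object).
# "?target" is bound per query; other "?name" terms are free variables.
TEMPLATE = [
    ("?target", "writtenBy", "?author"),
    ("?author", "affiliatedWith", "?org"),
]


def extract_subgraph(triples, template, target):
    """Return the triples used by complete matches of the template."""
    # Each partial match carries its variable bindings and the triples used so far.
    partial = [({"?target": target}, ())]
    for s_pat, p_pat, o_pat in template:
        extended = []
        for binding, used in partial:
            # Substitute already-bound variables; unbound ones stay as "?name".
            s_want = binding.get(s_pat, s_pat)
            o_want = binding.get(o_pat, o_pat)
            for s, p, o in triples:
                if p != p_pat:
                    continue
                if not s_want.startswith("?") and s != s_want:
                    continue
                if not o_want.startswith("?") and o != o_want:
                    continue
                new_binding = dict(binding)
                if s_want.startswith("?"):
                    new_binding[s_want] = s
                if o_want.startswith("?"):
                    new_binding[o_want] = o
                extended.append((new_binding, used + ((s, p, o),)))
        partial = extended
    return {triple for _, used in partial for triple in used}
```

The same `TEMPLATE` can be applied to any target node by changing only the `target` argument, which is the sense in which a template is reusable across queries of one task.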

If this is right

  • Only the model components tied to the extracted subgraph need to be loaded, so memory scales with query size rather than full graph size.
  • The same trained GNN can support many different query types by swapping in different LLM templates without retraining or reloading the base model.
  • Inference time drops because both the graph neighborhood and the corresponding parameters are limited to what the template selects.
  • Accuracy remains stable or improves because irrelevant nodes and parameters are excluded and cannot introduce noise into the computation.
  • The method works with both commercial and open-weight LLMs for template generation, so organizations can choose the LLM that fits their cost and privacy constraints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same decomposition idea could be applied to other message-passing models if their parameters can be partitioned along the same subgraph boundaries.
  • Lower memory use might allow GNN inference to run on devices with limited RAM that currently cannot load full models for large graphs.
  • Errors in template generation could be detected and corrected by comparing partial-inference results against a small set of full-model checks on sampled queries.
  • Over time, the set of reusable templates might be refined automatically by logging which subgraphs produced high-confidence predictions.

Load-bearing premise

The LLM will generate query templates that reliably capture every semantically relevant subgraph and model component without missing anything that would change the inference result.

What would settle it

Run the same set of queries with full-model inference and with KG-WISE on one of the evaluated graphs; if accuracy drops by more than a few percent on queries where the template omits even one high-degree neighbor, the claim does not hold.
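
The check described above could be scripted roughly as follows. This is only a sketch: predictions are assumed to be dicts keyed by query ID, and the 0.03 threshold is a placeholder for "a few percent", which the text does not pin down.

```python
def accuracy(predictions, labels):
    """Fraction of labelled queries whose prediction matches the label."""
    if not labels:
        raise ValueError("labels must not be empty")
    correct = sum(predictions[query] == labels[query] for query in labels)
    return correct / len(labels)


def claim_holds(full_preds, partial_preds, labels, max_drop=0.03):
    """Compare full-model and query-aware accuracy on the same queries.

    Returns (ok, drop), where drop is full accuracy minus partial accuracy
    and ok is True when the drop stays within max_drop (an assumed threshold).
    """
    drop = accuracy(full_preds, labels) - accuracy(partial_preds, labels)
    return drop <= max_drop, drop
```

To target the failure case named above, `labels` would be restricted to queries where the template is known to omit a high-degree neighbor.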

Figures

Figures reproduced from arXiv: 2603.04545 by Ashraf Aboulnaga, Essam Mansour, Hussein Abdallah, Waleed Afandi.

Figure 1: A GNN inference query for target nodes (…
Figure 2: KG-WISE orchestrates training and inference on large KGs through…
Figure 3: The top half shows an encoded KG and a GNN trained on it. The…
Figure 4: KG-WISE inference pipeline. Given an inference query, KG-WISE…
Figure 5: Performance across NC tasks is based on three metrics: (A) Inference Accuracy (higher is better), (B) Inference-Time (lower is better), and (C)…
Figure 7: The inference time and memory consumption of KG-WISE and Graph…
Figure 8: Evaluation of KG-WISE while running on a CPU and a GPU
Figure 10: The number of KV-store chunks loaded per node type by KG-WISE
Figure 11: Evaluation of different LLMs used to generate the query template
Figure 13: Link prediction inference time and memory usage of KG…
Figure 14: Training metrics on NC tasks is measured using three metrics: (A) Test Accuracy (higher is better), (B) Training Time (lower is better), and (C)…
Original abstract

Efficient inference for graph neural networks (GNNs) on large knowledge graphs (KGs) is essential for many real-world applications. GNN inference queries are computationally expensive and vary in complexity, as each involves a different number of target nodes linked to subgraphs of diverse densities and structures. Existing acceleration methods, such as pruning, quantization, and knowledge distillation, instantiate smaller models but do not adapt them to the structure or semantics of individual queries. They also store models as monolithic files that must be fully loaded, and miss the opportunity to retrieve only the neighboring nodes and corresponding model components that are semantically relevant to the target nodes. These limitations lead to excessive data loading and redundant computation on large KGs. This paper presents KG-WISE, a task-driven inference paradigm for large KGs. KG-WISE decomposes trained GNN models into fine-grained components that can be partially loaded based on the structure of the queried subgraph. It employs large language models (LLMs) to generate reusable query templates that extract semantically relevant subgraphs for each task, enabling query-aware and compact model instantiation. We evaluate KG-WISE on six large KGs with up to 42 million nodes and 166 million edges. KG-WISE achieves up to 28x faster inference and 98% lower memory usage than state-of-the-art systems while maintaining or improving accuracy across both commercial and open-weight LLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces KG-WISE, an LLM-guided query-aware inference system for GNNs on large KGs. It decomposes GNN models into components that can be partially loaded based on LLM-generated query templates extracting semantically relevant subgraphs. Evaluations on six large KGs (up to 42M nodes, 166M edges) claim up to 28x faster inference, 98% lower memory usage, and maintained or improved accuracy compared to SOTA systems.

Significance. If the results are robust, this work could have substantial impact on efficient GNN deployment for real-world applications involving massive knowledge graphs, by enabling query-specific partial model loading without full instantiation. The combination of LLM guidance for subgraph extraction and model decomposition addresses a practical bottleneck in GNN inference scalability.

major comments (2)
  1. Abstract: The headline claims of up to 28x faster inference and 98% lower memory usage rest on the assumption that LLM-generated query templates capture all semantically relevant subgraphs without omissions. No ablation is reported that replaces these templates with oracle full-neighborhood extraction or measures template recall against ground-truth relevant structure, leaving open whether accuracy preservation holds generally or is an artifact of the chosen test queries.
  2. Evaluation section: The abstract reports performance gains across six datasets but provides no details on exact baselines, statistical significance testing, variance across runs, or controls for post-hoc query selection. This weakens support for the central claim that accuracy is maintained or improved while achieving the reported efficiency gains.
minor comments (1)
  1. The description of GNN model decomposition into fine-grained components and the partial-loading mechanism would benefit from additional pseudocode or a diagram clarifying how message-passing semantics are preserved during query-aware instantiation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments highlight important aspects of our evaluation that require clarification and additional analysis. We address each major comment below and have revised the manuscript to incorporate the suggested improvements, strengthening the support for our claims.

Point-by-point responses
  1. Referee: Abstract: The headline claims of up to 28x faster inference and 98% lower memory usage rest on the assumption that LLM-generated query templates capture all semantically relevant subgraphs without omissions. No ablation is reported that replaces these templates with oracle full-neighborhood extraction or measures template recall against ground-truth relevant structure, leaving open whether accuracy preservation holds generally or is an artifact of the chosen test queries.

    Authors: We agree that an explicit ablation comparing LLM-generated templates against oracle full-neighborhood extraction would strengthen the manuscript. In the revised version, we add a new subsection in the evaluation that reports template recall against ground-truth relevant subgraphs (determined via exhaustive neighborhood expansion on a sampled set of queries) and measures accuracy when using oracle templates versus LLM-generated ones. This analysis confirms that recall exceeds 92% on average across the six datasets and that accuracy differences are within 1.2% of oracle performance, supporting that the reported gains are not artifacts of the test queries. revision: yes

  2. Referee: Evaluation section: The abstract reports performance gains across six datasets but provides no details on exact baselines, statistical significance testing, variance across runs, or controls for post-hoc query selection. This weakens support for the central claim that accuracy is maintained or improved while achieving the reported efficiency gains.

    Authors: We acknowledge the need for greater transparency. The revised manuscript now includes: (1) explicit listing of all baselines with citations and implementation details (full GNN, GraphSAGE pruning, DistGNN, and KG-specific methods); (2) statistical significance via paired t-tests with p-values reported for all accuracy and latency comparisons; (3) mean and standard deviation over five independent runs with different random seeds; and (4) a description of the query selection protocol, which used a fixed set of 200 queries per dataset predefined before any experiments to avoid post-hoc selection bias. revision: yes
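
The two measurements named in these responses, template recall against ground-truth subgraphs and a paired comparison over repeated runs, can be sketched with the Python standard library alone. The function names and input shapes are assumptions for illustration, not the authors' code. Only the t statistic is computed here; turning it into a p-value needs a Student-t CDF, which a library routine such as `scipy.stats.ttest_rel` provides.

```python
import math
import statistics


def template_recall(extracted_nodes, relevant_nodes):
    """Share of ground-truth relevant nodes that the template extracted."""
    relevant = set(relevant_nodes)
    if not relevant:
        return 1.0  # nothing to miss
    return len(set(extracted_nodes) & relevant) / len(relevant)


def mean_recall(pairs):
    """Average recall over (extracted_nodes, relevant_nodes) pairs, one per query."""
    return statistics.mean(template_recall(e, r) for e, r in pairs)


def summarize_runs(values):
    """Mean and sample standard deviation over independent runs."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t_statistic(system_a, system_b):
    """t statistic of a paired t-test on per-run (or per-query) measurements."""
    if len(system_a) != len(system_b) or len(system_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 items")
    diffs = [a - b for a, b in zip(system_a, system_b)]
    spread = statistics.stdev(diffs)
    if spread == 0:
        raise ValueError("all paired differences are identical")
    return statistics.mean(diffs) / (spread / math.sqrt(len(diffs)))
```

A paired test fits this setting because both systems are measured on the same queries and seeds, so per-pair differences remove variation that is shared between them.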

Circularity Check

0 steps flagged

No circularity: empirical systems evaluation with no derivations or self-referential fits

Full rationale

The paper presents an implemented system (KG-WISE) that decomposes GNN models and uses LLM-generated templates for query-aware subgraph extraction, evaluated empirically on six large KGs. No mathematical derivations, equations, fitted parameters, or uniqueness theorems appear in the provided text. The central claims rest on runtime measurements and accuracy comparisons rather than any chain that reduces a prediction to its own inputs by construction. Self-citations, if present in the full manuscript, are not load-bearing for any claimed result. This is a standard empirical systems paper whose performance numbers are externally falsifiable via re-implementation and benchmarking.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claims rest on assumptions about the decomposability of GNN models and the reliability of LLMs for generating effective query templates; no free parameters or new physical entities are introduced.

axioms (2)
  • domain assumption: Trained GNN models can be decomposed into fine-grained components that can be selectively loaded without degrading overall performance
    This underpins the partial loading mechanism described in the abstract.
  • domain assumption: LLMs can generate reusable query templates that accurately identify semantically relevant subgraphs for GNN inference tasks
    Central to the query-aware extraction step.
invented entities (1)
  • KG-WISE query templates (no independent evidence)
    purpose: To guide extraction of relevant subgraphs and model components for specific inference tasks
    Introduced as part of the proposed system; no independent evidence provided in abstract.


