
arxiv: 2603.02938 · v2 · pith:GBDRN2KFnew · submitted 2026-03-03 · 💻 cs.LG · cs.AI

Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models

Pith reviewed 2026-05-22 10:26 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords: zero-shot graph learning, large language models, subgraph denoising, adaptive subgraph extraction, reinforcement learning, supervised fine-tuning, graph reasoning

The pith

An adaptive Sample-Select-Reason pipeline lets LLMs extract a denoised subgraph for each query, aiming to improve zero-shot graph reasoning over one-size-fits-all subgraph extraction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets the structural noise that arises when LLMs receive fixed, task-agnostic subgraphs for zero-shot graph tasks. It replaces that fixed extraction with a dynamic Sample-Select-Reason process that lets the model choose only relevant neighbors and edges for each query. High-quality SSR reasoning traces are synthesized for supervised fine-tuning, then a two-stage reinforcement learning procedure reinforces both prediction accuracy and subgraph parsimony. The resulting models therefore operate on smaller, cleaner subgraphs while maintaining or improving predictive performance across unseen domains and label spaces.

Core claim

GraphSSR overcomes the one-size-fits-all subgraph extraction used in prior LLM-based graph reasoning by introducing an SSR pipeline that samples candidate neighbors, selects task-relevant ones, and reasons over the resulting parsimonious subgraph. SSR-SFT creates synthetic traces that teach this behavior, and SSR-RL applies authenticity-reinforced and denoising-reinforced objectives so the model learns to filter irrelevant structure while preserving accuracy.

What carries the argument

The SSR pipeline, a Sample-Select-Reason loop that dynamically tailors subgraph extraction to each context and is internalized via supervised fine-tuning followed by two-stage reinforcement learning.
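
To make the three steps concrete, here is a minimal sketch of a Sample-Select-Reason loop. It is an illustration under stated assumptions, not the authors' implementation: the function names, the adjacency-list graph, the `budget` parameter, and the `is_relevant` callback are all hypothetical. In the paper the LLM itself makes the selection; a plain Python predicate stands in for that judgment here, and the Reason step only builds the prompt text rather than calling a model.

```python
from typing import Callable, Dict, List, Tuple

Graph = Dict[str, List[str]]  # adjacency list: node id -> neighbor ids


def sample(graph: Graph, center: str, budget: int) -> List[str]:
    """Sample: collect up to `budget` candidate neighbors of the center node."""
    return graph.get(center, [])[:budget]


def select(candidates: List[str], is_relevant: Callable[[str], bool]) -> List[str]:
    """Select: keep only candidates judged task-relevant.

    `is_relevant` is a hypothetical stand-in for the LLM's own selection.
    """
    return [node for node in candidates if is_relevant(node)]


def reason(center: str, kept: List[str], texts: Dict[str, str]) -> str:
    """Reason: build the text prompt a model would answer (no model call here)."""
    lines = [f"Target node: {texts[center]}"]
    lines += [f"Neighbor: {texts[node]}" for node in kept]
    return "\n".join(lines)


def ssr(
    graph: Graph,
    center: str,
    texts: Dict[str, str],
    is_relevant: Callable[[str], bool],
    budget: int = 4,
) -> Tuple[List[str], str]:
    """Run Sample, Select, Reason in sequence for one query node."""
    candidates = sample(graph, center, budget)
    kept = select(candidates, is_relevant)
    return kept, reason(center, kept, texts)


# Toy example: one off-topic neighbor ("p2") is filtered out before reasoning.
graph = {"p0": ["p1", "p2", "p3"]}
texts = {
    "p0": "graph neural networks",
    "p1": "message passing on graphs",
    "p2": "protein folding",
    "p3": "graph attention",
}
kept, prompt = ssr(graph, "p0", texts, lambda node: "graph" in texts[node])
print(kept)
print(prompt)
```

The point of the sketch is the data flow: the prompt handed to the reasoning step contains only the neighbors that survived selection, so the subgraph the model sees varies per query instead of being fixed in advance.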

If this is right

  • LLMs can autonomously filter task-irrelevant neighbors during zero-shot graph inference.
  • Reasoning occurs over smaller, denoised subgraphs rather than full neighborhoods.
  • Zero-shot generalization improves across domains and label spaces without retraining GNN components.
  • Purely text-based graph reasoning avoids cross-modal alignment problems between GNNs and LLMs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same SSR loop could be applied to other structured inputs such as knowledge graphs or molecular graphs without architectural changes.
  • Reduced subgraph size may lower token usage and inference cost for large-scale graph queries.
  • Combining SSR with existing graph sampling heuristics might further improve the quality of the initial candidate set.

Load-bearing premise

The synthesized SSR-style reasoning traces must be high-quality enough for fine-tuning, and the two-stage RL must train the model to balance accuracy against denoising without introducing new biases or mode collapse.
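
The balance described above can be illustrated with a small reward sketch. The Figure 4 excerpt on this page mentions a weight λ applied to a term r_s, but the page does not give the reward formula, so everything below is an assumption: the additive form, the definition of the parsimony term as the pruned fraction of sampled neighbors, and the function names are hypothetical.

```python
def parsimony_reward(num_kept: int, num_sampled: int) -> float:
    """Hypothetical r_s: fraction of sampled neighbors that were pruned."""
    if num_sampled == 0:
        return 0.0
    return 1.0 - num_kept / num_sampled


def combined_reward(
    correct: bool, num_kept: int, num_sampled: int, lam: float = 0.1
) -> float:
    """Assumed additive form: accuracy term plus a lam-weighted parsimony term."""
    r_acc = 1.0 if correct else 0.0
    return r_acc + lam * parsimony_reward(num_kept, num_sampled)


# A correct answer that kept 2 of 8 sampled neighbors.
print(combined_reward(True, 2, 8))
# A wrong answer that pruned everything still collects the parsimony term.
print(combined_reward(False, 0, 8))
```

The second call shows why the premise is load-bearing: under this assumed form, a model can earn reward by pruning aggressively even when it answers incorrectly. Gating the parsimony term on correctness is one possible mitigation; whether the paper does so is not stated here.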

What would settle it

Run the trained model on held-out graphs and measure the fraction of irrelevant neighbors retained in the final subgraphs; if this fraction remains high while accuracy stays unchanged or drops, the adaptive-denoising claim is falsified.
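
A minimal sketch of that measurement follows. It assumes ground-truth irrelevance labels exist for the sampled candidates, which this page does not specify how to obtain; the function name and arguments are hypothetical. The metric is read here as "of the irrelevant candidates, what share survives selection".

```python
from typing import Iterable, Set


def retained_noise_fraction(
    candidates: Iterable[str], kept: Iterable[str], irrelevant: Set[str]
) -> float:
    """Share of irrelevant candidates that remain in the final subgraph.

    Returns 0.0 when no candidate is labeled irrelevant.
    """
    noisy = set(candidates) & set(irrelevant)
    if not noisy:
        return 0.0
    return len(noisy & set(kept)) / len(noisy)


# Four candidates, two labeled irrelevant; selection dropped only one of them.
fraction = retained_noise_fraction(
    candidates=["a", "b", "c", "d"],
    kept=["a", "b", "c"],
    irrelevant={"c", "d"},
)
print(fraction)
```

Averaging this value over held-out queries and reporting it next to accuracy would give the two numbers the test above calls for.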

Figures

Figures reproduced from arXiv: 2603.02938 by Fanyu Meng, Fengzhi Li, Junlan Feng, Liang Zhang, Ruiqing Zhao, Yansong Liu, Yuan Zuo, Yunfei Ma.

Figure 1: A case of structural noise in the subgraph mislead…
Figure 2: Overview of our proposed GraphSSR framework.
Figure 3: Ablation study of GraphSSR on different datasets.
Figure 4: Parameter sensitivity of λ.
  Accompanying text (excerpt): "… 0.01), the performance is suboptimal. This is primarily because a small r_s weight fails to impose sufficient pressure on the model to prune noisy structures, leading the model to suffer from the interference of noisy nodes and edges. As λ increases, we observe a steady improvement in accuracy, reaching a peak at λ = 0.1. This upward trend validates that introducing a properly we…"
Original abstract

Graph-based tasks in the zero-shot setting remain a significant challenge due to data scarcity and the inability of traditional Graph Neural Networks (GNNs) to generalize to unseen domains or label spaces. While recent advancements have transitioned toward leveraging Large Language Models (LLMs) as predictors to enhance GNNs, these methods often suffer from cross-modal alignment issues. A recent paradigm (i.e., Graph-R1) overcomes the aforementioned architectural dependencies by adopting a purely text-based format and utilizing LLM-based graph reasoning, showing improved zero-shot generalization. However, it employs a task-agnostic, one-size-fits-all subgraph extraction strategy, which inevitably introduces significant structural noise--irrelevant neighbors and edges--that distorts the LLMs' receptive field and leads to suboptimal predictions. To address this limitation, we introduce GraphSSR, a novel framework designed for adaptive subgraph extraction and denoising in zero-shot LLM-based graph reasoning. Specifically, we propose the SSR pipeline, which dynamically tailors subgraph extraction to specific contexts through a "Sample-Select-Reason" process, enabling the model to autonomously filter out task-irrelevant neighbors and overcome the one-size-fits-all issue. To internalize this capability, we develop SSR-SFT, a data synthesis strategy that generates high-quality SSR-style graph reasoning traces for supervised fine-tuning of LLMs. Furthermore, we propose SSR-RL, a two-stage reinforcement learning framework that explicitly regulates sampling and selection operations within the proposed SSR pipeline designed for adaptive subgraph denoising. By incorporating Authenticity-Reinforced and Denoising-Reinforced RL, we guide the model to achieve accurate predictions using parsimonious, denoised subgraphs for reasoning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes GraphSSR, a framework extending Graph-R1 for zero-shot LLM-based graph reasoning. It introduces the SSR (Sample-Select-Reason) pipeline for adaptive, context-specific subgraph extraction to filter task-irrelevant neighbors, SSR-SFT for synthesizing high-quality SSR-style reasoning traces to enable supervised fine-tuning, and SSR-RL, a two-stage reinforcement learning approach with Authenticity-Reinforced and Denoising-Reinforced objectives to train models toward accurate predictions on parsimonious, denoised subgraphs.

Significance. If the central claims hold, the work would meaningfully advance zero-shot graph learning by replacing one-size-fits-all subgraph extraction with an internalized adaptive denoising capability. The two-stage RL formulation for jointly regulating sampling/selection and accuracy is a technically interesting direction that could reduce structural noise and improve generalization across unseen domains and label spaces.

major comments (2)
  1. [Abstract; SSR-RL framework description] The abstract and method description claim that SSR-SFT plus SSR-RL enable LLMs to autonomously filter irrelevant neighbors and achieve accurate predictions with denoised subgraphs, yet no experimental results, ablation studies, error bars, or quantitative comparisons against Graph-R1 or other baselines are supplied. This absence makes it impossible to verify whether the data or derivations support the performance claims.
  2. [SSR-SFT and SSR-RL sections] The framework rests on the assumption that synthesized SSR-style traces are high-quality and that the two-stage (Authenticity-Reinforced + Denoising-Reinforced) RL balances accuracy against subgraph size without mode collapse, reward hacking, or new biases. No analysis, reward-signal orthogonality checks, or post-RL diversity measurements are provided to substantiate this load-bearing assumption.
minor comments (2)
  1. The description of the SSR pipeline would benefit from pseudocode or a formal algorithmic outline to clarify the Sample, Select, and Reason steps and their integration with the LLM.
  2. Notation for the RL reward components (authenticity vs. denoising) could be made more explicit to avoid ambiguity in how the two objectives are combined.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive review. The comments highlight important gaps in empirical validation and analysis that we will address in the revision. Below we respond point by point to the major comments.

Point-by-point responses
  1. Referee: [Abstract; SSR-RL framework description] The abstract and method description claim that SSR-SFT plus SSR-RL enable LLMs to autonomously filter irrelevant neighbors and achieve accurate predictions with denoised subgraphs, yet no experimental results, ablation studies, error bars, or quantitative comparisons against Graph-R1 or other baselines are supplied. This absence makes it impossible to verify whether the data or derivations support the performance claims.

    Authors: We agree that the absence of empirical results prevents verification of the performance claims. The submitted manuscript emphasizes the methodological design of the SSR pipeline, SSR-SFT synthesis, and the two-stage SSR-RL objectives. In the revised version we will add a dedicated experimental section containing quantitative comparisons against Graph-R1 and other baselines, ablation studies isolating each component, and results reported with standard error bars over multiple random seeds. revision: yes

  2. Referee: [SSR-SFT and SSR-RL sections] The framework rests on the assumption that synthesized SSR-style traces are high-quality and that the two-stage (Authenticity-Reinforced + Denoising-Reinforced) RL balances accuracy against subgraph size without mode collapse, reward hacking, or new biases. No analysis, reward-signal orthogonality checks, or post-RL diversity measurements are provided to substantiate this load-bearing assumption.

    Authors: The referee is correct that the current text provides no supporting analysis for the quality of the synthesized traces or the training dynamics of the two-stage RL procedure. We will revise the manuscript to include (i) quantitative and qualitative assessments of trace quality, (ii) measurements of subgraph-size diversity and prediction accuracy before and after each RL stage, and (iii) explicit checks for mode collapse, reward hacking, and orthogonality between the authenticity and denoising reward signals. revision: yes
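
One simple way to probe item (iii) is to correlate the two reward signals across rollouts. The sketch below is an assumption about what such a check could look like, not a procedure taken from the paper: it treats "orthogonality" as low Pearson correlation between per-rollout authenticity and denoising rewards, and the sample values are made up for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length reward sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Illustrative (made-up) per-rollout rewards.
authenticity = [0.0, 1.0, 0.0, 1.0]
denoising_conflicting = [1.0, 0.0, 1.0, 0.0]  # moves opposite to authenticity
denoising_unrelated = [0.0, 0.0, 1.0, 1.0]    # no linear relationship

print(pearson(authenticity, denoising_conflicting))
print(pearson(authenticity, denoising_unrelated))
```

A value near -1 would suggest the two objectives pull against each other, while a value near 0 suggests they are roughly decorrelated. A constant reward sequence raises an error here, which is itself a useful signal when checking for mode collapse.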

Circularity Check

0 steps flagged

No significant circularity detected; derivation is self-contained as an independent extension.

Full rationale

The paper introduces GraphSSR as a new framework with the SSR pipeline (Sample-Select-Reason), SSR-SFT for synthesizing reasoning traces, and SSR-RL with Authenticity-Reinforced and Denoising-Reinforced stages. No equations, fitted parameters, or self-definitional reductions appear in the provided text. The method is framed as addressing limitations in the prior Graph-R1 paradigm through novel adaptive extraction and training procedures rather than re-deriving or renaming existing results by construction. Central claims rest on the proposed processes for filtering neighbors and balancing accuracy with parsimony, which are presented as independent contributions without load-bearing self-citations or ansatzes that collapse back to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only view; no explicit free parameters, new entities, or ad-hoc axioms are stated. The approach rests on standard domain assumptions about LLM fine-tuning and RL for structured reasoning.

axioms (2)
  • domain assumption LLMs can be effectively fine-tuned and reinforced to perform graph reasoning from text representations of subgraphs
    Implicit in the shift to purely text-based LLM graph reasoning and the SSR-SFT/RL training strategy.
  • domain assumption High-quality SSR-style reasoning traces can be synthesized to bootstrap the adaptive denoising capability
    Central premise of the SSR-SFT component.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Unified Graph Language Model for Multi-Domain Multi-Task Graph Alignment Instruction Tuning

    cs.LG 2026-05 unverdicted novelty 6.0

    UniGraphLM uses a multi-domain multi-task GNN encoder and adaptive alignment to create unified graph tokens for LLMs across diverse domains and tasks.
