pith. machine review for the scientific record.

arxiv: 2605.02452 · v1 · submitted 2026-05-04 · 💻 cs.AI

Recognition: 1 theorem link

Position: How can Graphs Help Large Language Models?

Pith reviewed 2026-05-08 18:42 UTC · model grok-4.3

classification 💻 cs.AI
keywords graphs · large language models · hallucinations · prompting · knowledge graphs · structured data · reasoning · graph neural networks

The pith

Graphs can reduce hallucinations in large language models by serving as current knowledge sources and strengthen reasoning through structured prompting methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper poses the reverse of the common question about LLMs aiding graph tasks and instead examines how graphs can support LLMs. It identifies three concrete mechanisms: graphs as dynamic knowledge bases that supply fresh facts to curb factual errors, graph-organized prompting that guides step-by-step thinking, and direct incorporation of graph structures that lets models handle relational information more naturally. These approaches matter because LLMs frequently produce plausible but incorrect outputs and struggle with data that has explicit connections rather than linear text. The authors also sketch future model designs that might use graphs for sparsity and memory organization.
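The first mechanism, a graph acting as an updatable knowledge base, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' method: the triples, the entity names, and the helpers `build_index`, `retrieve_facts`, `update_fact`, and `build_prompt` are all hypothetical, and the call to an actual LLM is left out.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, relation, object) triples.
# Every fact below is a made-up placeholder, not data from the paper.
TRIPLES = [
    ("AcmeCorp", "ceo", "Dana Lee"),
    ("AcmeCorp", "headquartered_in", "Springfield"),
    ("Dana Lee", "born_in", "Shelbyville"),
    ("Springfield", "located_in", "Freedonia"),
]


def build_index(triples):
    """Index triples by subject so outgoing edges can be looked up directly."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
    return index


def retrieve_facts(index, entity, hops=1):
    """Collect triples reachable from `entity` within `hops` steps (breadth-first)."""
    facts, frontier, seen = [], [entity], {entity}
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for relation, obj in index.get(node, []):
                facts.append((node, relation, obj))
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts


def update_fact(triples, subject, relation, new_object):
    """Return a new triple list in which (subject, relation) points to new_object."""
    kept = [t for t in triples if not (t[0] == subject and t[1] == relation)]
    return kept + [(subject, relation, new_object)]


def build_prompt(question, facts):
    """Format retrieved triples as grounding context placed before the question."""
    lines = [f"- {s} {r.replace('_', ' ')} {o}" for s, r, o in facts]
    return "Answer using only these facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# The graph is edited, then re-queried; no model weights are touched.
fresh = update_fact(TRIPLES, "AcmeCorp", "ceo", "Sam Roe")
prompt = build_prompt("Who leads AcmeCorp?", retrieve_facts(build_index(fresh), "AcmeCorp"))
```

The point of the sketch is the last two lines: a single edit to the graph changes what the model is shown, which is the sense in which a graph can supply fresh facts without retraining.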

Core claim

Graphs help large language models by acting as up-to-date knowledge sources that reduce hallucinations, by enabling prompting techniques such as Chain-of-Thought, Tree-of-Thought, and Graph-of-Thought that improve reasoning, and by allowing integration of graph structures that extends LLM applicability to structured domains including e-commerce, code, and relational databases.
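One way to see why Chain-of-Thought, Tree-of-Thought, and Graph-of-Thought are grouped as graph-based prompting is to treat each intermediate thought as a node and each "derived from" dependency as a directed edge. The sketch below classifies such a structure by its shape. The function name `classify_topology` and the thought labels are hypothetical, and the input is assumed to be acyclic with no duplicate edges.

```python
from collections import Counter


def classify_topology(edges):
    """Classify a thought structure given (parent, child) dependency edges.

    chain: every thought has at most one parent and at most one child
    tree:  every thought has at most one parent, but some thought branches
    graph: some thought merges several parents (an aggregation step)
    """
    parent_count = Counter(child for _, child in edges)
    child_count = Counter(parent for parent, _ in edges)
    if any(n > 1 for n in parent_count.values()):
        return "graph"
    if any(n > 1 for n in child_count.values()):
        return "tree"
    return "chain"


# Hypothetical reasoning traces, written as (parent, child) pairs.
cot = [("t1", "t2"), ("t2", "t3")]                     # linear steps
tot = [("t1", "t2a"), ("t1", "t2b"), ("t2a", "t3")]    # one step branches
got = tot + [("t2b", "t3")]                            # two branches merge into t3
```

Under this reading the three techniques differ only in which edges are allowed: a chain forbids branching, a tree allows branching but not merging, and a general graph allows both.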

What carries the argument

The three perspectives of graph assistance: knowledge sourcing to combat hallucinations, graph-based prompting for reasoning, and structural integration for relational data.
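For the third perspective, a small sketch can show what "relational data as a graph" might mean in practice: rows become nodes and foreign keys become edges, which can then be serialized for a model. The table contents, the `FOREIGN_KEYS` mapping, and the helpers `tables_to_graph` and `linearize` are assumptions made for illustration; the paper does not prescribe this encoding.

```python
# Hypothetical e-commerce tables; every row is invented for illustration.
TABLES = {
    "customer": [{"id": 1, "name": "Ada"}],
    "product": [{"id": 10, "title": "Keyboard"}],
    "order": [{"id": 100, "customer_id": 1, "product_id": 10}],
}

# Foreign keys as (table, column) -> referenced table (assumed schema).
FOREIGN_KEYS = {
    ("order", "customer_id"): "customer",
    ("order", "product_id"): "product",
}


def tables_to_graph(tables, foreign_keys):
    """Turn rows into nodes keyed by (table, id) and foreign keys into edges."""
    nodes = {}
    for table, rows in tables.items():
        for row in rows:
            nodes[(table, row["id"])] = row
    edges = []
    for (table, column), target in foreign_keys.items():
        for row in tables[table]:
            edges.append(((table, row["id"]), column, (target, row[column])))
    return nodes, edges


def linearize(edges):
    """Serialize edges as text lines that could be placed in a prompt."""
    return [f"{src[0]}#{src[1]} --{col}--> {dst[0]}#{dst[1]}" for src, col, dst in edges]


nodes, edges = tables_to_graph(TABLES, FOREIGN_KEYS)
lines = linearize(edges)
```

Text serialization is only one option. The same `nodes` and `edges` could instead feed a graph encoder whose embeddings are passed to the model, which is a separate design choice this sketch does not cover.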

If this is right

  • LLMs can maintain accuracy on time-sensitive facts without full retraining.
  • Reasoning tasks that involve branching or relational paths become more tractable.
  • Models gain native support for domains that rely on tables, graphs, or code dependencies.
  • Future LLM designs may adopt sparse graph connections to lower compute needs.
  • Memory systems modeled on brain-like graph structures could emerge.
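The compute argument behind sparse graph connections can be illustrated with an attention mask derived from a graph. This is one possible reading of "sparse architectures based on graphs", not a design taken from the paper: the helper names and the ring-shaped toy graph are assumptions.

```python
def graph_attention_mask(n, edges):
    """Boolean n x n mask: position i may attend to j only if i == j
    or (i, j) is an edge of the (undirected) graph."""
    mask = [[i == j for j in range(n)] for i in range(n)]
    for i, j in edges:
        mask[i][j] = True
        mask[j][i] = True
    return mask


def attended_pairs(mask):
    """Count the (query, key) pairs for which a score would be computed."""
    return sum(sum(row) for row in mask)


# Toy graph: 6 positions connected in a ring (an illustrative assumption).
n = 6
ring = [(i, (i + 1) % n) for i in range(n)]
mask = graph_attention_mask(n, ring)

sparse_cost = attended_pairs(mask)  # self plus two neighbours per position
dense_cost = n * n                  # full attention scores every pair
```

With a bounded number of neighbours per position, the count of scored pairs grows linearly in `n` rather than quadratically. Whether accuracy survives that pruning is exactly the kind of open question the outlook leaves unanswered.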

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hybrid LLM-graph systems could set new standards for reliability in knowledge-intensive applications.
  • Benchmarks that jointly test textual fluency and structural consistency may become necessary.
  • Graph integration might allow smaller models to match larger ones on tasks that benefit from explicit relations.

Load-bearing premise

The assumption that adding graph components will produce clear performance gains in LLMs without creating new engineering burdens or unforeseen limitations.

What would settle it

A controlled test on factual question-answering benchmarks that shows no measurable drop in hallucination rate when LLMs are given access to an up-to-date graph knowledge source versus text-only retrieval.
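The comparison described here reduces to two hallucination rates and a check on whether their difference exceeds what chance would produce. The sketch below uses a pooled two-proportion z statistic as one simple option. The function names are hypothetical, the judged labels are placeholders rather than results, and a paired test may fit better when both conditions answer the same questions.

```python
import math


def hallucination_rate(labels):
    """labels: list of booleans, True where an answer was judged hallucinated."""
    return sum(labels) / len(labels)


def two_proportion_z(h1, n1, h2, n2):
    """Pooled z statistic for the difference between two proportions.

    h1, h2: number of hallucinated answers in each condition
    n1, n2: number of questions in each condition
    Undefined (division by zero) when the pooled rate is exactly 0 or 1.
    """
    p1, p2 = h1 / n1, h2 / n2
    pooled = (h1 + h2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Placeholder judgments for illustration only; these are not measurements.
text_only = [True, False, True, False]
graph_augmented = [True, False, False, False]

z = two_proportion_z(sum(text_only), len(text_only),
                     sum(graph_augmented), len(graph_augmented))
```

A statistic near zero on a suitably large benchmark would be the "no measurable drop" outcome that counts against the paper's first perspective.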

Original abstract

With the rapid advancement of large language models (LLMs), classic graph learning tasks have greatly benefited from LLMs, including improved encoding of textual features, more efficient construction of graphs from text, and enhanced reasoning over knowledge graphs. In this paper, we ask a complementary question: How can graphs help LLMs? We address this question from three perspectives: 1) graphs provide an up-to-date knowledge source that helps reduce LLM hallucinations, 2) graph-based prompting techniques, such as Chain-of-Thought (CoT), Tree-of-Thought (ToT), and Graph-of-Thought (GoT), enhance LLM reasoning capabilities, and 3) integrating graphs into LLMs improves their understanding of structured data, expanding their applicability to domains such as e-commerce, code, and relational databases (RDBs). We further outlook some future directions including designing sparse LLM architectures based on graphs and brain-inspired memory systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript is a position paper posing the complementary question of how graphs can help large language models (LLMs), in contrast to the more common use of LLMs to aid graph tasks. It addresses this via three forward-looking perspectives: (1) graphs as an up-to-date knowledge source to reduce hallucinations, (2) graph-based prompting techniques (e.g., Chain-of-Thought, Tree-of-Thought, Graph-of-Thought) to enhance reasoning, and (3) integration of graphs into LLMs to improve structured data understanding and expand applicability to domains like e-commerce, code, and relational databases. The paper concludes with an outlook on future directions including sparse graph-based LLM architectures and brain-inspired memory systems.

Significance. If the perspectives hold and are pursued in follow-on work, the paper could help steer research toward hybrid graph-LLM systems that address key LLM limitations in knowledge freshness and structured reasoning. Its primary contribution is the clear framing of a complementary research agenda rather than any new empirical results or formal derivations; this framing itself has value in highlighting underexplored synergies.

minor comments (3)
  1. [Abstract] The abstract introduces 'Graph-of-Thought (GoT)' without a brief definition or citation, which reduces accessibility for readers new to the prompting literature.
  2. [Perspectives] In the discussion of the three perspectives, the mechanisms (e.g., how graphs are dynamically updated or retrieved to mitigate hallucinations) are described at a high level only; adding one concrete illustrative example per perspective would strengthen the exposition without altering the position-paper nature.
  3. [Outlook] The future-directions paragraph lists sparse architectures and brain-inspired memory but does not identify any concrete open challenges or evaluation metrics that would help readers design follow-up experiments.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our position paper and the recommendation for minor revision. We are pleased that the complementary framing of how graphs can aid LLMs is viewed as a valuable contribution to steering future research on hybrid systems.

Circularity Check

0 steps flagged

No significant circularity; position paper with no derivations

Full rationale

This is a position paper that poses a complementary question and addresses it through three discursive perspectives plus an outlook on future directions. It contains no equations, formal derivations, fitted parameters, or performance claims that could reduce to self-referential inputs. The perspectives are framed as potential benefits drawn from external concepts rather than proven results, and no load-bearing self-citations or uniqueness theorems are invoked to force conclusions. As a forward-looking discussion, it makes no argument that reduces to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The position rests on domain assumptions about the effectiveness of graph-LLM synergies that are not tested or derived within the paper itself.

axioms (1)
  • domain assumption: Graphs can serve as reliable, up-to-date knowledge sources that reduce LLM hallucinations when integrated appropriately.
    Invoked in the first perspective; the abstract gives neither supporting evidence nor a detailed mechanism.

pith-pipeline@v0.9.0 · 5458 in / 1128 out tokens · 62268 ms · 2026-05-08T18:42:54.474729+00:00 · methodology
