pith. sign in

arxiv: 2502.09891 · v4 · submitted 2025-02-14 · 💻 cs.IR · cs.AI

ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation

Pith reviewed 2026-05-23 03:25 UTC · model grok-4.3

classification 💻 cs.IR cs.AI
keywords Retrieval-Augmented GenerationGraph RAGHierarchical ClusteringAttributed CommunitiesQuestion AnsweringToken EfficiencyLarge Language ModelsInformation Retrieval
0
0 comments X

The pith

ArchRAG retrieves from attributed communities in a hierarchical graph index to raise RAG accuracy and cut token use.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ArchRAG to fix two problems in graph-based retrieval-augmented generation: missing relevant details from the graph and spending too many tokens during retrieval. It augments each question with attributed communities formed by an LLM-based hierarchical clustering process, then indexes those communities in a layered structure that supports fast online selection of only the needed parts. If the method holds, question-answering systems can draw on rich graph relationships more precisely without inflating context length. A reader would care because many current approaches either overlook key entity links or overload the model with excess text at inference time.

Core claim

ArchRAG augments the question using attributed communities, introduces an LLM-based hierarchical clustering method to create those communities from the graph, builds a novel hierarchical index structure over the communities, and develops an effective online retrieval method, which together allow more accurate identification of relevant information while consuming fewer tokens than prior graph RAG systems.

What carries the argument

Attributed community, a group of graph entities sharing semantic attributes and link relationships, arranged in a hierarchical index that supports layered retrieval.

If this is right

  • Graph data can be used more efficiently in QA because only relevant communities reach the model.
  • Token consumption drops during online retrieval because the hierarchical index prunes irrelevant sections early.
  • Existing graph RAG methods that lack community structure or layering can be outperformed on both accuracy and cost.
  • The same indexing approach can scale to larger graphs by keeping retrieval local to the relevant hierarchy levels.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The technique might generalize to non-graph structured data if similar community detection can be applied to tables or trees.
  • Targeted context from communities could lower the rate of unsupported claims in generated answers.
  • Dynamic graphs might require periodic re-clustering, which could be tested as an extension.

Load-bearing premise

The LLM clustering step forms communities that stay both semantically coherent and directly relevant to the input question without introducing retrieval errors or extra token overhead.

What would settle it

Run ArchRAG and a baseline on the same graph QA dataset, then replace the LLM clustering with random partitioning and measure whether accuracy falls or token count rises.

Figures

Figures reproduced from arXiv: 2502.09891 by Shu Wang, Xilin Liu, Yingli Zhou, Yixiang Fang, Yuchi Ma.

Figure 1
Figure 1. Figure 1: The general workflow of graph-based RAG, which [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: ArchRAG consists of two phases: offline indexing and online retrieval. For the online retrieval phase, we show an [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Head-to-head win rates for abstract QA, comparing each row method against each column (higher is better). VR, LR, [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Comparison of query efficiency. • Efficiency of ArchRAG. We compare the time cost and token usage of ArchRAG with those of other baseline meth￾ods. As shown in [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: C-HNSW and Base-HNSW query efficiency. We also conducted experiments on a synthetic dataset of 1024-dimensional vectors, keeping all other parameters un￾changed. The results are shown in [PITH_FULL_IMAGE:figures/full_fig_p013_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: C-HNSW and Base-HNSW query efficiency on [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Comparison of indexing efficiency [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Community quality evaluated by CH Index. [PITH_FULL_IMAGE:figures/full_fig_p014_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Community quality evaluated by Cosine Similar [PITH_FULL_IMAGE:figures/full_fig_p014_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Comparative analysis of the different numbers of [PITH_FULL_IMAGE:figures/full_fig_p015_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Case study of responses by different RAG methods on a question from the Multihop-RAG dataset. [PITH_FULL_IMAGE:figures/full_fig_p017_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: ArchRAG Retrieval & Filtering Output. ... marks omitted irrelevant parts [PITH_FULL_IMAGE:figures/full_fig_p017_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: The prompt for generating abstract questions. [PITH_FULL_IMAGE:figures/full_fig_p018_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: The prompt for the evaluation of abstract QA. [PITH_FULL_IMAGE:figures/full_fig_p019_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: The prompt for adaptive filtering-based generation. [PITH_FULL_IMAGE:figures/full_fig_p020_15.png] view at source ↗
read the original abstract

Retrieval-Augmented Generation (RAG) has proven effective in integrating external knowledge into large language models (LLMs) for solving question-answer (QA) tasks. The state-of-the-art RAG approaches often use the graph data as the external data since they capture the rich semantic information and link relationships between entities. However, existing graph-based RAG approaches cannot accurately identify the relevant information from the graph and also consume large numbers of tokens in the online retrieval process. To address these issues, we introduce a novel graph-based RAG approach, called Attributed Community-based Hierarchical RAG (ArchRAG), by augmenting the question using attributed communities, and also introducing a novel LLM-based hierarchical clustering method. To retrieve the most relevant information from the graph for the question, we build a novel hierarchical index structure for the attributed communities and develop an effective online retrieval method. Experimental results demonstrate that ArchRAG outperforms existing methods in both accuracy and token cost.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes ArchRAG, a graph-based RAG framework for QA tasks. It augments questions using attributed communities derived via a novel LLM-based hierarchical clustering method on graph data, constructs a hierarchical index over these communities, and performs targeted online retrieval. The central claim is that this yields higher accuracy and lower token cost than prior graph RAG methods.

Significance. If the empirical superiority holds under rigorous controls, the work would offer a practical advance in reducing token overhead while improving relevance in graph-augmented LLM systems, addressing two recurring pain points in current RAG literature.

major comments (2)
  1. [§4] §4 (Experiments): the abstract and introduction assert outperformance on accuracy and token cost, yet no quantitative results, baselines, datasets, or statistical controls are referenced in the provided front matter; the central empirical claim therefore cannot be evaluated without the full experimental tables and methodology details.
  2. [§3.2] §3.2 (LLM-based hierarchical clustering): the method is load-bearing for the attributed-community construction, yet the description supplies no formal definition of community attribution, no guarantee against semantic drift across hierarchy levels, and no ablation isolating its contribution to the reported gains; this leaves the weakest assumption untested.
minor comments (2)
  1. [§3] Notation for the hierarchical index structure should be introduced with a single diagram or pseudocode block early in §3 to avoid repeated textual descriptions.
  2. [§3.3] The online retrieval algorithm description would benefit from a complexity analysis (time and token) to make the claimed efficiency gains explicit rather than qualitative.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment point by point below, indicating the revisions we will make to strengthen the manuscript.

read point-by-point responses
  1. Referee: [§4] §4 (Experiments): the abstract and introduction assert outperformance on accuracy and token cost, yet no quantitative results, baselines, datasets, or statistical controls are referenced in the provided front matter; the central empirical claim therefore cannot be evaluated without the full experimental tables and methodology details.

    Authors: The full manuscript contains Section 4 with the complete experimental details, including the specific datasets used, the baseline methods compared, quantitative tables reporting accuracy and token-cost metrics, and the statistical controls applied. To address the concern about traceability from the front matter, we will revise the introduction to include explicit forward references to the relevant tables and figures in Section 4. revision: yes

  2. Referee: [§3.2] §3.2 (LLM-based hierarchical clustering): the method is load-bearing for the attributed-community construction, yet the description supplies no formal definition of community attribution, no guarantee against semantic drift across hierarchy levels, and no ablation isolating its contribution to the reported gains; this leaves the weakest assumption untested.

    Authors: We agree that the current description of the LLM-based hierarchical clustering can be strengthened. In the revised manuscript we will add a formal mathematical definition of community attribution. We will also include a short analysis (or empirical check) of semantic consistency across hierarchy levels to address potential drift. Finally, we will add an ablation study that isolates the contribution of the hierarchical clustering component to the observed accuracy and token-cost improvements. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper presents an empirical system description of ArchRAG, a graph-based RAG pipeline using LLM hierarchical clustering into attributed communities, a hierarchical index, question augmentation, and online retrieval. No equations, fitted parameters, predictions derived from inputs, or self-citation chains appear in the abstract or described structure. The central claim of outperformance on accuracy and token cost is an experimental result, not a derivation that reduces to its own inputs by construction. This is the expected non-finding for a constructive engineering paper without mathematical self-reference.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations, no fitted constants, and no explicit background assumptions; the ledger is therefore empty.

pith-pipeline@v0.9.0 · 5699 in / 1056 out tokens · 21007 ms · 2026-05-23T03:25:31.562086+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. EHRAG: Bridging Semantic Gaps in Lightweight GraphRAG via Hybrid Hypergraph Construction and Retrieval

    cs.AI 2026-04 unverdicted novelty 6.0

    EHRAG constructs structural hyperedges from sentence co-occurrence and semantic hyperedges from entity embedding clusters, then applies hybrid diffusion plus topic-aware PPR to retrieve top-k documents, outperforming ...

  2. MG$^2$-RAG: Multi-Granularity Graph for Multimodal Retrieval-Augmented Generation

    cs.IR 2026-04 unverdicted novelty 6.0

    MG²-RAG proposes a multi-granularity graph RAG framework that constructs hierarchical multimodal nodes via entity-driven visual grounding and performs structured retrieval, delivering SOTA results on four multimodal t...

  3. Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

    cs.LG 2026-05 unverdicted novelty 5.0

    Ψ-RAG improves cross-document multi-hop QA performance using an adaptive hierarchical abstract tree and agent-powered hybrid retrieval, outperforming RAPTOR by 25.9% and HippoRAG 2 by 7.4% in average F1.

  4. G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge

    cs.AI 2025-09 unverdicted novelty 5.0

    G-reasoner uses QuadGraph abstraction and a 34M-parameter graph foundation model integrated with LLMs to enable scalable reasoning over diverse graph-structured knowledge, outperforming baselines on six benchmarks.

  5. LLM+Graph@VLDB'2025 Workshop Summary

    cs.DB 2026-04 unverdicted novelty 1.0

    The report summarizes key research directions, challenges, and solutions from the LLM+Graph workshop at VLDB 2025.

Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages · cited by 5 Pith papers · 15 internal anchors

  1. [1]

    , " * write output.state after.block = add.period write newline

    ENTRY address archivePrefix author booktitle chapter edition editor eid eprint howpublished institution isbn journal key month note number organization pages publisher school series title type volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.a...

  2. [2]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

  3. [3]

    Angelidis, S.; and Lapata, M. 2018. Summarizing opinions: Aspect extraction meets sentiment prediction and they are both weakly supervised. arXiv preprint arXiv:1808.08858

  4. [4]

    Asai, A.; Wu, Z.; Wang, Y.; Sil, A.; and Hajishirzi, H. 2023. Self-rag: Learning to retrieve, generate, and critique through self-reflection. arXiv preprint arXiv:2310.11511

  5. [5]

    Brown, T. B. 2020. Language models are few-shot learners. arXiv preprint arXiv:2005.14165

  6. [6]

    Cali \'n ski, T.; and Harabasz, J. 1974. A dendrite method for cluster analysis. Communications in Statistics-theory and Methods, 3(1): 1--27

  7. [7]

    Cao, Y.; Han, S.; Gao, Z.; Ding, Z.; Xie, X.; and Zhou, S. K. 2024. Graphinsight: Unlocking insights in large language models for graph structure understanding. arXiv preprint arXiv:2409.03258

  8. [8]

    Charikar, M. S. 2002. Similarity estimation techniques from rounding algorithms. In Proceedings of the thiry-fourth annual ACM symposium on Theory of computing, 380--388

  9. [9]

    Chen, N.; Li, Y.; Tang, J.; and Li, J. 2024 a . Graphwiz: An instruction-following language model for graph computational problems. In KDD

  10. [10]

    Chen, S.; He, Y.; Cui, W.; Fan, J.; Ge, S.; Zhang, H.; Zhang, D.; and Chaudhuri, S. 2024 b . Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations. Proceedings of the ACM on Management of Data, 2(3): 1--27

  11. [11]

    Chen, S.; Tang, N.; Fan, J.; Yan, X.; Chai, C.; Li, G.; and Du, X. 2023. Haipipe: Combining human-generated and machine-generated pipelines for data preparation. Proceedings of the ACM on Management of Data, 1(1): 1--26

  12. [12]

    Dubey, A.; Jauhri, A.; Pandey, A.; Kadian, A.; Al-Dahle, A.; Letman, A.; Mathur, A.; Schelten, A.; Yang, A.; Fan, A.; et al. 2024. The llama 3 herd of models. arXiv preprint arXiv:2407.21783

  13. [13]

    Edge, D.; Trinh, H.; Cheng, N.; Bradley, J.; Chao, A.; Mody, A.; Truitt, S.; and Larson, J. 2024. From local to global: A graph rag approach to query-focused summarization. arXiv preprint arXiv:2404.16130

  14. [14]

    Fan, W.; Ding, Y.; Ning, L.; Wang, S.; Li, H.; Yin, D.; Chua, T.-S.; and Li, Q. 2024. A survey on rag meeting llms: Towards retrieval-augmented large language models. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 6491--6501

  15. [15]

    Gao, Y.; Xiong, Y.; Gao, X.; Jia, K.; Pan, J.; Bi, Y.; Dai, Y.; Sun, J.; and Wang, H. 2023. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997

  16. [16]

    Ghimire, A.; Prather, J.; and Edwards, J. 2024. Generative AI in Education: A Study of Educators' Awareness, Sentiments, and Influencing Factors. arXiv preprint arXiv:2403.15586

  17. [17]

    Grover, A.; and Leskovec, J. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 855--864

  18. [18]

    Guo, Z.; Xia, L.; Yu, Y.; Ao, T.; and Huang, C. 2024. LightRAG: Simple and Fast Retrieval-Augmented Generation. arXiv e-prints, arXiv--2410

  19. [19]

    J.; Shu, Y.; Gu, Y.; Yasunaga, M.; and Su, Y

    Guti \'e rrez, B. J.; Shu, Y.; Gu, Y.; Yasunaga, M.; and Su, Y. 2024. HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models. arXiv preprint arXiv:2405.14831

  20. [20]

    Y.; Min, B.; and Castelli, V

    Han, R.; Zhang, Y.; Qi, P.; Xu, Y.; Wang, J.; Liu, L.; Wang, W. Y.; Min, B.; and Castelli, V. 2024. RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 4354--4374

  21. [21]

    G-retriever: Retrieval-augmented generation for textual graph understanding and question answering,

    He, X.; Tian, Y.; Sun, Y.; Chawla, N. V.; Laurent, T.; LeCun, Y.; Bresson, X.; and Hooi, B. 2024. G-retriever: Retrieval-augmented generation for textual graph understanding and question answering. arXiv preprint arXiv:2402.07630

  22. [22]

    Hu, Y.; and Lu, Y. 2024. Rag and rau: A survey on retrieval-augmented language model in natural language processing. arXiv preprint arXiv:2404.19543

  23. [23]

    Hu, Z.; Xu, Y.; Yu, W.; Wang, S.; Yang, Z.; Zhu, C.; Chang, K.-W.; and Sun, Y. 2022. Empowering language models with knowledge graph reasoning for question answering. arXiv preprint arXiv:2211.08380

  24. [24]

    Huang, Y.; and Huang, J. 2024. A Survey on Retrieval-Augmented Text Generation for Large Language Models. arXiv preprint arXiv:2404.10981

  25. [25]

    Huang, Y.; Zhang, S.; and Xiao, X. 2025. KET-RAG: A Cost-Efficient Multi-Granular Indexing Framework for Graph-RAG. arXiv preprint arXiv:2502.09304

  26. [26]

    J.; and Park, J

    Jeong, S.; Baek, J.; Cho, S.; Hwang, S. J.; and Park, J. C. 2024. Adaptive-rag: Learning to adapt retrieval-augmented large language models through question complexity. arXiv preprint arXiv:2403.14403

  27. [27]

    X.; and Wen, J.-R

    Jiang, J.; Zhou, K.; Dong, Z.; Ye, K.; Zhao, W. X.; and Wen, J.-R. 2023. Structgpt: A general framework for large language model to reason over structured data. arXiv preprint arXiv:2305.09645

  28. [28]

    M.; Melis, G.; and Grefenstette, E

    Ko c isk \`y , T.; Schwarz, J.; Blunsom, P.; Dyer, C.; Hermann, K. M.; Melis, G.; and Grefenstette, E. 2018. The narrativeqa reading comprehension challenge. Transactions of the Association for Computational Linguistics, 6: 317--328

  29. [29]

    S.; Reid, M.; Matsuo, Y.; and Iwasawa, Y

    Kojima, T.; Gu, S. S.; Reid, M.; Matsuo, Y.; and Iwasawa, Y. 2022. Large language models are zero-shot reasoners. Advances in neural information processing systems, 35: 22199--22213

  30. [30]

    Y.; Yun, S.; Lee, J.; Chacko, A.; Hou, B.; Duong-Tran, D.; Ding, Y.; et al

    Li, D.; Yang, S.; Tan, Z.; Baik, J. Y.; Yun, S.; Lee, J.; Chacko, A.; Hou, B.; Duong-Tran, D.; Ding, Y.; et al. 2024. DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature. arXiv preprint arXiv:2405.04819

  31. [31]

    Li, Y.; Wang, S.; Ding, H.; and Chen, H. 2023. Large language models in finance: A survey. In Proceedings of the fourth ACM international conference on AI in finance, 374--382

  32. [32]

    Li, Z.; Yuan, H.; Wang, H.; Cong, G.; and Bing, L. 2025. LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency. Proceedings of the VLDB Endowment, 1(18): 53--65

  33. [33]

    Liu, L.; Yang, X.; Lei, J.; Liu, X.; Shen, Y.; Zhang, Z.; Wei, P.; Gu, J.; Chu, Z.; Qin, Z.; et al. 2024 a . A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions. arXiv preprint arXiv:2406.03712

  34. [34]

    F.; Lin, K.; Hewitt, J.; Paranjape, A.; Bevilacqua, M.; Petroni, F.; and Liang, P

    Liu, N. F.; Lin, K.; Hewitt, J.; Paranjape, A.; Bevilacqua, M.; Petroni, F.; and Liang, P. 2024 b . Lost in the middle: How language models use long contexts. Transactions of the Association for Computational Linguistics, 12: 157--173

  35. [35]

    Luo, L.; Li, Y.-F.; Haffari, G.; and Pan, S. 2023. Reasoning on graphs: Faithful and interpretable large language model reasoning. arXiv preprint arXiv:2310.01061

  36. [36]

    Ma, S.; Xu, C.; Jiang, X.; Li, M.; Qu, H.; Yang, C.; Mao, J.; and Guo, J. 2024. Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation. arXiv preprint arXiv:2407.10805

  37. [37]

    A.; and Yashunin, D

    Malkov, Y. A.; and Yashunin, D. A. 2018. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE transactions on pattern analysis and machine intelligence, 42(4): 824--836

  38. [38]

    Mallen, A.; Asai, A.; Zhong, V.; Das, R.; Khashabi, D.; and Hajishirzi, H. 2022. When not to trust language models: Investigating effectiveness of parametric and non-parametric memories. arXiv preprint arXiv:2212.10511

  39. [39]

    Mavromatis, C.; and Karypis, G. 2024. GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning. arXiv preprint arXiv:2405.20139

  40. [40]

    A.; Ahmad, M

    Naeem, Z. A.; Ahmad, M. S.; Eltabakh, M.; Ouzzani, M.; and Tang, N. 2024. RetClean: Retrieval-Based Data Cleaning Using LLMs and Data Lakes. Proceedings of the VLDB Endowment, 17(12): 4421--4424

  41. [41]

    Narayan, A.; Chami, I.; Orr, L.; and R \'e , C. 2022. Can Foundation Models Wrangle Your Data? Proceedings of the VLDB Endowment, 16(4): 738--746

  42. [42]

    M.; Poor, H

    Nie, Y.; Kong, Y.; Dong, X.; Mulvey, J. M.; Poor, H. V.; Wen, Q.; and Zohren, S. 2024. A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges. arXiv preprint arXiv:2406.11903

  43. [43]

    Nomic Embed: Training a Reproducible Long Context Text Embedder

    Nussbaum, Z.; Morris, J. X.; Duderstadt, B.; and Mulyar, A. 2024. Nomic Embed: Training a Reproducible Long Context Text Embedder. arXiv:2402.01613

  44. [44]

    Peng, B.; Zhu, Y.; Liu, Y.; Bo, X.; Shi, H.; Hong, C.; Zhang, Y.; and Tang, S. 2024. Graph retrieval-augmented generation: A survey. arXiv preprint arXiv:2408.08921

  45. [45]

    Qian, Y.; He, Y.; Zhu, R.; Huang, J.; Ma, Z.; Wang, H.; Wang, Y.; Sun, X.; Lian, D.; Ding, B.; et al. 2024. UniDM: A Unified Framework for Data Manipulation with Large Language Models. Proceedings of Machine Learning and Systems, 6: 465--482

  46. [46]

    E.; and Walker, S

    Robertson, S. E.; and Walker, S. 1994. Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In SIGIR’94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, organised by Dublin City University, 232--241. Springer

  47. [47]

    Ruan, Y.; Fuhry, D.; and Parthasarathy, S. 2013. Efficient community detection in large networks using content and links. In Proceedings of the 22nd international conference on World Wide Web, 1089--1098

  48. [48]

    Sarthi, P.; Abdullah, S.; Tuli, A.; Khanna, S.; Goldie, A.; and Manning, C. D. 2024. Raptor: Recursive abstractive processing for tree-organized retrieval. arXiv preprint arXiv:2401.18059

  49. [49]

    Schick, T.; Dwivedi-Yu, J.; Dess \` , R.; Raileanu, R.; Lomeli, M.; Hambro, E.; Zettlemoyer, L.; Cancedda, N.; and Scialom, T. 2024. Toolformer: Language models can teach themselves to use tools. Advances in Neural Information Processing Systems, 36

  50. [50]

    Siriwardhana, S.; Weerasekera, R.; Wen, E.; Kaluarachchi, T.; Rana, R.; and Nanayakkara, S. 2023. Improving the domain adaptation of retrieval augmented generation (RAG) models for open domain question answering. Transactions of the Association for Computational Linguistics, 11: 1--17

  51. [51]

    Sun, J.; Xu, C.; Tang, L.; Wang, S.; Lin, C.; Gong, Y.; Shum, H.-Y.; and Guo, J. 2023. Think-on-graph: Deep and responsible reasoning of large language model with knowledge graph. arXiv preprint arXiv:2307.07697

  52. [52]

    Sun, Z.; Zhou, X.; and Li, G. 2024. R-Bot: An LLM-based Query Rewrite System. arXiv preprint arXiv:2412.01661

  53. [53]

    Tang, J.; Zhang, Q.; Li, Y.; and Li, J. 2024. Grapharena: Benchmarking large language models on graph computational problems. arXiv preprint arXiv:2407.00379

  54. [54]

    Tang, Y.; and Yang, Y. 2024. Multihop-rag: Benchmarking retrieval-augmented generation for multi-hop queries. arXiv preprint arXiv:2401.15391

  55. [55]

    A.; Waltman, L.; and Van Eck, N

    Traag, V. A.; Waltman, L.; and Van Eck, N. J. 2019. From Louvain to Leiden: guaranteeing well-connected communities. Scientific reports, 9(1): 1--12

  56. [56]

    Von Luxburg, U. 2007. A tutorial on spectral clustering. Statistics and computing, 17: 395--416

  57. [57]

    Wang, J.; Fu, J.; Wang, R.; Song, L.; and Bian, J. 2025. PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation. arXiv preprint arXiv:2501.11551

  58. [58]

    Wang, J.; Ning, H.; Peng, Y.; Wei, Q.; Tesfai, D.; Mao, W.; Zhu, T.; and Huang, R. 2024 a . A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations. arXiv preprint arXiv:2406.10303

  59. [59]

    Wang, K.; Duan, F.; Wang, S.; Li, P.; Xian, Y.; Yin, C.; Rong, W.; and Xiong, Z. 2023. Knowledge-driven cot: Exploring faithful reasoning in llms for knowledge-intensive question answering. arXiv preprint arXiv:2308.13259

  60. [60]

    Large language models for education: A survey and outlook

    Wang, S.; Xu, T.; Li, H.; Zhang, C.; Liang, J.; Tang, J.; Yu, P. S.; and Wen, Q. 2024 b . Large language models for education: A survey and outlook. arXiv preprint arXiv:2403.18105

  61. [61]

    A.; Siu, A.; Zhang, R.; and Derr, T

    Wang, Y.; Lipka, N.; Rossi, R. A.; Siu, A.; Zhang, R.; and Derr, T. 2024 c . Knowledge graph prompting for multi-document question answering. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 19206--19214

  62. [62]

    Wu, J.; Zhu, J.; Qi, Y.; Chen, J.; Xu, M.; Menolascina, F.; and Grau, V. 2024 a . Medical graph rag: Towards safe medical large language model via graph retrieval-augmented generation. arXiv preprint arXiv:2408.04187

  63. [63]

    Wu, S.; Xiong, Y.; Cui, Y.; Wu, H.; Chen, C.; Yuan, Y.; Huang, L.; Liu, X.; Kuo, T.-W.; Guan, N.; et al. 2024 b . Retrieval-augmented generation for natural language processing: A survey. arXiv preprint arXiv:2407.13193

  64. [64]

    Xu, S.; Pang, L.; Yu, M.; Meng, F.; Shen, H.; Cheng, X.; and Zhou, J. 2024. Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation. arXiv preprint arXiv:2402.18150

  65. [65]

    Xu, X.; Yuruk, N.; Feng, Z.; and Schweiger, T. A. 2007. Scan: a structural clustering algorithm for networks. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, 824--833

  66. [66]

    HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering

    Yang, Z.; Qi, P.; Zhang, S.; Bengio, Y.; Cohen, W. W.; Salakhutdinov, R.; and Manning, C. D. 2018. HotpotQA: A dataset for diverse, explainable multi-hop question answering. arXiv preprint arXiv:1809.09600

  67. [67]

    Yao, S.; Zhao, J.; Yu, D.; Du, N.; Shafran, I.; Narasimhan, K.; and Cao, Y. 2022. React: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629

  68. [68]

    Yu, H.; Gan, A.; Zhang, K.; Tong, S.; Liu, Q.; and Liu, Z. 2024. Evaluation of Retrieval-Augmented Generation: A Survey. arXiv preprint arXiv:2405.07437

  69. [69]

    K.; Fabbri, A.; Bernadett-Shapiro, G.; Zhang, R.; Mitra, P.; Xiong, C.; and Wu, C.-S

    Zhang, N.; Choubey, P. K.; Fabbri, A.; Bernadett-Shapiro, G.; Zhang, R.; Mitra, P.; Xiong, C.; and Wu, C.-S. 2024 a . SiReRAG: Indexing Similar and Related Information for Multihop Reasoning. arXiv preprint arXiv:2412.06206

  70. [70]

    Zhang, Q.; Hong, X.; Tang, J.; Chen, N.; Li, Y.; Li, W.; Tang, J.; and Li, J. 2024 b . Gcoder: Improving large language model for generalized graph problem solving. arXiv preprint arXiv:2410.19084

  71. [71]

    Zhao, P.; Zhang, H.; Yu, Q.; Wang, Z.; Geng, Y.; Fu, F.; Yang, L.; Zhang, W.; and Cui, B. 2024. Retrieval-augmented generation for ai-generated content: A survey. arXiv preprint arXiv:2402.19473

  72. [72]

    Zheng, Y.; Gan, W.; Chen, Z.; Qi, Z.; Liang, Q.; and Yu, P. S. 2024. Large language models for medicine: a survey. International Journal of Machine Learning and Cybernetics, 1--26

  73. [73]

    Zhou, Y.; Cheng, H.; and Yu, J. X. 2009. Graph clustering based on structural/attribute similarities. Proceedings of the VLDB Endowment, 2(1): 718--729

  74. [74]

    Zhu, Y.; Wang, X.; Chen, J.; Qiao, S.; Ou, Y.; Yao, Y.; Deng, S.; Chen, H.; and Zhang, N. 2024. Llms for knowledge graph construction and reasoning: Recent capabilities and future opportunities. World Wide Web, 27(5): 58