pith. machine review for the scientific record.

arxiv: 2604.20844 · v1 · submitted 2026-02-10 · 💻 cs.IR · cs.AI

Recognition: 1 theorem link · Lean Theorem

AtomicRAG: Atom-Entity Graphs for Retrieval-Augmented Generation

Authors on Pith: no claims yet

Pith reviewed 2026-05-16 03:34 UTC · model grok-4.3

classification 💻 cs.IR cs.AI
keywords Retrieval-Augmented Generation · GraphRAG · Atomic Facts · Knowledge Graphs · Personalized PageRank · Information Retrieval · Entity Linking

The pith

Knowledge broken into atomic facts and linked by simple existence edges in a graph improves retrieval accuracy and robustness over chunk-based RAG methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that current GraphRAG approaches suffer because they index coarse text chunks that bundle multiple facts together and rely on error-prone relation triples for connections. Instead, it stores knowledge as individual atomic facts, each a self-contained unit, and connects entities only by the existence of a link rather than specific relations. Personalized PageRank combined with relevance filtering then extracts reliable paths for a given query. If correct, this representation lets the system flexibly reassemble facts to match different query needs without the rigidity of chunks or the fragility of extracted triples. The claim is backed by theoretical analysis and experiments across five public benchmarks showing gains in both retrieval accuracy and downstream reasoning.

Core claim

The Atom-Entity Graph stores knowledge as discrete atomic facts rather than text chunks and uses edges that only record whether a relationship exists between entities. Personalized PageRank run on this graph, followed by relevance-based filtering, produces more accurate and complete retrieval sets for generation. Because atoms can be combined without interference from unrelated facts inside the same chunk, the method supports diverse query perspectives while avoiding propagation of relation-extraction mistakes that break reasoning paths in triple-based graphs.
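The retrieval machinery described here can be sketched in a few lines. This is a toy reconstruction, not the paper's code: the graph, the seed-entity choice, and the atom-prefix filter are all illustrative assumptions, and the paper's relevance filtering is prompt-based (see Figure 10) rather than the simple rank cut shown below.

```python
def personalized_pagerank(neighbors, seeds, alpha=0.85, iters=50):
    """Power iteration for PPR over an undirected existence-edge graph.
    `neighbors` maps node -> adjacent nodes; `seeds` are query-linked entities."""
    nodes = list(neighbors)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            share = alpha * rank[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        rank = nxt
    return rank

# Toy atom-entity graph: atom nodes A1-A3 link to the entities they mention;
# edges record only that a connection exists, never a relation label.
graph = {
    "A1": ["insulin", "pancreas"],
    "A2": ["pancreas", "enzyme"],
    "A3": ["enzyme"],
    "insulin": ["A1"],
    "pancreas": ["A1", "A2"],
    "enzyme": ["A2", "A3"],
}
scores = personalized_pagerank(graph, seeds={"insulin"})

# Stand-in for relevance filtering: keep atom nodes only, ranked by PPR mass.
atoms = sorted((n for n in scores if n.startswith("A")),
               key=scores.get, reverse=True)
```

Because the restart mass is pinned to the query's seed entities, atoms one hop from the query outrank atoms reachable only through longer paths, which is the behavior the core claim relies on.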

What carries the argument

The Atom-Entity Graph, in which each node holds one self-contained factual atom and each edge simply marks the existence of a connection, processed by personalized PageRank plus relevance filtering to select query-aligned paths.

If this is right

  • Retrieval can adapt to queries that require only a subset of facts from what was originally one chunk without losing precision.
  • Reasoning paths remain intact even when relation extraction would have produced a wrong triple.
  • Knowledge elements can be added or removed at atom granularity without rewriting entire chunks or rebuilding large parts of the graph.
  • The same index supports multiple downstream tasks that each need different combinations of facts from the source material.
  • Overall accuracy and robustness improve on standard RAG benchmarks when the atom-entity structure replaces chunk-triple graphs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method may make it easier to update the knowledge base incrementally, since only affected atoms need re-indexing.
  • It could reduce dependence on high-quality relation extraction models, shifting effort toward accurate atomic fact segmentation.
  • Similar atom-level decomposition might benefit other graph retrieval settings outside RAG, such as multi-hop question answering over documents.
  • If atom extraction quality is high, the approach could extend naturally to multimodal sources where each modality contributes separate atomic units.
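The incremental-update point above can be made concrete with a minimal index sketch. The `AtomIndex` class and its method names are hypothetical, not from the paper; the point is only that inserting or deleting one atom touches that atom's own entity links and nothing else.

```python
from collections import defaultdict

class AtomIndex:
    """Hypothetical atom-granularity index: one row per atom, one adjacency
    set per entity (the existence-only edges of the graph)."""
    def __init__(self):
        self.atom_entities = {}                # atom id -> entities it mentions
        self.entity_atoms = defaultdict(set)   # entity -> linked atom ids

    def add_atom(self, atom_id, entities):
        self.atom_entities[atom_id] = set(entities)
        for e in entities:
            self.entity_atoms[e].add(atom_id)  # only these rows change

    def remove_atom(self, atom_id):
        for e in self.atom_entities.pop(atom_id):
            self.entity_atoms[e].discard(atom_id)

idx = AtomIndex()
idx.add_atom("A1", ["insulin", "pancreas"])
idx.add_atom("A2", ["pancreas"])
idx.remove_atom("A1")   # A2 and its edges are untouched
```

A chunk-based index would instead have to re-split and re-embed the whole chunk containing the changed fact.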

Load-bearing premise

Decomposing text into individual atomic facts and connecting entities only by existence edges will preserve all necessary context and avoid introducing new extraction errors that offset the gains in flexibility.

What would settle it

Replace the atom decomposition step with either full original chunks or randomly split sentences while keeping the same graph construction and PageRank procedure; if retrieval accuracy and reasoning scores on the five benchmarks no longer exceed the chunk-based baselines, the advantage of atomic units is falsified.
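A minimal harness for that test might hold graph construction fixed and swap only the decomposition step. The decomposers and the keyword tagger below are stand-ins under that assumption, not the paper's pipeline; only the experimental control matters.

```python
import random

def atoms_decompose(doc):
    # Stand-in for a high-quality atomizer: one self-contained fact per unit.
    return [s.strip() for s in doc.split(".") if s.strip()]

def random_split_decompose(doc, rng=random.Random(0)):
    # Degraded control: cut the text at an arbitrary word boundary.
    words = doc.split()
    cut = rng.randint(1, len(words) - 1)
    return [" ".join(words[:cut]), " ".join(words[cut:])]

def build_graph(units, tag_entities):
    # Identical construction for every condition: unit node -> entity links.
    return {f"U{i}": tag_entities(u) for i, u in enumerate(units)}

doc = "Insulin is made in the pancreas. The pancreas also secretes enzymes."
tagger = lambda text: [w for w in ("insulin", "pancreas", "enzymes")
                       if w in text.lower()]

g_atoms = build_graph(atoms_decompose(doc), tagger)
g_random = build_graph(random_split_decompose(doc), tagger)
# Same construction, same tagger; only the unit boundaries differ, so any
# downstream score gap is attributable to the decomposition step alone.
```

Running the full benchmark suite over both graphs with the same PageRank and filtering settings would isolate exactly the variable the falsification test names.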

Figures

Figures reproduced from arXiv: 2604.20844 by Duanyang Yuan, Jian Huang, Ke Liang, Sihang Zhou, Siwei Wang, Xiaoshu Chen, Xinwang Liu, Yanning Hou.

Figure 1. Comparison of knowledge representation and indexing for three classes of methods. Native RAG uses coarse text chunks as basic storage units and indexes them via semantic similarity. GraphRAG organizes knowledge with triples or chunk-level nodes, building connections through relation edges to facilitate global indexing. The proposed Atom–Entity Graph instead represents the corpus with fine-grained knowledge…
Figure 2. Overview of AtomicRAG. During the preprocessing phase, we construct an unlabeled Atom–Entity Graph (AEG) that atomizes the corpus into minimal knowledge atoms linked via entities and co-occurrence relationships. Specifically, as illustrated in the figure, our co-occurrence relationships fall into three types: containment, relevance, and synonymy. At retrieval time, a complex query is optionally decomposed…
Figure 3. Semantic utility. LLM-based assessment of 1-hop graph neighborhoods on Graph-Bench (Medical) with respect to correctness, relevance, consistency, redundancy, and comprehensiveness.
Figure 4. Impact of the retrieval Top-k hyperparameter on answer accuracy and token length: Top-k specifies how many knowledge atoms AtomicRAG retrieves per query, and token length is the total number of tokens in the LLM input formed by the question and the retrieved atoms.
Figure 5. Accuracy under limited context lengths: each point is evaluated with a fixed context budget, defined as the maximum number of tokens permitted in the LLM input, and all methods are truncated to this budget before generation.
Figure 6. Case study.
Figure 7. Prompt template for named entity recognition (NER).
Figure 8. Prompt template for unified triple and knowledge atom extraction.
Figure 9. Prompt template for question complexity scoring and atomic decomposition.
Figure 10. Prompt template for knowledge atom filtering.
Figure 11. Prompt template for abstract question answering.
Figure 12. Prompt template for precise question answering.
Original abstract

Recent GraphRAG methods integrate graph structures into text indexing and retrieval, using knowledge graph triples to connect text chunks, thereby improving retrieval coverage and precision. However, we observe that treating text chunks as the basic unit of knowledge representation rigidly groups multiple atomic facts together, limiting the flexibility and adaptability needed to support diverse retrieval scenarios. Additionally, triple-based entity linking is sensitive to relation-extraction errors, which can lead to missing or incorrect reasoning paths and ultimately hurt retrieval accuracy. To address these issues, we propose the Atom-Entity Graph, a more precise and reliable architecture for knowledge representation and indexing. In our approach, knowledge is stored as knowledge atoms, namely individual, self-contained units of factual information, rather than coarse-grained text chunks. This allows knowledge elements to be flexibly reassembled without mutual interference, thereby enabling seamless alignment with diverse query perspectives. Edges between entities simply indicate whether a relationship exists. By combining personalized PageRank with relevance-based filtering, we maintain accurate entity connections and improve the reliability of reasoning. Theoretical analysis and experiments on five public benchmarks show that the proposed AtomicRAG algorithm outperforms strong RAG baselines in retrieval accuracy and reasoning robustness. Code: https://github.com/7HHHHH/AtomicRAG.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance; this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes AtomicRAG, which represents knowledge as fine-grained, self-contained 'knowledge atoms' (individual factual units) linked by simple existence-only edges between entities, rather than coarse text chunks or relation triples. Retrieval combines personalized PageRank with relevance-based filtering to produce more flexible and accurate results. The central claim is that this architecture improves retrieval accuracy and reasoning robustness over strong RAG baselines, supported by theoretical analysis and experiments on five public benchmarks.

Significance. If the empirical gains and theoretical arguments hold under scrutiny, the work could meaningfully advance GraphRAG by reducing relation-extraction errors and chunk rigidity, enabling more adaptable retrieval for diverse queries. The public code release supports reproducibility and potential adoption.

major comments (3)
  1. [§3] §3 (Method): The atom extraction procedure is described at a high level but lacks a concrete algorithm, prompt template, or validation metric for atom completeness. Without this, it is impossible to assess whether the claimed flexibility gains come at the cost of omitted qualifiers or merged facts, directly affecting the central accuracy claim.
  2. [§4] §4 (Experiments): The results on the five benchmarks report outperformance but provide no details on baseline implementations, hyperparameter controls, statistical significance, or variance across runs. An ablation isolating the contribution of relevance filtering versus PPR is also missing, leaving open whether the gains are robust or artifactual.
  3. [Theoretical Analysis] Theoretical Analysis (referenced in abstract and §5): The analysis is invoked to explain why existence edges plus PPR improve reasoning paths, yet no key lemmas, assumptions, or proof sketches appear. This weakens the ability to evaluate whether the architecture genuinely mitigates the triple-error and chunk-rigidity problems identified in the introduction.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'strong RAG baselines' should explicitly name the compared methods (e.g., standard GraphRAG, HippoRAG) to allow immediate context for the claimed gains.
  2. [§3.2] Figure 2 or §3.2: The atom-entity graph visualization would benefit from an example showing how a multi-fact sentence is split into atoms and re-linked, to clarify context preservation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We appreciate the referee's thorough review and valuable suggestions. We agree with the points raised and will make the necessary revisions to enhance the manuscript's clarity, reproducibility, and rigor.

Point-by-point responses
  1. Referee: [§3] §3 (Method): The atom extraction procedure is described at a high level but lacks a concrete algorithm, prompt template, or validation metric for atom completeness. Without this, it is impossible to assess whether the claimed flexibility gains come at the cost of omitted qualifiers or merged facts, directly affecting the central accuracy claim.

    Authors: We agree that additional details are needed for the atom extraction procedure. In the revised version, we will provide the complete algorithm in pseudocode, the full prompt template used for extracting knowledge atoms from text, and a validation approach involving manual inspection of atom quality on sampled documents to ensure completeness and avoid merging or omitting facts. This will allow readers to better evaluate the flexibility gains. revision: yes

  2. Referee: [§4] §4 (Experiments): The results on the five benchmarks report outperformance but provide no details on baseline implementations, hyperparameter controls, statistical significance, or variance across runs. An ablation isolating the contribution of relevance filtering versus PPR is also missing, leaving open whether the gains are robust or artifactual.

    Authors: We acknowledge the lack of experimental details. The revised manuscript will include full specifications of baseline implementations (with references to their original papers and our re-implementations), all hyperparameter values and tuning procedures, statistical significance tests (e.g., paired t-tests with p-values), and standard deviations from multiple runs. Additionally, we will add an ablation study that isolates the effects of relevance filtering and personalized PageRank to demonstrate the contribution of each component. revision: yes

  3. Referee: [Theoretical Analysis] Theoretical Analysis (referenced in abstract and §5): The analysis is invoked to explain why existence edges plus PPR improve reasoning paths, yet no key lemmas, assumptions, or proof sketches appear. This weakens the ability to evaluate whether the architecture genuinely mitigates the triple-error and chunk-rigidity problems identified in the introduction.

    Authors: The theoretical analysis section will be expanded in the revision. We will include explicit assumptions (such as the independence of atomic facts and the connectivity properties of existence-only edges), a key lemma regarding reduced error propagation in reasoning paths compared to triple-based graphs, and a proof sketch demonstrating how personalized PageRank enhances path reliability. This will directly address how the architecture mitigates the issues of relation-extraction errors and chunk rigidity. revision: yes
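The paired significance testing promised in the second response can be sketched directly. The per-benchmark scores below are invented for illustration only; with five benchmarks the test has df = 4, and the two-tailed 0.05 critical value is about 2.776.

```python
import math

def paired_t(xs, ys):
    """t statistic for paired samples (method minus baseline, per benchmark)."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Hypothetical per-benchmark accuracies, one pair per benchmark.
atomicrag = [0.71, 0.64, 0.58, 0.69, 0.62]
baseline  = [0.66, 0.61, 0.55, 0.63, 0.60]
t = paired_t(atomicrag, baseline)   # compare against t(0.975, df=4) ≈ 2.776
```

Pairing by benchmark removes between-benchmark variance from the comparison, which is why the rebuttal's proposal is stronger than pooling all scores into two independent samples.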

Circularity Check

0 steps flagged

No circularity: claims rest on external benchmarks and independent theoretical analysis

Full rationale

The paper defines AtomicRAG via atom-entity graphs with existence-only edges, personalized PageRank, and relevance filtering, then supports its superiority through theoretical analysis plus direct empirical comparison on five public benchmarks. No equations, fitted parameters, or predictions are presented that reduce by construction to the input data or to self-citations; the atom extraction and edge simplification steps are architectural choices whose performance is measured externally rather than assumed. Self-citations, if present, are not load-bearing for the central accuracy claim. The derivation chain is therefore self-contained and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

The central claim rests on the unstated premise that reliable extraction of atomic facts is feasible and that simple existence edges plus PageRank filtering suffice for robust reasoning paths; no free parameters or invented entities beyond the new graph structure are described in the abstract.

invented entities (1)
  • knowledge atom: no independent evidence
    purpose: self-contained unit of factual information for flexible reassembly
    Introduced as the basic representation unit replacing text chunks

pith-pipeline@v0.9.0 · 5535 in / 1092 out tokens · 193488 ms · 2026-05-16T03:34:55.049416+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · 9 internal anchors

  1. [1]

    Agarwal, O. S. et al. gpt-oss-120b&gpt-oss-20b model card. 2025

  2. [2]

    Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

    Asai, A., Wu, Z., Wang, Y., Sil, A., and Hajishirzi, H. Self-rag: Learning to retrieve, generate, and critique through self-reflection. ArXiv, abs/2310.11511, 2023

  3. [3]

    You don't need pre-built graphs for rag: Retrieval augmented generation with adaptive reasoning structures

    Chen, S., Zhou, C., Yuan, Z., Zhang, Q., Cui, Z., Chen, H., Xiao, Y., Cao, J., and Huang, X. You don't need pre-built graphs for rag: Retrieval augmented generation with adaptive reasoning structures. ArXiv, abs/2508.06105, 2025

  4. [4]

    Dense X Retrieval: What Retrieval Granularity Should We Use?

    Chen, T., Wang, H., Chen, S., Yu, W., Ma, K., Zhao, X., Yu, D., and Zhang, H. Dense x retrieval: What retrieval granularity should we use? In Conference on Empirical Methods in Natural Language Processing, 2023

  5. [5]

    Fastgraphrag: High-speed graph-based retrieval-augmented generation

    CircleMind-AI. Fastgraphrag: High-speed graph-based retrieval-augmented generation. CircleMind-AI Blog, 2024

  6. [6]

    Darren Edge, Ha Trinh, J. L. Lazygraphrag: Setting a new standard for quality and cost. Microsoft Blog, 2024

  7. [7]

    From Local to Global: A Graph RAG Approach to Query-Focused Summarization

    Edge, D., Trinh, H., Cheng, N., Bradley, J., Chao, A., Mody, A. N., Truitt, S., and Larson, J. From local to global: A graph rag approach to query-focused summarization. ArXiv, abs/2404.16130, 2024

  8. [8]

    Precise zero-shot dense retrieval without relevance labels

    Gao, L., Ma, X., Lin, J. J., and Callan, J. Precise zero-shot dense retrieval without relevance labels. In Annual Meeting of the Association for Computational Linguistics, 2022

  9. [9]

    Retrieval-Augmented Generation for Large Language Models: A Survey

    Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y., Sun, J., Guo, Q., Wang, M., and Wang, H. Retrieval-augmented generation for large language models: A survey. ArXiv, abs/2312.10997, 2023

  10. [10]

    DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

    Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025 a

  11. [11]

    Routerag: Efficient retrieval-augmented generation from text and graph via reinforcement learning

    Guo, Y., Su, M., Guan, S., Sun, Z., Jin, X., Guo, J., and Cheng, X. Routerag: Efficient retrieval-augmented generation from text and graph via reinforcement learning. 2025 b

  12. [12]

    LightRAG: Simple and Fast Retrieval-Augmented Generation

    Guo, Z., Xia, L., Yu, Y., Ao, T., and Huang, C. Lightrag: Simple and fast retrieval-augmented generation. ArXiv, abs/2410.05779, 2024

  13. [13]

    HippoRAG: Neurobiologically inspired long-term memory for large language models

    Gutierrez, B. J., Shu, Y., Gu, Y., Yasunaga, M., and Su, Y. Hipporag: Neurobiologically inspired long-term memory for large language models. ArXiv, abs/2405.14831, 2024

  14. [14]

    From RAG to memory: Non-parametric continual learning for large language models

    Gutiérrez, B. J., Shu, Y., Qi, W., Zhou, S., and Su, Y. From rag to memory: Non-parametric continual learning for large language models. ArXiv, abs/2502.14802, 2025

  15. [15]

    Constructing a multi-hop QA dataset for comprehensive evaluation of reasoning steps

    Ho, X., Nguyen, A., Sugawara, S., and Aizawa, A. Constructing a multi-hop qa dataset for comprehensive evaluation of reasoning steps. ArXiv, abs/2011.01060, 2020

  16. [16]

    Soft reasoning paths for knowledge graph completion

    Hou, Y., Zhou, S., Liang, K., Meng, L., Chen, X., Xu, K., Wang, S., Liu, X., and Huang, J. Soft reasoning paths for knowledge graph completion. In International Joint Conference on Artificial Intelligence, 2025

  17. [17]

    Grag: Graph retrieval-augmented generation

    Hu, Y., Lei, Z., Zhang, Z., Pan, B., Ling, C., and Zhao, L. Grag: Graph retrieval-augmented generation. ArXiv, abs/2405.16506, 2024

  18. [18]

    Ket-rag: A cost-efficient multi-granular indexing framework for graph-rag

    Huang, Y., Zhang, S., and Xiao, X. Ket-rag: A cost-efficient multi-granular indexing framework for graph-rag. arXiv preprint arXiv:2502.09304, 2025

  19. [19]

    Leveraging passage retrieval with generative models for open domain question answering

    Izacard, G. and Grave, E. Leveraging passage retrieval with generative models for open domain question answering. ArXiv, abs/2007.01282, 2020

  20. [20]

    Llmlingua: Compressing prompts for accelerated inference of large language models

    Jiang, H., Wu, Q., Lin, C.-Y., Yang, Y., and Qiu, L. Llmlingua: Compressing prompts for accelerated inference of large language models. In Conference on Empirical Methods in Natural Language Processing, 2023

  21. [21]

    Dense Passage Retrieval for Open-Domain Question Answering

    Karpukhin, V., Oğuz, B., Min, S., Lewis, P., Wu, L. Y., Edunov, S., Chen, D., and tau Yih, W. Dense passage retrieval for open-domain question answering. ArXiv, abs/2004.04906, 2020

  22. [22]

    Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

    Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., tau Yih, W., Rocktäschel, T., Riedel, S., and Kiela, D. Retrieval-augmented generation for knowledge-intensive nlp tasks. ArXiv, abs/2005.11401, 2020

  23. [23]

    Structrag: Boosting knowledge intensive reasoning of llms via inference-time hybrid information structurization

    Li, Z., Chen, X., Yu, H., Lin, H., Lu, Y., Tang, Q., Huang, F., Han, X., Sun, L., and Li, Y. Structrag: Boosting knowledge intensive reasoning of llms via inference-time hybrid information structurization. ArXiv, abs/2410.08815, 2024

  24. [24]

    HyperGraphRAG: Retrieval-augmented generation via hypergraph-structured knowledge representation

    Luo, H., Chen, G., Zheng, Y., Wu, X., Guo, Y., Lin, Q., Feng, Y., Kuang, Z., Song, M., Zhu, Y., et al. Hypergraphrag: Retrieval-augmented generation via hypergraph-structured knowledge representation. arXiv preprint arXiv:2503.21322, 2025 a

  25. [25]

    Luo, H., Haihong, E., Chen, G., Lin, Q., Guo, Y., Xu, F., min Kuang, Z., Song, M., Wu, X., Zhu, Y., and Luu, A. T. Graph-r1: Towards agentic graphrag framework via end-to-end reinforcement learning. ArXiv, abs/2507.21892, 2025 b

  26. [26]

    Gfm-rag: Graph foundation model for retrieval augmented generation

    Luo, L., Zhao, Z., Haffari, G., Phung, D., Gong, C., and Pan, S. Gfm-rag: Graph foundation model for retrieval augmented generation. ArXiv, abs/2502.01113, 2025 c

  27. [27]

    GNN-RAG: Graph neural retrieval for efficient large language model reasoning on knowledge graphs

    Mavromatis, C. and Karypis, G. Gnn-rag: Graph neural retrieval for efficient large language model reasoning on knowledge graphs. In Annual Meeting of the Association for Computational Linguistics, 2025

  28. [28]

    Graph retrieval-augmented generation: A survey

    Peng, B., Zhu, Y., Liu, Y., Bo, X., Shi, H., Hong, C., Zhang, Y., and Tang, S. Graph retrieval-augmented generation: A survey. ACM Transactions on Information Systems, 2024

  29. [29]

    MemoRAG: Moving towards next-gen RAG via memory-inspired knowledge discovery

    Qian, H., Zhang, P., Liu, Z., Mao, K., and Dou, Z. Memorag: Moving towards next-gen rag via memory-inspired knowledge discovery. arXiv preprint arXiv:2409.05591, 2024

  30. [30]

    Sarthi, P., Abdullah, S., Tuli, A., Khanna, S., Goldie, A., and Manning, C. D. Raptor: Recursive abstractive processing for tree-organized retrieval. ArXiv, abs/2401.18059, 2024

  31. [31]

    Enhancing retrieval-augmented large language models with iterative retrieval-generation synergy

    Shao, Z., Gong, Y., Shen, Y., Huang, M., Duan, N., and Chen, W. Enhancing retrieval-augmented large language models with iterative retrieval-generation synergy. ArXiv, abs/2305.15294, 2023

  32. [32]

    ♫ musique: Multihop questions via single-hop question composition

    Trivedi, H., Balasubramanian, N., Khot, T., and Sabharwal, A. ♫ musique: Multihop questions via single-hop question composition. Transactions of the Association for Computational Linguistics, 10:539–554, 2021

  33. [33]

    Interleaving retrieval with chain-of-thought reasoning for knowledge-intensive multi-step questions

    Trivedi, H., Balasubramanian, N., Khot, T., and Sabharwal, A. Interleaving retrieval with chain-of-thought reasoning for knowledge-intensive multi-step questions. ArXiv, abs/2212.10509, 2022

  34. [34]

    Research on the construction and application of retrieval enhanced generation (rag) model based on knowledge graph

    Wang, S., Yang, H., and Liu, W. Research on the construction and application of retrieval enhanced generation (rag) model based on knowledge graph. Scientific Reports, 15, 2025

  35. [35]

    Knowledge graph prompting for multi-document question answering

    Wang, Y., Lipka, N., Rossi, R. A., Siu, A. F., Zhang, R., and Derr, T. Knowledge graph prompting for multi-document question answering. In AAAI Conference on Artificial Intelligence, 2023

  36. [36]

    When to use graphs in rag: A comprehensive analysis for graph retrieval-augmented generation

    Xiang, Z., Wu, C., Zhang, Q., Chen, S., Hong, Z., Huang, X., and Su, J. When to use graphs in rag: A comprehensive analysis for graph retrieval-augmented generation. ArXiv, abs/2506.05690, 2025

  37. [37]

    Answering complex open-domain questions with multi-hop dense retrieval

    Xiong, W., Li, X. L., Iyer, S., Du, J., Lewis, P., Wang, W. Y., Mehdad, Y., tau Yih, W., Riedel, S., Kiela, D., and Oğuz, B. Answering complex open-domain questions with multi-hop dense retrieval. ArXiv, abs/2009.12756, 2020

  38. [38]

    Qwen2.5 Technical Report

    Yang, Q. A., Yang, B., Zhang, B., et al. Qwen2.5 technical report. ArXiv, abs/2412.15115, 2024

  39. [39]

    HotpotQA: A dataset for diverse, explainable multi-hop question answering

    Yang, Z., Qi, P., Zhang, S., Bengio, Y., Cohen, W. W., Salakhutdinov, R., and Manning, C. D. Hotpotqa: A dataset for diverse, explainable multi-hop question answering. 2018

  40. [40]

    A survey of graph retrieval-augmented generation for customized large language models

    Zhang, Q., Chen, S., Bei, Y.-Q., Yuan, Z., Zhou, H., Hong, Z., Dong, J., Chen, H., Chang, Y., and Huang, X. A survey of graph retrieval-augmented generation for customized large language models. ArXiv, abs/2501.13958, 2025
