pith. machine review for the scientific record.

arxiv: 2604.12610 · v1 · submitted 2026-04-14 · 💻 cs.CL

Recognition: unknown

Transforming External Knowledge into Triplets for Enhanced Retrieval in RAG of LLMs

Pith reviewed 2026-05-10 15:40 UTC · model grok-4.3

keywords Retrieval-Augmented Generation · Large Language Models · Knowledge Triplets · Structured Retrieval · Prompt-based Adaptation · Context Efficiency

The pith

Tri-RAG converts external knowledge into Condition-Proof-Conclusion triplets to raise retrieval precision while lowering token costs in LLM generation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Current RAG systems pull raw text passages and paste them into the context, which often adds irrelevant material, inflates token counts, and breaks logical flow during reasoning. The paper shows that a lightweight prompt can turn any natural-language knowledge base into fixed triplets of Condition, Proof, and Conclusion without retraining the model. The Condition phrase then serves as the sole retrieval key, so the system matches queries only to the relevant logical unit rather than whole paragraphs. If the transformation preserves the original relations, downstream generation should become both more accurate and cheaper in tokens. Readers would care because this directly attacks the practical limits of context length and hallucination control in real deployments.

Core claim

Tri-RAG automatically transforms external knowledge from natural language into standardized structured triplets consisting of Condition, Proof, and Conclusion, explicitly capturing logical relations among knowledge fragments using lightweight prompt-based adaptation with frozen model parameters. The triplet head Condition is treated as an explicit semantic anchor for retrieval and matching, enabling precise identification of query-relevant knowledge units without directly concatenating lengthy raw texts.

What carries the argument

The Condition-Proof-Conclusion triplet, in which the leading Condition phrase functions as the semantic anchor for query matching.
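
To make the anchor mechanism concrete, here is a minimal Python sketch of Condition-keyed retrieval. It is an illustration under stated assumptions, not the paper's implementation: the `Triplet` class, the `retrieve` and `build_context` helpers, the two example triplets, and the bag-of-words cosine score (a stand-in for whatever similarity model the authors use) are all hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    # Hypothetical container for one Condition-Proof-Conclusion unit.
    condition: str
    proof: str
    conclusion: str


def bow(text):
    # Bag-of-words vector; an assumed stand-in for a real embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, triplets, k=1):
    # Score the query against the Condition field only, never Proof or Conclusion.
    q = bow(query)
    ranked = sorted(triplets, key=lambda t: cosine(q, bow(t.condition)), reverse=True)
    return ranked[:k]


def build_context(hits):
    # Render the matched triplets as compact context instead of raw passages.
    return "\n".join(
        f"If {t.condition}, then {t.conclusion}, because {t.proof}." for t in hits
    )


# Made-up knowledge base for illustration only.
KB = [
    Triplet(
        condition="a triangle has a right angle",
        proof="the Pythagorean theorem",
        conclusion="the square of the hypotenuse equals the sum of the squares of the other two sides",
    ),
    Triplet(
        condition="a number is divisible by 6",
        proof="6 = 2 * 3",
        conclusion="it is divisible by both 2 and 3",
    ),
]
QUERY = "What holds for a triangle with a right angle?"

print(build_context(retrieve(QUERY, KB)))
```

Because only `condition` is scored, the Proof and Conclusion text adds nothing to the search key; it is attached to the context only after a match, which is the source of the claimed token savings.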

If this is right

  • Retrieval precision rises because queries are matched against Condition phrases and only the matching triplet is fetched, rather than whole passages.
  • Token consumption drops while semantic alignment between query and evidence improves.
  • Reasoning chains stay intact because the Proof and Conclusion remain explicitly linked to each Condition.
  • Generation stability increases in complex tasks as redundant context is removed.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same triplet format could be applied to non-RAG tasks such as multi-document summarization or knowledge-base question answering to enforce explicit logical structure.
  • A direct test would compare end-to-end accuracy on datasets containing conflicting or ambiguous knowledge fragments to see whether the forced triplet format surfaces inconsistencies that raw-text RAG hides.
  • If the prompt occasionally produces malformed triplets on out-of-domain text, a lightweight verification step that checks for missing Proof or Conclusion fields could be added without changing the frozen model.
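
The verification step in the last bullet could be as small as the following Python sketch. It assumes, purely for illustration, that each extracted triplet arrives as a dict with `condition`, `proof`, and `conclusion` string fields; the paper does not specify a storage format, and `validate_triplet` is a hypothetical helper name.

```python
REQUIRED_FIELDS = ("condition", "proof", "conclusion")


def validate_triplet(record):
    """Return a list of problems; an empty list means the triplet is well-formed.

    Assumes `record` is a dict produced by the extraction prompt (hypothetical format).
    """
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat absent, non-string, and whitespace-only values as missing.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems
```

A check like this only catches structurally malformed output. It says nothing about whether the Proof actually supports the Conclusion, which is the harder fidelity question.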

Load-bearing premise

That lightweight prompt-based adaptation with frozen model parameters can reliably and consistently transform arbitrary natural language knowledge into standardized Condition-Proof-Conclusion triplets that capture logical relations without introducing errors or losing information.

What would settle it

Measure the logical fidelity of the generated triplets against human-annotated ground-truth relations on a multi-hop reasoning benchmark; if the fraction of triplets that drop, invert, or add incorrect logical links exceeds the improvement margin shown by Tri-RAG over baseline RAG, the performance gains disappear.
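
The comparison described above can be written down as a short Python sketch. Everything here is hypothetical: the annotation labels (`ok`, `dropped`, `inverted`, `added`), the function names, and the assumption that benchmark scores and the faulty fraction are expressed on the same 0 to 1 scale.

```python
# Labels a human annotator might assign to each generated triplet (assumed scheme).
FAULTY = {"dropped", "inverted", "added"}


def faulty_fraction(labels):
    # Share of triplets whose logical links were dropped, inverted, or invented.
    if not labels:
        raise ValueError("need at least one annotated triplet")
    return sum(label in FAULTY for label in labels) / len(labels)


def gains_survive(labels, tri_rag_score, baseline_score):
    # The gains are treated as credible only if the extraction error rate
    # does not exceed the improvement margin over baseline RAG.
    margin = tri_rag_score - baseline_score
    return faulty_fraction(labels) <= margin
```

For instance, with one inverted triplet in ten annotated ones and made-up scores of 0.70 for Tri-RAG against 0.65 for the baseline, the faulty fraction of 0.1 exceeds the 0.05 margin and `gains_survive` returns `False`.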

Figures

Figures reproduced from arXiv: 2604.12610 by Caiyan Qin, Chang Lu, Chaoning Zhang, Hengtao Shen, Qigan Sun, Sheng Zheng, Xudong Wang, Yang Yang, Zeyu Ma, Zhenzhen Huang.

Figure 1. Triplet Matching for Efficient Reasoning. The input query is …
Figure 2. Soft Prompt–Driven Triplet Structuring and Retrieval-Augmented Inference. The upper part shows how soft prompts guide frozen LLMs to extract …
Figure 3. Retrieval-key ablation comparing ALL retrieval using the concatenation [Tc; Tp; Tr] and single-field retrieval using TC, TP, or TR. We report Hit@1 together with index-level retrieval latency and memory footprint.
… is evaluated once, and the “±” in Table V denotes the standard deviation across evaluated instances. For each evaluation instance, the judge receives the source passage, the extracted triplet s…
Original abstract

Retrieval-Augmented Generation (RAG) mitigates hallucination in large language models (LLMs) by incorporating external knowledge during generation. However, the effectiveness of RAG depends not only on the design of the retriever and the capacity of the underlying model, but also on how retrieved evidence is structured and aligned with the query. Existing RAG approaches typically retrieve and concatenate unstructured text fragments as context, which often introduces redundant or weakly relevant information. This practice leads to excessive context accumulation, reduced semantic alignment, and fragmented reasoning chains, thereby degrading generation quality while increasing token consumption. To address these challenges, we propose Tri-RAG, a structured triplet-based retrieval framework that improves retrieval efficiency through reasoning-aligned context construction. Tri-RAG automatically transforms external knowledge from natural language into standardized structured triplets consisting of Condition, Proof, and Conclusion, explicitly capturing logical relations among knowledge fragments using lightweight prompt-based adaptation with frozen model parameters. Building on this representation, the triplet head Condition is treated as an explicit semantic anchor for retrieval and matching, enabling precise identification of query-relevant knowledge units without directly concatenating lengthy raw texts. As a result, Tri-RAG achieves a favorable balance between retrieval accuracy and context token efficiency. Experimental results across multiple benchmark datasets demonstrate that Tri-RAG significantly improves retrieval quality and reasoning efficiency, while producing more stable generation behavior and more efficient resource utilization in complex reasoning scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes Tri-RAG, a structured retrieval framework for RAG in LLMs. External knowledge is transformed into standardized Condition-Proof-Conclusion triplets via lightweight prompt-based adaptation on a frozen LLM. The Condition component serves as an explicit semantic anchor for retrieval and matching, avoiding direct concatenation of raw text fragments. This is claimed to improve retrieval precision, reduce token consumption, stabilize generation, and enhance reasoning efficiency, with experimental results across multiple benchmarks demonstrating significant gains over standard RAG approaches.

Significance. If the triplet extraction reliably preserves logical structure and the empirical claims hold, Tri-RAG would represent a practical engineering advance in RAG by replacing unstructured context with logically anchored units, potentially lowering context length while improving semantic alignment and reasoning stability. The use of frozen-model prompt adaptation keeps the method lightweight and broadly applicable.

major comments (2)
  1. [Abstract] Abstract: the assertion that 'Experimental results across multiple benchmark datasets demonstrate that Tri-RAG significantly improves retrieval quality and reasoning efficiency' supplies no metrics, baselines, dataset names, ablation results, or error bars. Without these, the central empirical claim cannot be evaluated and the reported balance between accuracy and token efficiency remains unverified.
  2. [Method] Method description (triplet transformation): the load-bearing step of converting arbitrary natural-language knowledge into Condition-Proof-Conclusion triplets via a single prompt on a frozen model is presented without any quantitative fidelity analysis, human-annotated comparison, prompt-sensitivity ablation, or error characterization for complex conditionals and implicit premises. This directly undermines the downstream claims of precise Condition-based retrieval and reduced token use.
minor comments (2)
  1. The manuscript would benefit from an explicit diagram or pseudocode showing the end-to-end flow from knowledge ingestion through triplet extraction, Condition-based retrieval, and generation.
  2. Notation for the triplet fields (Condition, Proof, Conclusion) should be defined once with an example in the main text rather than only in the abstract.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript accordingly to improve clarity and rigor.

  1. Referee: [Abstract] Abstract: the assertion that 'Experimental results across multiple benchmark datasets demonstrate that Tri-RAG significantly improves retrieval quality and reasoning efficiency' supplies no metrics, baselines, dataset names, ablation results, or error bars. Without these, the central empirical claim cannot be evaluated and the reported balance between accuracy and token efficiency remains unverified.

    Authors: We agree that the abstract would be more informative with concrete details. The full manuscript reports these elements in Section 4 (Experiments), including specific metrics, baselines such as vanilla RAG, dataset names, and ablation studies. We will revise the abstract to incorporate key quantitative results, such as retrieval accuracy gains and token reductions, along with the main datasets and baselines used. revision: yes

  2. Referee: [Method] Method description (triplet transformation): the load-bearing step of converting arbitrary natural-language knowledge into Condition-Proof-Conclusion triplets via a single prompt on a frozen model is presented without any quantitative fidelity analysis, human-annotated comparison, prompt-sensitivity ablation, or error characterization for complex conditionals and implicit premises. This directly undermines the downstream claims of precise Condition-based retrieval and reduced token use.

    Authors: We acknowledge that the manuscript does not currently include a dedicated quantitative analysis of triplet extraction fidelity. We will add a new subsection with human evaluation results, prompt-sensitivity ablations, and error analysis on complex cases to quantify the reliability of the Condition-Proof-Conclusion transformation and better support the retrieval and efficiency claims. revision: yes

Circularity Check

0 steps flagged

No circularity; independent engineering proposal without self-referential derivation

The paper introduces Tri-RAG as a practical framework that applies lightweight prompt-based adaptation on a frozen LLM to convert external natural-language knowledge into Condition-Proof-Conclusion triplets for structured retrieval. No equations, fitted parameters, or mathematical derivations appear in the provided text. The central mechanism is presented as a novel engineering choice rather than a result that reduces to prior inputs by construction. No self-citations are load-bearing for uniqueness theorems or ansatzes, and experimental claims rest on benchmark evaluations rather than tautological fits. This is a standard applied contribution whose validity can be assessed externally via replication, satisfying the default expectation of no significant circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the approach implicitly assumes prompt engineering can produce reliable logical triplets, but this is not formalized.

pith-pipeline@v0.9.0 · 5573 in / 999 out tokens · 27474 ms · 2026-05-10T15:40:23.990312+00:00 · methodology

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DASH-KV: Accelerating Long-Context LLM Inference via Asymmetric KV Cache Hashing

    cs.CL 2026-04 unverdicted novelty 6.0

    DASH-KV accelerates long-context LLM inference to linear complexity via asymmetric KV cache hashing and mixed-precision retention, matching full attention performance on LongBench.

  2. CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

    cs.AI 2026-04 unverdicted novelty 5.0

    CAP-CoT uses iterative adversarial prompt cycles to improve CoT accuracy, stability, and robustness across six benchmarks and four LLM backbones.
