Agentic GraphRAG: Navigating Unstructured Financial Data with Collaborative AI
Pith reviewed 2026-05-21 00:40 UTC · model grok-4.3
The pith
An agentic GraphRAG system that merges structured registry records with LLM-extracted legal text outperforms standard vector-RAG on multi-hop and conversational queries.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim has two parts. First, a three-phase pipeline (deterministic strong-node ingestion, LLM-based weak-node extraction from unstructured notices, and identity resolution) produces a usable knowledge graph. Second, pairing that graph with a modular agent that performs zero-shot intent routing, bounded reflection, tool-mediated graph access, and state-aware response synthesis yields an agentic GraphRAG system that scores higher than a standard agentic vector-RAG baseline on correctness, answer relevance, information recall, turn success rate, and context carryover accuracy. The comparison spans automated, human-curated, and conversational benchmarks on the Swiss Official Gazette of Commerce.
What carries the argument
The analytical modular agent that integrates zero-shot intent routing, a bounded reflection loop, secure tool-mediated graph access, and state-aware response synthesis, operating over a hybrid Neo4j knowledge graph built from structured fields and LLM-extracted weak nodes.
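A bounded reflection loop of this kind can be pictured with a minimal sketch. It is not the paper's implementation: `route_intent`, `run_tool`, `critique`, and `MAX_REFLECTIONS` are hypothetical names, and a keyword rule stands in for the zero-shot LLM router.

```python
from dataclasses import dataclass, field

MAX_REFLECTIONS = 2  # hypothetical bound; the paper's actual limit is not given here


@dataclass
class AgentState:
    question: str
    evidence: list = field(default_factory=list)
    trace: list = field(default_factory=list)  # execution trace for auditability


def route_intent(question: str) -> str:
    """Stand-in for zero-shot intent routing (keyword rule instead of an LLM)."""
    return "graph_lookup" if "who" in question.lower() else "text_search"


def run_tool(tool: str, question: str, attempt: int) -> list:
    """Stand-in for tool-mediated graph access; finds evidence only on a retry."""
    return [f"{tool}: evidence for '{question}'"] if attempt >= 1 else []


def critique(evidence: list) -> bool:
    """Stand-in for the reflection step: is the evidence sufficient?"""
    return len(evidence) > 0


def answer(question: str) -> AgentState:
    state = AgentState(question)
    tool = route_intent(question)
    # One initial attempt plus at most MAX_REFLECTIONS retries, so the loop always ends.
    for attempt in range(MAX_REFLECTIONS + 1):
        state.evidence = run_tool(tool, question, attempt)
        ok = critique(state.evidence)
        state.trace.append((attempt, tool, ok))
        if ok:
            break
    return state


state = answer("Who sits on the board of Example AG?")
print(state.trace)
```

The hard cap on iterations is what makes the loop "bounded": even if the critique never passes, the agent stops after a fixed number of attempts and the trace records every step for later inspection.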
If this is right
- The system supports more accurate multi-hop, temporal, and entity-centric investigations in public commercial registries than vector-only methods.
- A human-in-the-loop dashboard exposes evidence and execution traces, enabling transparency and auditability for expert users.
- The modular architecture transfers to other commercial gazettes and public-sector registry systems with similar mixed structured-unstructured data.
- Performance advantages appear consistently across automated, human-curated, and multi-turn conversational evaluation tiers.
Where Pith is reading between the lines
- Hybrid graph construction that grounds LLM extractions in verified structured data may reduce hallucination risks in regulatory or legal retrieval tasks.
- The framework could extend to other mixed-data domains such as medical claims or scientific patent records where entity resolution matters.
- Replacing parts of the reflection loop with additional deterministic checks might further lower dependence on LLM quality without losing flexibility.
Load-bearing premise
The LLM extraction of entities and relations from unstructured legal notices must be accurate and complete enough that errors do not propagate through downstream agent queries and degrade final answers.
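One way to see why this premise is load-bearing: under the simplifying assumption (ours, not the paper's) that each extracted edge is correct independently with probability p, a k-hop path is fully correct with probability p^k. The function name and the 0.9 precision value below are illustrative only.

```python
def path_correctness(edge_precision: float, hops: int) -> float:
    """Probability that every edge on a k-hop path is correct,
    assuming independent per-edge errors (an illustrative assumption)."""
    if not 0.0 <= edge_precision <= 1.0:
        raise ValueError("edge_precision must be in [0, 1]")
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return edge_precision ** hops


for hops in (1, 2, 3, 4):
    print(hops, round(path_correctness(0.9, hops), 3))
```

With an edge precision of 0.9, a three-hop path is fully correct only about 73% of the time, which is why modest extraction errors can matter for multi-hop queries.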
What would settle it
A manual review of the extracted weak nodes showing precision below the level needed for reliable multi-hop paths, followed by a benchmark re-run in which the GraphRAG system no longer outperforms the vector-RAG baseline on correctness or recall.
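A degradation experiment of this kind could be prototyped on a toy graph before touching the real corpus. The sketch below is an illustration under stated assumptions: the graph, the `corrupt_edges` helper, and the recall metric are hypothetical, and weak edges are simply dropped at random to mimic extraction misses.

```python
import random


def corrupt_edges(weak_edges, drop_rate, rng):
    """Drop each weak (LLM-extracted) edge with probability drop_rate."""
    return [e for e in weak_edges if rng.random() >= drop_rate]


def reachable(edges, start, max_hops):
    """Nodes reachable from start within max_hops over undirected edges."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    frontier, seen = {start}, {start}
    for _ in range(max_hops):
        frontier = {n for f in frontier for n in adjacency.get(f, ())} - seen
        seen |= frontier
    return seen - {start}


def multi_hop_recall(strong, weak, start, gold, drop_rate, trials=200, seed=0):
    """Average share of gold answers still reachable after corrupting weak edges."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        edges = strong + corrupt_edges(weak, drop_rate, rng)
        total += len(reachable(edges, start, max_hops=2) & gold) / len(gold)
    return total / trials


# Toy data: strong edges come from structured fields, weak edges from notices.
strong = [("Example AG", "Zurich")]
weak = [("Example AG", "A. Muster"), ("A. Muster", "Other GmbH")]
gold = {"A. Muster", "Other GmbH"}

for rate in (0.0, 0.2, 0.5):
    print(rate, round(multi_hop_recall(strong, weak, "Example AG", gold, rate), 2))
```

Plotting recall against the drop rate would show how quickly multi-hop answers degrade; on the real system the same curve would be drawn with the paper's own metrics in place of this toy recall.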
Original abstract
We present a collaborative agentic GraphRAG framework for expert analysis of commercial registry data. Public registries are often formally accessible, yet difficult to use in practice because they combine structured records with large volumes of unstructured legal text. This limits conventional keyword and vector-only retrieval, especially for multi-hop, temporal, and entity-centric investigations. Our approach builds a Neo4j knowledge graph through a three-phase pipeline: (i) deterministic ingestion of strong nodes from verified structured fields, (ii) LLM-based extraction of weak nodes from unstructured notices, and (iii) deterministic identity resolution and deduplication. On top of this graph, we introduce an analytical modular agent that integrates zero-shot intent routing, a bounded reflection loop, secure tool-mediated graph access, and state-aware response synthesis. A human-in-the-loop dashboard exposes evidence and execution traces to support transparency and auditability. We evaluate the framework on the Swiss Official Gazette of Commerce, a multilingual corpus of more than seven million publications over seven years. We further contribute a multi-tier evaluation protocol covering entity-resolution precision, tool-routing behavior, answer quality, and multi-turn conversational performance. Across automated, human-curated, and conversational benchmarks, the proposed agentic GraphRAG system consistently outperforms a standard agentic vector-RAG baseline, with strong gains in correctness, answer relevance, information recall, turn success rate, and context carryover accuracy. The architecture is modular, reproducible, and transferable to other commercial gazettes and public-sector registry systems.
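To make the three phases concrete, here is a minimal in-memory sketch. It is not the authors' pipeline: the record fields, the `extract_weak_nodes` stub (a regex standing in for the LLM), the placeholder UID, and the name-normalisation key are all assumptions chosen for illustration, and plain dictionaries replace Neo4j.

```python
import re


def normalize(name: str) -> str:
    """Deterministic identity key: lowercase, strip punctuation and extra spaces."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", name.lower())).strip()


def ingest_strong_nodes(records):
    """Phase (i): build nodes directly from verified structured fields."""
    return {
        normalize(r["name"]): {"name": r["name"], "uid": r["uid"], "kind": "strong"}
        for r in records
    }


def extract_weak_nodes(notice: str):
    """Phase (ii) stand-in: a regex instead of an LLM extractor."""
    return re.findall(r"member of the board: ([^;.]+)", notice)


def build_graph(records, notices):
    nodes = ingest_strong_nodes(records)
    edges = set()
    for company, text in notices:
        for person in extract_weak_nodes(text):
            key = normalize(person)
            # Phase (iii): identity resolution; reuse an existing node when keys match.
            nodes.setdefault(key, {"name": person.strip(), "kind": "weak"})
            edges.add((normalize(company), "HAS_BOARD_MEMBER", key))
    return nodes, edges


# Placeholder data, not taken from the gazette.
records = [{"name": "Example AG", "uid": "CHE-000.000.000"}]
notices = [
    ("Example AG", "New member of the board: Anna Muster; signature jointly."),
    ("Example AG", "Confirmed member of the board: ANNA  MUSTER."),
]
nodes, edges = build_graph(records, notices)
print(len(nodes), len(edges))
```

Both notices resolve to the same person node, so the toy graph ends with two nodes and one edge. Keeping the `kind` flag on every node preserves the strong/weak distinction that the agent and the dashboard can later use to weigh evidence.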
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents Agentic GraphRAG, a collaborative agentic framework for expert analysis of commercial registry data that combines structured records with unstructured text. It constructs a Neo4j knowledge graph via a three-phase pipeline: deterministic ingestion of strong nodes from verified structured fields, LLM-based extraction of weak nodes from unstructured legal notices, and deterministic identity resolution and deduplication. It then deploys an analytical modular agent with zero-shot intent routing, a bounded reflection loop, secure tool-mediated graph access, and state-aware synthesis, supported by a human-in-the-loop dashboard. Evaluation on the Swiss Official Gazette of Commerce (over seven million multilingual publications) uses a multi-tier protocol covering entity-resolution precision, tool-routing, answer quality, and multi-turn performance. The agentic GraphRAG system is reported to consistently outperform a standard agentic vector-RAG baseline on correctness, answer relevance, information recall, turn success rate, and context carryover accuracy.
Significance. If the empirical claims hold, the work offers a modular, auditable architecture that could improve handling of multi-hop, temporal, and entity-centric queries over conventional vector RAG in regulatory and financial domains. The explicit separation of deterministic strong nodes from LLM-derived weak nodes, combined with the human-in-the-loop dashboard for traceability, addresses practical deployment concerns. The multi-tier evaluation protocol spanning automated, human-curated, and conversational benchmarks is a constructive contribution for assessing both retrieval and agentic components.
major comments (2)
- [three-phase pipeline description] Three-phase pipeline, phase (ii): the central claim that the hybrid graph supplies measurably better multi-hop and entity-centric evidence than pure vector retrieval rests on the quality of LLM-extracted weak nodes. No ablation isolating performance when the graph is built from strong nodes alone, nor any error-propagation analysis or controlled degradation of weak-node quality, is described. This leaves open whether downstream gains in correctness and recall survive when extraction errors occur.
- [evaluation protocol] Evaluation section / multi-tier protocol: the abstract asserts consistent outperformance with strong gains across metrics, yet supplies no quantitative results, confidence intervals, statistical tests, data-split details, or exclusion rules. Without these, the magnitude and reliability of improvements in answer relevance, turn success rate, and context carryover cannot be verified from the text.
minor comments (3)
- [analytical modular agent] Clarify how the bounded reflection loop interacts with the state-aware synthesis step to prevent infinite loops or context drift in multi-turn conversations.
- [entity-resolution precision metric] Add an explicit comparison of entity-resolution precision against a non-LLM baseline to quantify the incremental value of the LLM extraction step.
- [conclusion / transferability paragraph] The claim of transferability to other commercial gazettes would benefit from a brief discussion of language-specific or jurisdiction-specific adaptations required for the deterministic ingestion and identity-resolution phases.
Simulated Author's Rebuttal
Thank you for the detailed review and constructive feedback on our manuscript. We address each of the major comments below and outline the revisions we plan to make.
- Referee: [three-phase pipeline description] Three-phase pipeline, phase (ii): the central claim that the hybrid graph supplies measurably better multi-hop and entity-centric evidence than pure vector retrieval rests on the quality of LLM-extracted weak nodes. No ablation isolating performance when the graph is built from strong nodes alone, nor any error-propagation analysis or controlled degradation of weak-node quality, is described. This leaves open whether downstream gains in correctness and recall survive when extraction errors occur.
  Authors: We acknowledge the importance of demonstrating the specific contribution of the LLM-extracted weak nodes. While our evaluation compares the full agentic GraphRAG system against an agentic vector-RAG baseline, which indirectly highlights the benefits of the graph structure including both strong and weak nodes, we agree that a dedicated ablation would provide stronger evidence. In the revised manuscript, we will include an ablation study comparing the full hybrid graph to a version built solely from strong nodes. Additionally, we will add an error-propagation analysis by introducing controlled noise into the weak node extraction and measuring impact on downstream metrics.
  Revision: yes
- Referee: [evaluation protocol] Evaluation section / multi-tier protocol: the abstract asserts consistent outperformance with strong gains across metrics, yet supplies no quantitative results, confidence intervals, statistical tests, data-split details, or exclusion rules. Without these, the magnitude and reliability of improvements in answer relevance, turn success rate, and context carryover cannot be verified from the text.
  Authors: We thank the referee for pointing this out. The full paper contains detailed results from the multi-tier evaluation in Section 5, including specific performance numbers. However, to make the claims more verifiable directly from the abstract and to enhance transparency, we will revise the abstract to include key quantitative results with confidence intervals where applicable. We will also expand the evaluation section to explicitly include statistical tests, data-split details, and exclusion rules. This will allow readers to better assess the reliability of the reported improvements.
  Revision: yes
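Confidence intervals of the kind promised in this response could be obtained with a paired bootstrap over per-question scores. The sketch below is one common option, not necessarily what the authors will use; the function name is hypothetical and the scores are made-up placeholders, not results from the paper.

```python
import random


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(scores_a - scores_b) over paired questions."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Resample the paired differences with replacement and record each mean.
    means = sorted(
        sum(rng.choice(diffs) for _ in range(n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(diffs) / n, lower, upper


# Placeholder per-question correctness (1 = correct), NOT the paper's data.
graph_rag = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
vector_rag = [1, 0, 0, 1, 0, 1, 0, 1, 0, 1]
mean_diff, lower, upper = paired_bootstrap_ci(graph_rag, vector_rag)
print(f"mean difference {mean_diff:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Pairing matters here because both systems answer the same questions: resampling the per-question differences, rather than each system's scores separately, removes question difficulty as a source of variance.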
Circularity Check
No significant circularity detected
The paper presents an empirical framework consisting of a three-phase pipeline (deterministic strong-node ingestion, LLM weak-node extraction, deterministic deduplication) followed by an agentic query system evaluated on the external Swiss Official Gazette corpus against a distinct vector-RAG baseline. Central claims rest on reported gains in correctness, relevance, recall, and conversational metrics across automated, human-curated, and multi-turn benchmarks. No equations, self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the derivation. The evaluation protocol and real-world corpus supply independent falsifiability outside any internal construction, rendering the results self-contained.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM-based extraction from unstructured legal notices yields sufficiently accurate entities and relations for graph construction
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: absolute_floor_iff_bare_distinguishability
  Tag: unclear
  Relation between the paper passage and the cited Recognition theorem.
  Passage: three-phase pipeline: (i) deterministic ingestion of strong nodes from verified structured fields, (ii) LLM-based extraction of weak nodes from unstructured notices, and (iii) deterministic identity resolution
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Tag: unclear
  Relation between the paper passage and the cited Recognition theorem.
  Passage: agentic reflection loop... bounded... state machine... Tool Selection Accuracy, Fallback Activation Rate
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.