AI Economist Agent: An Agentic Framework for Model-Grounded Economic Analysis with RAG, Knowledge Graphs, and Large Language Models
Pith reviewed 2026-06-26 15:12 UTC · model grok-4.3
The pith
An AI economist agent uses LLM agents and knowledge graphs to ground economic narratives in explicit model computations and retrieved evidence.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that an agentic, RAG-based framework, the AI economist agent, can generate economic reports by having LLM agents orchestrate knowledge-graph retrieval, model selection and computation, and evidence-linked narrative generation. It presents two applications, U.S. inflation and bank stress testing, as illustrations that this grounding improves economic coherence and traceability.
What carries the argument
LLM-based agents that plan the analysis, retrieve relevant evidence using RAG from knowledge graphs of economic data and theory, select appropriate models, execute the computations, and generate reports linked to the evidence.
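The retrieve, compute, and narrate steps described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: plain Python functions stand in for the LLM agents, a dictionary stands in for the knowledge graph, and every name (`Evidence`, `TOY_GRAPH`, `persistence_model`, `run_analysis`), node identifier, and number is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    node_id: str   # hypothetical knowledge-graph node identifier
    label: str
    value: float   # illustrative number, not real data


# Toy stand-in for a knowledge graph of economic data (all values made up).
TOY_GRAPH = {
    "inflation": [
        Evidence("kg:lagged_inflation", "Last observed inflation, percent", 3.0),
        Evidence("kg:inflation_target", "Inflation target, percent", 2.0),
    ],
}


def retrieve(topic: str) -> list[Evidence]:
    """Retrieval step: look up evidence nodes for a topic."""
    return TOY_GRAPH.get(topic, [])


def persistence_model(lagged: float, target: float, rho: float = 0.8) -> float:
    """Computation step: the gap to target decays at rate rho (assumed AR(1) form)."""
    return target + rho * (lagged - target)


def run_analysis(topic: str) -> dict:
    """Retrieve, compute, then narrate, keeping the model and sources with the text."""
    evidence = {e.node_id: e for e in retrieve(topic)}
    projection = persistence_model(
        evidence["kg:lagged_inflation"].value,
        evidence["kg:inflation_target"].value,
    )
    # The narrative only formats the computed value; it never invents a number.
    narrative = f"Projected inflation is {projection:.2f} percent."
    return {
        "narrative": narrative,
        "projection": projection,
        "model": "persistence_model",
        "sources": sorted(evidence),
    }


report = run_analysis("inflation")
print(report["narrative"])
```

The point of the design is that the report carries its model name and source identifiers alongside the text, so a reader can trace each figure back to a computation and its inputs.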
If this is right
- Grounding prevents the language model from producing quantitative claims on its own.
- The approach leads to reports with better economic coherence in the tested scenarios.
- Traceability to retrieved evidence and model computations is achieved in applications like inflation analysis and stress testing.
- The framework supports scenario analysis without direct reliance on LLM-generated numbers.
Where Pith is reading between the lines
- This could allow economists to verify AI outputs more easily by checking the linked models and data.
- Extending the knowledge graphs with more diverse economic theories might broaden the range of analyses possible.
- Testing the system on additional applications beyond inflation and banking stress could reveal its general applicability.
Load-bearing premise
LLM-based agents are able to accurately plan analyses, retrieve evidence, select models, and generate coherent reports without errors, provided the knowledge graphs contain sufficiently accurate and complete economic theory and data.
What would settle it
A demonstration that the generated reports in the U.S. inflation or bank stress-test cases contain claims inconsistent with the underlying model computations or evidence from the knowledge graphs would show the framework does not achieve the claimed grounding.
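A check of this kind could be automated roughly as follows. The sketch assumes that a report's quantitative claims can be pulled out as plain decimal numbers and compared against the model's outputs within a tolerance; the function name `unsupported_numbers`, the tolerance, and the sample values are all hypothetical, and a real check would also need to handle units, dates, and rounding conventions.

```python
import re


def unsupported_numbers(narrative: str, model_outputs: dict, tol: float = 0.005) -> list:
    """Return numbers in the narrative that match no model output within tol."""
    found = [float(tok) for tok in re.findall(r"-?\d+(?:\.\d+)?", narrative)]
    return [
        x for x in found
        if not any(abs(x - v) <= tol for v in model_outputs.values())
    ]


# Illustrative inputs only: the second figure has no matching model output.
narrative = "Inflation is projected at 2.80 percent, with a policy rate of 5.25 percent."
model_outputs = {"inflation": 2.8, "policy_rate": 5.0}

flagged = unsupported_numbers(narrative, model_outputs)
print(flagged)
```

A non-empty result would be one concrete instance of the inconsistency described above: a quantitative claim in the narrative that no model computation supports.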
Original abstract
We propose a model-grounded RAG-based AI economist with an agentic framework for economic scenario analysis using large language models (LLMs) and knowledge graphs. While LLMs can generate fluent economic narratives, economists are often required to make economic claims grounded by economic theory and real-world data. Based on this motivation, this study proposes an RAG-based AI economist, which utilizes knowledge graphs including economic data and theory and LLM-based agents to plan the analysis, retrieve relevant evidence, select appropriate models, and generate reports. In our framework, we do not produce quantitative claims directly with the language model alone; instead, we generate narratives grounded in explicit model-based computations and linked to the retrieved evidence via AI agents. We refer to our framework as an AI economist agent. We evaluate the AI economist agent in two applications: economist report generation for U.S. inflation persistence and Federal Reserve policy, and bank stress-test narrative generation for U.S. commercial real estate refinancing stress. The results illustrate how grounding the generated reports improves their economic coherence and traceability.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an 'AI economist agent' framework that integrates LLMs with RAG and knowledge graphs containing economic theory and data. LLM-based agents plan analyses, retrieve evidence, select models, and generate reports; narratives are produced from explicit model-based computations rather than direct LLM outputs. The framework is evaluated on two applications: economist report generation for U.S. inflation persistence and Federal Reserve policy, and bank stress-test narrative generation for U.S. commercial real estate refinancing stress. The central claim is that this grounding improves economic coherence and traceability.
Significance. If the agentic pipeline reliably executes without material errors and the claimed improvements can be demonstrated quantitatively, the work could provide a practical template for model-grounded LLM use in economics. The emphasis on explicit model computations and evidence linking addresses a recognized limitation of standalone LLMs in domain-specific analysis. However, the absence of any metrics, baselines, ablation studies, or error analysis in the manuscript makes it impossible to assess whether the framework delivers the asserted gains.
major comments (2)
- [Abstract] Abstract: the claim that the framework 'improves their economic coherence and traceability' is presented without any quantitative metrics, baselines, error rates, ablation results, or human evaluation details. This absence directly undermines the central empirical claim of the paper.
- [Applications / Evaluation] Evaluation description (applications section): the manuscript states that the AI economist agent was evaluated on U.S. inflation persistence and bank stress-test tasks but supplies no information on how coherence or traceability were measured, what comparison systems were used, or what the observed differences were.
minor comments (1)
- The manuscript introduces several new terms ('AI economist agent', 'model-grounded RAG-based AI economist') without a clear glossary or consistent usage across sections.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on the need for stronger empirical support. We agree that the manuscript would benefit from explicit quantitative evaluation details and will revise accordingly.
Point-by-point responses
- Referee: [Abstract] Abstract: the claim that the framework 'improves their economic coherence and traceability' is presented without any quantitative metrics, baselines, error rates, ablation results, or human evaluation details. This absence directly undermines the central empirical claim of the paper.
  Authors: We acknowledge the validity of this observation. The current abstract and results section rely on illustrative case studies rather than formal metrics. In the revised manuscript we will (i) moderate the abstract claim to reflect the qualitative nature of the presented evidence and (ii) add a new evaluation subsection that reports expert-rated coherence scores, traceability accuracy (percentage of narrative claims correctly linked to retrieved evidence and model outputs), and direct comparisons against a non-agentic LLM baseline on the same tasks.
  Revision: yes
- Referee: [Applications / Evaluation] Evaluation description (applications section): the manuscript states that the AI economist agent was evaluated on U.S. inflation persistence and bank stress-test tasks but supplies no information on how coherence or traceability were measured, what comparison systems were used, or what the observed differences were.
  Authors: We agree that the applications section is insufficiently detailed on measurement. The original evaluation consisted of end-to-end pipeline walkthroughs demonstrating model selection and evidence grounding. The revision will expand this section to specify: (a) the coherence and traceability metrics employed (expert annotation protocol and automated link-verification rate), (b) the baseline systems (vanilla LLM prompting and simple RAG without agentic planning), and (c) the observed differences (quantitative deltas and qualitative examples of improved economic consistency).
  Revision: yes
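The traceability accuracy mentioned in the responses (the percentage of narrative claims correctly linked to retrieved evidence and model outputs) could be computed along the following lines. This is a sketch under an assumed definition, since the responses do not specify one: a claim counts as correctly linked when it cites at least one source and every cited source appears in an annotator's set of supporting sources. The function name, the data layout, and the sample annotations are hypothetical.

```python
def traceability_accuracy(claims: list) -> float:
    """Fraction of claims whose cited sources are non-empty and all annotator-approved."""
    if not claims:
        return 0.0
    correct = sum(
        1 for c in claims
        if c["linked"] and c["linked"] <= c["gold"]  # non-empty subset of gold
    )
    return correct / len(claims)


# Illustrative annotations only: 'linked' is what the report cites,
# 'gold' is what a human annotator judges to support the claim.
claims = [
    {"linked": {"kg:a"}, "gold": {"kg:a", "kg:b"}},        # correct
    {"linked": {"kg:b", "model:m1"}, "gold": {"kg:b", "model:m1"}},  # correct
    {"linked": {"kg:c"}, "gold": {"kg:d"}},                # wrong source cited
    {"linked": set(), "gold": {"kg:e"}},                   # no source cited
]

score = traceability_accuracy(claims)
print(f"{score:.0%}")
```

Stricter or looser variants are possible, for example requiring that all gold sources be cited; the choice would need to be fixed in the annotation protocol before scores from the agentic system and the baselines are compared.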
Circularity Check
No significant circularity detected
Full rationale
The paper proposes a new agentic framework combining RAG, knowledge graphs, and LLMs for model-grounded economic analysis. No equations, derivations, fitted parameters, or quantitative predictions appear in the abstract or described structure. The central claim of improved coherence and traceability is presented as a property of the novel construction itself rather than a result reduced to prior inputs by definition or self-citation. No load-bearing steps match any of the enumerated circularity patterns; the work is self-contained as a descriptive system proposal evaluated on two applications.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption LLM-based agents can reliably perform planning, retrieval, model selection, and report generation tasks.
- domain assumption Knowledge graphs can store and provide accurate economic data and theory for retrieval.
invented entities (1)
- AI economist agent: no independent evidence