PFAgent: A Tractable and Self-Evolving Power-Flow Agent for Interactive Grid Analysis

Brian Chen; Buxin She; Fangxing Li; Luanzheng Guo

arXiv: 2604.10846 · v1 · submitted 2026-04-12 · eess.SY · cs.SY

Buxin She, Brian Chen, Luanzheng Guo, Fangxing Li

Pith reviewed 2026-05-10 15:10 UTC · model grok-4.3

keywords: power flow agent · self-evolving AI · interactive grid analysis · power system simulation · N-1 contingency · voltage violation analysis · AI in power systems · reproducible analysis

The pith

PFAgent automates power system simulations using an interactive self-evolving AI agent

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Power system engineers often spend significant time translating their analysis goals into code, running simulations, and making sense of the outputs. This paper presents PFAgent as a way to automate these steps with an AI agent that understands natural language intents, uses power flow tools, generates reports, and improves itself through feedback. The agent was tested on standard IEEE power grid models where it handled tasks including changing analysis cases, checking for voltage violations, running N-1 contingency studies, producing plots, and giving concise summaries along with full execution logs for reproducibility. A sympathetic reader would care because this could make advanced grid analysis available without requiring extensive coding skills or expert supervision.

Core claim

The central claim is that PFAgent automates multiple power-flow analysis tasks on IEEE benchmark systems and returns transparent execution logs. It does so by combining a tractable interactive architecture (intent parsing, knowledge retrieval, tool execution, and reporting) with a self-evolution mechanism (verification-driven refinement plus human-in-the-loop feedback) and an AI-assisted debugging loop. The results are assessed on task success, convergence validity, numerical consistency, and explanation quality.

What carries the argument

The combination of a tractable interactive architecture and a verification-driven self-evolution mechanism with human feedback.

If this is right

It can automate case changes, voltage violation analysis, N-1 contingency analysis, plot generation, and summary creation.
The agent produces reproducible results with transparent execution logs.
The approach supports an evaluation framework assessing task success, convergence, consistency, and explanation quality.
It represents a move toward interactive and self-evolving agents instead of conventional simulation tools.
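
The voltage-violation and N-1 tasks listed above can be pictured with a minimal Python sketch. It is not PFAgent's code: the 0.95 to 1.05 p.u. limits, the `Solver` signature, and `toy_solver` (which returns hand-written numbers instead of running a power flow) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Assumed per-unit voltage limits; the thresholds used in the paper are not stated here.
V_MIN, V_MAX = 0.95, 1.05


def voltage_violations(
    voltages: Dict[str, float], v_min: float = V_MIN, v_max: float = V_MAX
) -> Dict[str, float]:
    """Return the buses whose voltage magnitude (p.u.) lies outside [v_min, v_max]."""
    return {bus: v for bus, v in voltages.items() if v < v_min or v > v_max}


# Hypothetical solver interface: takes the ID of the outaged line and returns
# bus voltages, or None when the power flow does not converge.
Solver = Callable[[str], Optional[Dict[str, float]]]


def n_minus_1(
    lines: Iterable[str], solve: Solver
) -> List[Tuple[str, str, Dict[str, float]]]:
    """Outage one line at a time and record the outcome of each case."""
    report = []
    for line in lines:
        voltages = solve(line)
        if voltages is None:
            report.append((line, "diverged", {}))
            continue
        bad = voltage_violations(voltages)
        report.append((line, "violation" if bad else "ok", bad))
    return report


def toy_solver(outaged_line: str) -> Optional[Dict[str, float]]:
    """Stand-in for a real power-flow run; the numbers are made up."""
    cases = {
        "L1": {"B1": 1.00, "B2": 0.98},
        "L2": {"B1": 1.00, "B2": 0.93},
        "L3": None,  # pretend this outage makes the case diverge
    }
    return cases[outaged_line]


report = n_minus_1(["L1", "L2", "L3"], toy_solver)
```

Keeping non-convergence as its own outcome, separate from a voltage violation, mirrors the distinction the evaluation framework draws between convergence validity and numerical results.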

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Such agents could enable faster iteration in power system planning by allowing engineers to describe studies in plain language.
The self-evolution feature might lead to agents that adapt to specific user preferences or regional grid characteristics over time.
This framework could be extended to other types of power system studies like optimal power flow or dynamic simulations.

Load-bearing premise

The AI-assisted evaluation and debugging loop together with human-in-the-loop feedback will produce reliable self-evolution without persistent errors or excessive oversight.

What would settle it

Observing whether PFAgent correctly performs N-1 contingency analysis on an IEEE test system and returns consistent results with proper voltage violation detection on repeated trials without additional human corrections.
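
A repeated-trial check of this kind could be sketched as follows. This is an illustrative assumption about how consistency might be measured, not the paper's protocol: `run` is a hypothetical callable that performs one full analysis and returns bus voltages, and the tolerance is arbitrary.

```python
from typing import Callable, Dict, List


def max_spread(trials: List[Dict[str, float]]) -> float:
    """Largest max-minus-min difference of any bus voltage across trials."""
    buses = trials[0].keys()
    if any(t.keys() != buses for t in trials):
        raise ValueError("trials report different bus sets")
    return max(
        max(t[b] for t in trials) - min(t[b] for t in trials) for b in buses
    )


def is_repeatable(
    run: Callable[[], Dict[str, float]], n_trials: int = 5, tol: float = 1e-6
) -> bool:
    """Run the same study n_trials times and compare the numerical results."""
    trials = [run() for _ in range(n_trials)]
    return max_spread(trials) <= tol
```

A deterministic solver behind a stochastic language model should pass this check whenever the generated code is semantically the same on every trial, which is what makes it a useful probe.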

Figures

Figures reproduced from arXiv: 2604.10846 by Brian Chen, Buxin She, Fangxing Li, Luanzheng Guo.

**Figure 1.** Technical framework of PFAgent: the left column (blue) is the session query pipeline; the right column (gray) contains the feedback loops for error repair and self-evolution. Adjacent text captured by the extractor: "… are the numbers computed by ANDES. This design serves two purposes: it gives the user a concise summary of the study result, and it gives the evaluator a structured output to score. Plot files are captured from the session workspace a…"

**Figure 2.** Self-evolution mechanism. Adjacent text captured by the extractor: "… six dimensions introduced in Section IV, including format, grounding, continuity, execution, semantic correctness, and output quality. This yields a per-turn pass/fail label together with a list of specific failure categories. The resulting failure records enter the shared processing pipeline described below. 2) Deployment Stage: After release, the same structured logging pipeline…"

**Figure 3.** AI-assisted fixing loop. Adjacent text captured by the extractor: "… its turn scores. The six dimensions and their point allocations are as follows. 1) Format (10 points): As shown in (1), this dimension checks whether the response contains exactly one fenced Python code block. If the response omits code, includes conflicting scripts, or returns results without executable content when code was requested, the format score is zero. S_fmt = 10, if ex…"

**Figure 4.** User interface of PFAgent: (a) Configuration panels; (b) Chat panels.

**Figure 5.** 100-scenario benchmark results: (a) Scenario pass rate by mode; (b) Turn-level pass rate; (c) Dimension-level …

**Figure 7.** Family-level scenario pass rate for the Fine-tuned …

**Figure 8.** Self-evolution before/after comparison on the 164-…
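
The format rule quoted in the Figure 3 excerpt (10 points for exactly one fenced Python code block, zero otherwise) can be sketched in Python. This is a guess at one possible implementation, not the authors' scorer: the regular expression, the accepted language tags, and the choice to count only Python-tagged fences are assumptions.

```python
import re

# Build the fence marker indirectly so this example can itself sit inside a fence.
FENCE = "`" * 3

# An opening fence with an optional language tag, the body, then a closing fence.
_BLOCK_RE = re.compile(
    rf"^{FENCE}[ \t]*(\w*)[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def format_score(response: str) -> int:
    """Return 10 if the response has exactly one fenced Python block, else 0."""
    blocks = _BLOCK_RE.findall(response)
    # Assumption: only fences tagged "python" or "py" count as Python code.
    python_blocks = [body for lang, body in blocks if lang.lower() in {"python", "py"}]
    return 10 if len(python_blocks) == 1 else 0


one_block = f"Run this:\n{FENCE}python\nprint(1)\n{FENCE}\n"
```

Here `format_score(one_block)` yields 10, while a response with no code or with two competing scripts yields 0, matching the "omits code" and "conflicting scripts" cases in the excerpt.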

Original abstract

Power system simulation workflows remain expert-intensive. Engineers must translate study intents into code or API calls, execute analyses, and interpret outputs. To automate this workflow, this paper presents PFAgent, a tractable and self-evolving power-flow agent for interactive grid analysis. PFAgent integrates four key capabilities: i) a tractable and interactive architecture for intent parsing, knowledge retrieval, tool execution, and structured reporting; ii) a self-evolution mechanism combining verification-driven refinement and human-in-the-loop feedback; iii) an AI-assisted evaluation and debugging loop that leverages conversational context, generated code, and execution errors for iterative fixing; and iv) an evaluation framework covering task success, convergence validity, numerical consistency, and explanation quality. Verification on IEEE benchmark systems shows that PFAgent can automate case change, analyze voltage violations, perform N-1 contingency analysis, generate plots and concise summaries, and return reproducible results with transparent execution logs. The proposed framework highlights a shift from conventional simulation tools to interactive, tractable, and self-evolving agents for power system analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PFAgent wraps standard LLM agent patterns around power-flow tasks and shows basic demos on IEEE cases, but the self-evolution claims lack the metrics needed to hold up.

The key takeaway is that PFAgent demonstrates an LLM agent handling power system tasks like changing cases, spotting voltage issues, running N-1 analysis, and making plots, all with logs for reproducibility. It uses a mix of intent parsing, tool calls, and a feedback loop for improvement.

On the positive side, the work brings together standard agent pieces in a way tailored to grid analysis. The evaluation framework looks at success, validity, consistency, and quality of explanations, which is a solid way to think about these systems. The examples on IEEE systems give a sense that it can produce usable outputs without constant manual coding.

Where it gets soft is the self-evolution mechanism. The description relies on verification-driven refinement and human feedback, but the paper does not include data on how often the loop needs human help, whether errors actually decrease over iterations, or any tests showing autonomous progress. That leaves the self-evolving and tractable parts without strong support, especially since power flow problems can be sensitive to small mistakes.

This kind of paper is for people in power systems who want to see how AI agents might fit into their daily analysis tools. It could also interest AI researchers building agents for other engineering fields. The thinking is clear on the architecture, so it merits a serious referee to push for more complete results and comparisons. I would send it to peer review, mainly to see if the authors can fill in the performance metrics and address the oversight requirements.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces PFAgent, a tractable and self-evolving power-flow agent for interactive grid analysis. It integrates an interactive architecture for intent parsing, knowledge retrieval, tool execution, and structured reporting; a self-evolution mechanism combining verification-driven refinement and human-in-the-loop feedback; an AI-assisted evaluation and debugging loop leveraging conversational context, generated code, and execution errors; and an evaluation framework covering task success, convergence validity, numerical consistency, and explanation quality. Verification on IEEE benchmark systems is claimed to demonstrate automation of case changes, voltage violation analysis, N-1 contingency analysis, plot generation, concise summaries, and reproducible results with transparent execution logs.

Significance. If the self-evolution and tractability claims hold with supporting data, the work could meaningfully advance automation of expert-intensive power system workflows by enabling interactive, LLM-based agents for tasks such as contingency analysis. The focus on verification-driven refinement and transparent logs addresses practical needs in grid analysis. No machine-checked proofs or parameter-free derivations are present, but the emphasis on reproducible logs is a positive step toward falsifiable evaluation.

major comments (2)

[Evaluation framework] Evaluation framework (as described in the abstract and § on verification): the claims of successful automation on IEEE benchmarks for voltage violations, N-1 analysis, and reproducibility are stated without any quantitative results, success rates, error metrics, or detailed methods. This is load-bearing for the central tractability assertion.
[Self-evolution mechanism] Self-evolution mechanism (abstract and description of AI-assisted loop): the mechanism is defined via verification-driven refinement plus human-in-the-loop feedback, but no metrics on intervention frequency, error persistence across cycles, or ablation studies showing net autonomous improvement are provided. In numerical power-flow tasks where convergence and N-1 validity are safety-critical, this absence directly weakens the 'self-evolving' and 'tractable' descriptors.
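
The metrics requested in the second major comment could be computed from structured turn logs along these lines. The sketch is hypothetical: the `TurnRecord` fields and the notion of a numbered refinement cycle are assumptions, since the paper's log schema is not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class TurnRecord:
    cycle: int              # refinement cycle index (hypothetical field)
    passed: bool            # per-turn pass/fail label
    human_intervened: bool  # whether a human corrected this turn


def per_cycle_rates(records: Iterable[TurnRecord]) -> Dict[int, Dict[str, float]]:
    """Failure rate and human-intervention rate for each refinement cycle."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record.cycle].append(record)
    rates = {}
    for cycle in sorted(buckets):
        turns = buckets[cycle]
        n = len(turns)
        rates[cycle] = {
            "failure_rate": sum(not t.passed for t in turns) / n,
            "intervention_rate": sum(t.human_intervened for t in turns) / n,
        }
    return rates


# Made-up log: cycle 0 has 2 failures in 4 turns, cycle 1 has 1 failure in 4.
log = (
    [TurnRecord(0, False, True)] * 2 + [TurnRecord(0, True, False)] * 2
    + [TurnRecord(1, False, True)] + [TurnRecord(1, True, False)] * 3
)
rates = per_cycle_rates(log)
```

A failure rate that falls across cycles while the intervention rate also falls would be evidence of the net autonomous improvement the referee asks about.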

minor comments (2)

[Abstract] The abstract states that PFAgent 'returns reproducible results with transparent execution logs' but does not define the logging format, reproducibility protocol, or how logs enable independent verification.
[Architecture description] The four key capabilities are listed but lack a clear diagram or pseudocode showing data flow between intent parsing, tool execution, and the debugging loop.
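
As a rough picture of the data flow the last comment asks for, the generate, execute, and repair cycle might look like the Python sketch below. It is an assumption-laden illustration, not the architecture in the paper: the `CodeGenerator` signature, the `result` variable convention, and the unsandboxed `exec` call are all placeholders.

```python
import traceback
from typing import Any, Callable, List, Optional, Tuple

# Hypothetical interface: (user request, list of (code, error) pairs) -> new code.
CodeGenerator = Callable[[str, List[Tuple[str, str]]], str]


def run_with_repair(
    request: str, generate: CodeGenerator, max_attempts: int = 3
) -> Tuple[Optional[Any], List[Tuple[str, str]]]:
    """Generate code, execute it, and feed any error back for another attempt."""
    history: List[Tuple[str, str]] = []
    for _ in range(max_attempts):
        code = generate(request, history)
        namespace: dict = {}
        try:
            exec(code, namespace)  # a real agent would run this in a sandbox
        except Exception:
            # Keep the failing code and its traceback as context for the next try.
            history.append((code, traceback.format_exc()))
            continue
        return namespace.get("result"), history
    return None, history


def toy_generate(request: str, history: List[Tuple[str, str]]) -> str:
    """Stand-in for an LLM: fails once, then returns working code."""
    return "result = 1 / 0" if not history else "result = 6 * 7"


result, history = run_with_repair("compute the answer", toy_generate)
```

The returned `history` doubles as an execution log: every failed script and its traceback is preserved, which is the kind of record a reproducibility claim would rest on.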

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our evaluation and self-evolution claims. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core contributions.

Referee: [Evaluation framework] Evaluation framework (as described in the abstract and § on verification): the claims of successful automation on IEEE benchmarks for voltage violations, N-1 analysis, and reproducibility are stated without any quantitative results, success rates, error metrics, or detailed methods. This is load-bearing for the central tractability assertion.

Authors: We agree that the current manuscript presents verification primarily through descriptive case studies on IEEE benchmarks rather than aggregated quantitative metrics. This limits the strength of the tractability claims. In the revised version, we will add a quantitative evaluation subsection reporting task success rates, convergence validity percentages, numerical consistency error metrics, and a clear description of the evaluation methodology across the tested cases.

Revision: yes

Referee: [Self-evolution mechanism] Self-evolution mechanism (abstract and description of AI-assisted loop): the mechanism is defined via verification-driven refinement plus human-in-the-loop feedback, but no metrics on intervention frequency, error persistence across cycles, or ablation studies showing net autonomous improvement are provided. In numerical power-flow tasks where convergence and N-1 validity are safety-critical, this absence directly weakens the 'self-evolving' and 'tractable' descriptors.

Authors: The manuscript illustrates the self-evolution mechanism via the AI-assisted loop and human feedback with concrete examples, but we acknowledge the absence of quantitative metrics on intervention frequency, error persistence, and ablation studies. We will revise to include experimental statistics on refinement cycles and error resolution patterns. Ablation studies comparing variants with and without self-evolution will be added where data from our existing runs permit; full new experiments will be noted as future work if time-constrained.

Revision: partial

Circularity Check

0 steps flagged

No circularity: descriptive architecture with external benchmark verification

The paper describes an agent architecture and self-evolution mechanism using standard components (intent parsing, tool execution, verification-driven refinement, human-in-the-loop feedback) without any equations, fitted parameters, or mathematical derivations. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked. Claims of tractability and self-evolution are supported by external IEEE benchmark verification rather than reducing to inputs by construction. This is self-contained against external benchmarks with no reduction of predictions to fitted quantities.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The system rests on domain assumptions about LLM reliability for code generation and error correction in technical domains; no free parameters or invented physical entities are introduced.

axioms (1)

Domain assumption: Large language models can reliably parse engineering intents and generate executable power-system code when given appropriate tools and context. Invoked in the description of the tractable architecture and AI-assisted debugging loop.


Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages

[1] B. She, F. Li, H. Cui, J. Wang, Q. Zhang, and R. Bo, “Virtual inertia scheduling (VIS) for real-time economic dispatch of IBR-penetrated power systems,” IEEE Transactions on Sustainable Energy, vol. 15, no. 2, pp. 938–951, 2023.

[2] M. Panteli, P. Mancarella, D. N. Trakas, E. Kyriakides, and N. D. Hatziargyriou, “Power systems resilience assessment: Hardening and smart operational enhancement strategies,” Proceedings of the IEEE, vol. 105, no. 7, pp. 1202–1213, 2017.

[3] E. Masanet, A. Shehabi, N. Lei, S. Smith, and J. Koomey, “Recalibrating global data center energy-use estimates,” Science, vol. 367, no. 6481, pp. 984–986, 2020.

[4] B. Kroposki, B. Johnson, Y. Zhang, V. Gevorgian, P. Denholm, B.-M. Hodge, and B. Hannegan, “Achieving a 100% renewable grid: Operating electric power systems with extremely high levels of variable renewable energy,” IEEE Power and Energy Magazine, vol. 15, no. 2, pp. 61–73, 2017.

[5] B. She, R. R. Hossain, S. Kundu, M. Elizondo, and V. Adetola, “Hybrid symbolic-numerical modeling and parametric stability analysis of DC-AC power systems,” IEEE Open Access Journal of Power and Energy, 2026.

[6] Y. Cheng, W. Liu, Y. Xue, J. Huang, J. Zhao, and F. Wen, “Leveraging large language model based agent for automated electricity market modelling and simulation,” Journal of Modern Power Systems and Clean Energy, 2025.

[7] H. Zhao, Y. Cheng, D. Xiang, X. Zhou, J. Zhao, X. Cai, and Z. Dong, “Large language model-based power dispatch agent: Framework, application and challenges,” International Journal of Electrical Power & Energy Systems, vol. 175, p. 111653, 2026.

[8] H. Jin, K. Kim, and J. Kwon, “GridMind: LLMs-powered agents for power system analysis and operations,” in Proceedings of the SC’25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2025, pp. 560–568.

[9] S. Majumder, L. Dong, F. Doudi, Y. Cai, C. Tian, D. Kalathil, K. Ding, A. A. Thatte, N. Li, and L. Xie, “Exploring the capabilities and limitations of large language models in the electric energy sector,” Joule, vol. 8, no. 6, pp. 1544–1549, 2024.

[10] C. Huang, S. Li, R. Liu, H. Wang, and Y. Chen, “Large foundation models for power systems,” in 2024 IEEE Power & Energy Society General Meeting (PESGM). IEEE, 2024, pp. 1–5.

[11] J. Liu and A. Rahman, “Fault diagnosis in power grids with large language model,” arXiv preprint arXiv:2407.08836, 2024.

[12] A. Zaboli, S. L. Choi, T.-J. Song, and J. Hong, “ChatGPT and other large language models for cybersecurity of smart grid applications,” in 2024 IEEE Power & Energy Society General Meeting (PESGM). IEEE, 2024, pp. 1–5.

[13] H. Wang, M. Zhang, Z. Chen, N. Shang, S. Yao, F. Wen, and J. Zhao, “Carbon footprint accounting driven by large language models and retrieval-augmented generation,” arXiv preprint arXiv:2408.09713, 2024.

[14] R. S. Bonadia, F. C. Trindade, W. Freitas, and B. Venkatesh, “On the potential of ChatGPT to generate distribution systems for load flow studies using OpenDSS,” IEEE Transactions on Power Systems, vol. 38, no. 6, pp. 5965–5968, 2023.

[15] J. Ruan, G. Liang, H. Zhao, G. Liu, X. Sun, J. Qiu, Z. Xu, F. Wen, and Z. Y. Dong, “Applying large language models to power systems: Potential security threats,” arXiv preprint arXiv:2311.13361, 2024.

[16] M. Jia, Z. Cui, and G. Hug, “Enabling large language models to perform power system simulations with previously unseen tools: A case of Daline,” arXiv preprint arXiv:2406.17215, 2024.

[17] M. Jia, Z. Cui, and G. Hug, “Enhancing LLMs for power system simulations: A feedback-driven multi-agent framework,” IEEE Transactions on Smart Grid, 2025.

[18] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, S. Riedel, and D. Kiela, “Retrieval-augmented generation for knowledge-intensive NLP tasks,” in Advances in Neural Information Processing Systems, 2020.

[19] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y. Cao, “ReAct: Synergizing reasoning and acting in language models,” in International Conference on Learning Representations, 2023.

[20] H. Cui, F. Li, and K. Tomsovic, “Hybrid symbolic-numeric framework for power system modeling and analysis,” IEEE Transactions on Power Systems, vol. 36, no. 2, pp. 1373–1384, 2021.

[21] B. She, B. Chen, L. Guo, and F. Li, “PFAgent: A tractable and self-evolving power-flow agent for interactive grid analysis,” https://github.com/shebuxin/pfagent, 2026.
