PFDelta: A Benchmark Dataset for Power Flow under Load, Generation, and Topology Variations
Pith reviewed 2026-05-18 04:05 UTC · model grok-4.3
The pith
The PFΔ benchmark provides 859,800 power flow instances to test solvers and ML methods under load, generation, topology, and contingency variations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PFΔ is a benchmark dataset for power flow that captures diverse variations in load, generation, and topology, spanning six system sizes, three contingency types, and near-infeasible points, allowing identification of limitations in current solving approaches.
What carries the argument
The PFΔ dataset itself, built by systematically varying load, generation, and topology, and by including contingency scenarios and cases near the stability boundary.
If this is right
- Evaluations can guide improvements in traditional power flow algorithms for challenging cases.
- GNN methods can be refined to better handle topology changes and contingencies.
- The dataset enables systematic assessment of ML approaches for speeding up contingency analysis.
- Future work can target the open problems highlighted for more robust grid simulation tools.
Where Pith is reading between the lines
- This benchmark may standardize testing for power system ML models in a way that accelerates progress in the field.
- It could be extended to include more complex dynamics or uncertainty models from climate data.
- Adoption might lead to hybrid methods combining solvers and learning for better real-time performance.
Load-bearing premise
The synthetic variations and chosen scenarios are representative enough of real-world power system conditions to serve as a useful benchmark.
What would settle it
Tests on actual grid operational data that yield different difficulty rankings for the methods than those observed on PFΔ.
Original abstract
Power flow (PF) calculations are the backbone of real-time grid operations, across workflows such as contingency analysis (where repeated PF evaluations assess grid security under outages) and topology optimization (which involves PF-based searches over combinatorially large action spaces). Running these calculations at operational timescales or across large evaluation spaces remains a major computational bottleneck. Additionally, growing uncertainty in power system operations from the integration of renewables and climate-induced extreme weather also calls for tools that can accurately and efficiently simulate a wide range of scenarios and operating conditions. Machine learning methods offer a potential speedup over traditional solvers, but their performance has not been systematically assessed on benchmarks that capture real-world variability. This paper introduces PFΔ, a benchmark dataset for power flow that captures diverse variations in load, generation, and topology. PFΔ contains 859,800 solved power flow instances spanning six different bus system sizes, capturing three types of contingency scenarios (N, N-1, and N-2), and including close-to-infeasible cases near steady-state voltage stability limits. We evaluate traditional solvers and GNN-based methods, highlighting key areas where existing approaches struggle, and identifying open problems for future research. Our dataset is available at https://huggingface.co/datasets/pfdelta/pfdelta/tree/main and our code with data generation scripts and model implementations is at https://github.com/MOSSLab-MIT/pfdelta.
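To make the N, N-1, and N-2 terminology in the abstract concrete, here is a minimal sketch that enumerates branch outages on a toy grid. The 4-bus network, the helper names `is_connected` and `enumerate_contingencies`, and the choice to discard outages that island a bus are all assumptions made for illustration; the page does not say how PFΔ selects or filters its contingencies.

```python
from itertools import combinations


def is_connected(n_buses, branches):
    """Return True if every bus is reachable from bus 0 over the given branches."""
    adjacency = {bus: set() for bus in range(n_buses)}
    for i, j in branches:
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        for neighbor in adjacency[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == n_buses


def enumerate_contingencies(n_buses, branches, k):
    """List every N-k case (k branches out of service) that keeps the grid connected."""
    cases = []
    for outage in combinations(range(len(branches)), k):
        remaining = [b for idx, b in enumerate(branches) if idx not in outage]
        if is_connected(n_buses, remaining):
            cases.append(outage)
    return cases


# Hypothetical 4-bus ring with one chord; this is not a PFΔ test case.
TOY_BRANCHES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n0 = enumerate_contingencies(4, TOY_BRANCHES, 0)  # base case, nothing removed
n1 = enumerate_contingencies(4, TOY_BRANCHES, 1)  # single-branch outages
n2 = enumerate_contingencies(4, TOY_BRANCHES, 2)  # double-branch outages
print(len(n0), len(n1), len(n2))
```

The number of candidate N-2 cases grows quadratically with the branch count, which is one reason repeated power flow evaluation becomes a bottleneck in contingency analysis.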
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces PFΔ, a benchmark dataset containing 859,800 solved power-flow instances across six bus-system sizes. The dataset incorporates controlled synthetic variations in load, generation, and topology, three contingency types (N, N-1, N-2), and operating points near steady-state voltage stability limits. It reports evaluations of conventional solvers and GNN-based methods on these instances and releases both the dataset and the generation scripts.
Significance. If the generation pipeline is fully reproducible, the public release of this large, documented collection of solved instances with explicit near-limit and contingency cases supplies a concrete, verifiable testbed for ML methods targeting power-flow bottlenecks in contingency analysis and topology optimization. The accompanying code and scripts constitute a clear strength that supports independent verification and extension.
minor comments (3)
- [§3] Data Generation: the ranges and sampling distributions used for load and generation perturbations are not stated with sufficient numerical detail; providing the exact intervals or distributions would allow the reported instance counts and near-infeasibility statistics to be reproduced.
- [Table 1] Table 1 or equivalent summary table: the breakdown of instances by bus-system size, contingency type, and feasibility status should be presented explicitly so that readers can immediately verify the claimed totals (859,800) and the proportion of close-to-infeasible cases.
- [Evaluation] Evaluation section: the precise definition of “close-to-infeasible” (e.g., voltage magnitude or loading margin thresholds) and the infeasibility detection criterion used by the underlying solver should be stated in one place to avoid ambiguity when comparing solver and GNN performance.
Simulated Author's Rebuttal
We thank the referee for their positive summary, recognition of the dataset's significance for ML methods in power systems, and recommendation for minor revision. The assessment of reproducibility and utility for contingency analysis and topology optimization aligns with our goals. No specific major comments were listed in the report.
Circularity Check
No significant circularity; empirical dataset contribution is self-contained
full rationale
The paper's core contribution is the creation and public release of the PFΔ benchmark dataset consisting of 859,800 solved power-flow instances generated from standard test cases via controlled synthetic perturbations in load, generation, and topology, along with N-1/N-2 contingencies and near-limit points. No derivation chain, first-principles predictions, or fitted parameters are claimed; evaluations of solvers and GNN methods are empirical and independently verifiable. The generation pipeline relies on established power-flow solvers whose outputs can be reproduced externally. No self-citation load-bearing steps, self-definitional reductions, or ansatz smuggling are present. This is a standard honest finding for a dataset/benchmark paper.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math: Standard power flow equations are solved by conventional numerical methods to produce the labeled instances.
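As an illustration of what "conventional numerical methods" means here, the following is a minimal Newton-Raphson sketch for the smallest possible case: a slack bus feeding a single PQ load bus over a lossless line. The function name `solve_two_bus`, the per-unit values, and the two-bus setup are assumptions for illustration only; this is not the solver or the pipeline used to generate PFΔ.

```python
import math


def solve_two_bus(p_load, q_load, x_line, e_slack=1.0, tol=1e-10, max_iter=50):
    """Newton-Raphson power flow for a slack bus feeding one PQ bus (lossless line).

    Returns (voltage_magnitude, angle_rad, converged). All values are per unit.
    """
    v, delta = 1.0, 0.0  # flat start
    for _ in range(max_iter):
        # Power injected into the network at the load bus for the current guess.
        p_inj = v * e_slack / x_line * math.sin(delta)
        q_inj = v * v / x_line - v * e_slack / x_line * math.cos(delta)
        # The load withdraws power, so the target injection is minus the load.
        f_p = p_inj + p_load
        f_q = q_inj + q_load
        if max(abs(f_p), abs(f_q)) < tol:
            return v, delta, True
        # Jacobian of (p_inj, q_inj) with respect to (delta, v).
        j11 = v * e_slack / x_line * math.cos(delta)
        j12 = e_slack / x_line * math.sin(delta)
        j21 = v * e_slack / x_line * math.sin(delta)
        j22 = 2.0 * v / x_line - e_slack / x_line * math.cos(delta)
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-12:
            break  # singular Jacobian, which occurs at the voltage stability limit
        # Newton step: solve J * step = -f with Cramer's rule.
        delta += (-f_p * j22 + f_q * j12) / det
        v += (-f_q * j11 + f_p * j21) / det
    return v, delta, False


# Hypothetical operating point: 0.8 p.u. load at unity power factor, x = 0.5 p.u.
v, delta, ok = solve_two_bus(p_load=0.8, q_load=0.0, x_line=0.5)
print(ok, round(v, 4), round(delta, 4))
```

In this toy model the Jacobian determinant shrinks to zero as the load approaches the maximum transferable power, which gives some intuition for why near-limit instances are harder for Newton-type solvers.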
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Tag: unclear. Relation between the paper passage and the cited Recognition theorem.
  Passage: "We introduce PFΔ, a benchmark dataset for evaluating ML approaches to power flow across variations in load distributions, generator profiles, grid sizes, and N-1/N-2 topological perturbations."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
  Tag: unclear. Relation between the paper passage and the cited Recognition theorem.
  Passage: "Power flow (PF) calculations are the backbone of real-time grid operations... solving the nonlinear, implicit system of equations comprising (1)–(2)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
A Technical Appendices and Supplementary Material
A.1 Close-to-Infeasible Case Generation
Close-to-infeasible cases correspond to operating conditions near the steady-state voltage stability limit. Beyond this limit, no power flow solution exists, and the system is susceptible to voltage...
[...] NOSE” event. To enrich the training set, we also include samples “approaching infeasibility [...]
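The statement that no power flow solution exists beyond the voltage stability limit can be checked by hand on a two-bus system. The sketch below assumes a lossless line of reactance x, a slack voltage E, and a unity-power-factor load P, for which the load-bus voltage satisfies V^4 - E^2 V^2 + (P x)^2 = 0 and the maximum transferable power is E^2 / (2x). The function names and the 0.5 p.u. reactance are hypothetical, and the relative margin used here is only one possible way to quantify "close to infeasible"; the paper's own criterion is not given on this page.

```python
import math


def two_bus_voltage(p_load, x_line, e_slack=1.0):
    """High-voltage solution at the load bus, or None if no solution exists."""
    disc = e_slack**4 - 4.0 * (p_load * x_line) ** 2
    if disc < 0.0:
        return None  # past the nose of the PV curve: no real solution
    return math.sqrt((e_slack**2 + math.sqrt(disc)) / 2.0)


def loading_margin(p_load, x_line, e_slack=1.0):
    """Fraction of the maximum transferable power still unused (0 at the nose)."""
    p_max = e_slack**2 / (2.0 * x_line)
    return 1.0 - p_load / p_max


# Hypothetical sweep toward the limit with x = 0.5 p.u., so p_max = 1.0 p.u.
for p in (0.8, 0.99, 1.0, 1.01):
    print(p, two_bus_voltage(p, 0.5), round(loading_margin(p, 0.5), 4))
```

A sample with a small positive margin, such as the 0.99 p.u. case, is still solvable but sits just below the limit, which is the kind of operating point the appendix calls close-to-infeasible.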