arxiv: 2510.22048 · v3 · submitted 2025-10-24 · 💻 cs.LG

PFDelta: A Benchmark Dataset for Power Flow under Load, Generation, and Topology Variations

Pith reviewed 2026-05-18 04:05 UTC · model grok-4.3

classification 💻 cs.LG
keywords: power flow, benchmark dataset, machine learning, graph neural networks, contingency analysis, power systems, voltage stability, grid operations

The pith

The PFΔ benchmark provides 859,800 power flow instances to test solvers and ML methods under load, generation, topology, and contingency variations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PFΔ to fill the gap in power flow benchmarks that capture real-world variability from renewables and extreme weather. It contains 859,800 solved instances spanning six bus system sizes, three contingency types (N, N-1, and N-2), and close-to-infeasible cases near voltage stability limits. The authors evaluate traditional solvers and GNN-based methods on this dataset and identify where those methods struggle. This matters to a reader because the benchmark can support the development of faster tools for grid operations and security analysis.

Core claim

PFΔ is a benchmark dataset for power flow that captures diverse variations in load, generation, and topology, spanning six system sizes, three contingency types, and near-infeasible points, allowing identification of limitations in current solving approaches.

What carries the argument

The PFΔ dataset itself, built by generating systematic variations in load, generation, and topology, and by including contingency scenarios and stability-boundary cases.
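
To make the contingency types concrete, the following is a minimal sketch of how N, N-1, and N-2 topologies could be enumerated. It assumes that a contingency is a set of simultaneous line outages and that outages which split the grid are skipped; both are illustrative assumptions, not the authors' documented procedure. The 4-bus network and the function names are hypothetical.

```python
from itertools import combinations

def is_connected(n_buses, lines):
    """Return True if every bus is reachable from bus 0 over the given lines."""
    adjacency = {b: set() for b in range(n_buses)}
    for u, v in lines:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        bus = stack.pop()
        for nxt in adjacency[bus] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == n_buses

def contingencies(n_buses, lines, k):
    """Yield (outage, surviving lines) for every k-line outage that keeps the grid connected."""
    for outage in combinations(range(len(lines)), k):
        kept = [ln for i, ln in enumerate(lines) if i not in outage]
        if is_connected(n_buses, kept):  # assumption: islanded cases are discarded
            yield outage, kept

# Hypothetical 4-bus ring with one chord (not a PFΔ test case).
LINES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n0 = list(contingencies(4, LINES, 0))  # N: base topology
n1 = list(contingencies(4, LINES, 1))  # N-1: one line out
n2 = list(contingencies(4, LINES, 2))  # N-2: two lines out
print(len(n0), len(n1), len(n2))
```

Even on this toy network the N-2 set is larger than the N-1 set, which hints at why contingency analysis requires many repeated power flow evaluations.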

If this is right

  • Evaluations can guide improvements in traditional power flow algorithms for challenging cases.
  • GNN methods can be refined to better handle topology changes and contingencies.
  • The dataset enables systematic assessment of ML approaches for speeding up contingency analysis.
  • Future work can target the open problems highlighted for more robust grid simulation tools.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This benchmark may standardize testing for power system ML models in a way that accelerates progress in the field.
  • It could be extended to include more complex dynamics or uncertainty models from climate data.
  • Adoption might lead to hybrid methods combining solvers and learning for better real-time performance.

Load-bearing premise

The synthetic variations and chosen scenarios are representative enough of real-world power system conditions to serve as a useful benchmark.

What would settle it

The premise would be undermined if tests on actual grid operational data yielded different difficulty rankings for the methods than those observed on PFΔ.

Figures

Figures reproduced from arXiv: 2510.22048 by Alvaro Carbonero, Ana K. Rivera, Anvita Bhagavathula, Priya Donti.

Figure 1: Illustration of a generator and load in a …
Figure 2: Data generation process for a single data sample within …
Figure 3: Experimental results for all selected tasks.
Figure 4: Results for Task 3.1 showcasing the Power Balance Loss (PBL) on a combined feasible …

Abstract

Power flow (PF) calculations are the backbone of real-time grid operations, across workflows such as contingency analysis (where repeated PF evaluations assess grid security under outages) and topology optimization (which involves PF-based searches over combinatorially large action spaces). Running these calculations at operational timescales or across large evaluation spaces remains a major computational bottleneck. Additionally, growing uncertainty in power system operations from the integration of renewables and climate-induced extreme weather also calls for tools that can accurately and efficiently simulate a wide range of scenarios and operating conditions. Machine learning methods offer a potential speedup over traditional solvers, but their performance has not been systematically assessed on benchmarks that capture real-world variability. This paper introduces PFΔ, a benchmark dataset for power flow that captures diverse variations in load, generation, and topology. PFΔ contains 859,800 solved power flow instances spanning six different bus system sizes, capturing three types of contingency scenarios (N, N-1, and N-2), and including close-to-infeasible cases near steady-state voltage stability limits. We evaluate traditional solvers and GNN-based methods, highlighting key areas where existing approaches struggle, and identifying open problems for future research. Our dataset is available at https://huggingface.co/datasets/pfdelta/pfdelta/tree/main and our code with data generation scripts and model implementations is at https://github.com/MOSSLab-MIT/pfdelta.
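
As a rough illustration of what a single power flow evaluation checks, the sketch below computes per-bus complex power injections from a bus admittance matrix and measures how far a candidate voltage profile is from the specified injections. It is a generic textbook residual written in plain Python; the function names, the two-bus network, and the residual definition are assumptions for illustration and are not taken from the PFΔ codebase.

```python
import cmath

def build_ybus(n_buses, branches):
    """Assemble the bus admittance matrix from (from_bus, to_bus, series_impedance) tuples."""
    ybus = [[0j] * n_buses for _ in range(n_buses)]
    for f, t, z in branches:
        y = 1 / z
        ybus[f][f] += y
        ybus[t][t] += y
        ybus[f][t] -= y
        ybus[t][f] -= y
    return ybus

def injections(ybus, voltages):
    """Complex power injected at each bus: S_i = V_i * conj(sum_j Y_ij * V_j)."""
    return [v_i * sum(y * v for y, v in zip(row, voltages)).conjugate()
            for v_i, row in zip(voltages, ybus)]

def power_balance_residual(ybus, voltages, s_specified):
    """Per-bus magnitude of the gap between specified and computed injections."""
    computed = injections(ybus, voltages)
    return [abs(s_spec - s_calc) for s_spec, s_calc in zip(s_specified, computed)]

# Hypothetical lossless two-bus system: one line with reactance 0.1 p.u.
YBUS = build_ybus(2, [(0, 1, 0.1j)])
V = [1 + 0j, cmath.rect(1.0, -0.1)]  # bus 1 lags bus 0 by 0.1 rad
S = injections(YBUS, V)
print(S)
print(power_balance_residual(YBUS, V, S))
```

A learned model that predicts voltages can be scored the same way: feed its predicted voltages into the residual and see how much power is left unbalanced at each bus.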

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper introduces PFΔ, a benchmark dataset containing 859,800 solved power-flow instances across six bus-system sizes. The dataset incorporates controlled synthetic variations in load, generation, and topology, three contingency types (N, N-1, N-2), and operating points near steady-state voltage stability limits. It reports evaluations of conventional solvers and GNN-based methods on these instances and releases both the dataset and the generation scripts.

Significance. If the generation pipeline is fully reproducible, the public release of this large, documented collection of solved instances with explicit near-limit and contingency cases supplies a concrete, verifiable testbed for ML methods targeting power-flow bottlenecks in contingency analysis and topology optimization. The accompanying code and scripts constitute a clear strength that supports independent verification and extension.

minor comments (3)
  1. [§3] Data Generation: the ranges and sampling distributions used for load and generation perturbations are not stated in sufficient numerical detail; stating the exact intervals or distributions would allow readers to reproduce the reported instance counts and near-infeasibility statistics.
  2. [Table 1] Summary table (Table 1 or equivalent): the breakdown of instances by bus-system size, contingency type, and feasibility status should be presented explicitly so that readers can verify the claimed total (859,800) and the proportion of close-to-infeasible cases.
  3. [Evaluation] The precise definition of “close-to-infeasible” (e.g., voltage magnitude or loading margin thresholds) and the infeasibility detection criterion used by the underlying solver should be stated in one place to avoid ambiguity when comparing solver and GNN performance.
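
To show what a loading-margin threshold of the kind mentioned in comment 3 could look like, here is a small sketch on a lossless two-bus system with a unity-power-factor load, where the maximum transferable power is E²/(2X) in closed form. The 5% cutoff, the parameter values, and the function names are hypothetical; the paper's own definition of “close-to-infeasible” is not given on this page.

```python
import math

def receiving_voltage(p_load, e_source=1.0, x_line=0.1):
    """High-voltage solution of a lossless two-bus system with a unity-power-factor load.

    Solves V^4 - E^2 V^2 + (P X)^2 = 0 and returns None when no real solution exists.
    """
    disc = e_source**4 - 4.0 * (p_load * x_line) ** 2
    if disc < 0.0:
        return None  # beyond the nose point: no power flow solution
    return math.sqrt((e_source**2 + math.sqrt(disc)) / 2.0)

def loading_margin(p_load, e_source=1.0, x_line=0.1):
    """Fraction of the maximum transferable power E^2 / (2 X) that is still unused."""
    p_max = e_source**2 / (2.0 * x_line)
    return 1.0 - p_load / p_max

MARGIN_THRESHOLD = 0.05  # hypothetical cutoff, not taken from the paper

def classify(p_load):
    """Label an operating point by its distance to the stability limit."""
    if receiving_voltage(p_load) is None:
        return "infeasible"
    if loading_margin(p_load) < MARGIN_THRESHOLD:
        return "close-to-infeasible"
    return "feasible"

for p in (4.0, 4.9, 6.0):
    print(p, classify(p))
```

Stating a threshold like this explicitly, together with the solver's own failure criterion, is what would let readers compare solver and GNN results on the same set of cases.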

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary, recognition of the dataset's significance for ML methods in power systems, and recommendation for minor revision. The assessment of reproducibility and utility for contingency analysis and topology optimization aligns with our goals. No specific major comments were listed in the report.

Circularity Check

0 steps flagged

No significant circularity; empirical dataset contribution is self-contained

The paper's core contribution is the creation and public release of the PFΔ benchmark dataset consisting of 859,800 solved power-flow instances generated from standard test cases via controlled synthetic perturbations in load, generation, and topology, along with N-1/N-2 contingencies and near-limit points. No derivation chain, first-principles predictions, or fitted parameters are claimed; evaluations of solvers and GNN methods are empirical and independently verifiable. The generation pipeline relies on established power-flow solvers whose outputs can be reproduced externally. No self-citation load-bearing steps, self-definitional reductions, or ansatz smuggling are present. This is a standard honest finding for a dataset/benchmark paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard power-system modeling assumptions and the premise that the chosen synthetic variations adequately sample the space of realistic operating conditions.

axioms (1)
  • [standard math] Standard power flow equations are solved by conventional numerical methods to produce the labeled instances.
    Invoked throughout the dataset construction process described in the abstract.
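
For reference, the standard power flow equations named in this axiom are commonly written in polar form as below, with a Newton-Raphson update as one conventional solution method. This is the textbook formulation, not a statement of which formulation or solver the authors used. The notation is introduced here for illustration: $G_{ij}$ and $B_{ij}$ are the real and imaginary parts of the bus admittance matrix, $\theta_{ij} = \theta_i - \theta_j$, $J$ is the Jacobian of $(P, Q)$ with respect to $(\theta, V)$, and $\Delta P$, $\Delta Q$ are specified minus computed injections.

```latex
\begin{aligned}
P_i &= V_i \sum_{j=1}^{n} V_j \left( G_{ij}\cos\theta_{ij} + B_{ij}\sin\theta_{ij} \right), \\
Q_i &= V_i \sum_{j=1}^{n} V_j \left( G_{ij}\sin\theta_{ij} - B_{ij}\cos\theta_{ij} \right), \\
\begin{bmatrix} \Delta\theta \\ \Delta V \end{bmatrix}
&= J^{-1} \begin{bmatrix} \Delta P \\ \Delta Q \end{bmatrix},
\qquad
J = \begin{bmatrix}
\partial P / \partial \theta & \partial P / \partial V \\
\partial Q / \partial \theta & \partial Q / \partial V
\end{bmatrix}.
\end{aligned}
```

Near the voltage stability limit the Jacobian approaches singularity, which is one reason operating points close to infeasibility are harder for Newton-type solvers.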

pith-pipeline@v0.9.0 · 5796 in / 1324 out tokens · 75763 ms · 2026-05-18T04:05:22.957619+00:00 · methodology
