pith. machine review for the scientific record.

arxiv: 2604.03087 · v1 · submitted 2026-04-03 · 📡 eess.SY · cs.SY

Recognition: 2 Lean theorem links

Self-Supervised Graph Neural Networks for Full-Scale Tertiary Voltage Control


Pith reviewed 2026-05-13 19:09 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords graph neural networks · self-supervised learning · tertiary voltage control · power system operation · voltage violations · amortized optimization · French transmission grid

The pith

A self-supervised graph neural network reduces voltage violations on the full-scale French power grid by learning adjustments from day-ahead forecasts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors frame tertiary voltage control as an amortized optimization task rather than a repeated mixed-integer nonlinear program. They train a graph neural network in a self-supervised manner so that its outputs directly minimize the number of voltage violations. When the model is trained on a full year of day-ahead forecasts for the French HV-EHV network, it produces setpoints and switching decisions that cut the average number of violations. A reader would care because traditional solvers are too slow for real-time use on grids of this size, while the trained network offers a fast proxy that operators could apply after the forecasting stage.

Core claim

Tertiary voltage control can be cast as an amortized optimization problem solved by training a graph neural network self-supervised to output generator setpoints and line statuses that minimize voltage violations. After training on one year of full-scale French HV-EHV day-ahead forecasts, the resulting model serves as a practical TVC proxy and lowers the average number of voltage violations.

What carries the argument

Self-supervised graph neural network trained to produce voltage-control actions that minimize violations, acting as a fast amortized solver for the TVC mixed-integer nonlinear program.
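The amortized-optimization framing can be sketched in miniature: instead of re-solving an optimization problem per forecast, a single parametric policy is trained across many contexts to minimize a hinge surrogate of the violation count. Everything below is a hypothetical toy (a linearized grid response `A`, `B` and a linear policy `W`, all invented for illustration), not the paper's GNN or its actual loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearized "grid": bus voltages respond linearly to control
# actions a and to the day-ahead context c (all names hypothetical).
n_bus, n_ctl, n_ctx = 8, 3, 4
A = rng.normal(scale=0.1, size=(n_bus, n_ctl))   # action sensitivity
B = rng.normal(scale=0.05, size=(n_bus, n_ctx))  # context sensitivity
V_MIN, V_MAX = 0.95, 1.05                        # p.u. operating band

def voltages(a, c):
    return 1.0 + A @ a + B @ c

def violation_penalty(v):
    # Differentiable surrogate for the violation count:
    # hinge on each side of the operating band.
    return np.maximum(v - V_MAX, 0.0).sum() + np.maximum(V_MIN - v, 0.0).sum()

def penalty_grad_wrt_a(a, c):
    # Subgradient of the hinge penalty, chained through the linear grid.
    v = voltages(a, c)
    dp_dv = (v > V_MAX).astype(float) - (v < V_MIN).astype(float)
    return A.T @ dp_dv

# Amortized policy: one linear map from context to actions, trained
# across many contexts instead of re-solving each instance.
W = np.zeros((n_ctl, n_ctx))
contexts = rng.normal(size=(256, n_ctx))
lr = 0.5
for _ in range(200):
    grad_W = np.zeros_like(W)
    for c in contexts:
        grad_W += np.outer(penalty_grad_wrt_a(W @ c, c), c)
    W -= lr * grad_W / len(contexts)

before = np.mean([violation_penalty(voltages(np.zeros(n_ctl), c)) for c in contexts])
after = np.mean([violation_penalty(voltages(W @ c, c)) for c in contexts])
print(f"mean penalty: before {before:.4f} -> after {after:.4f}")
```

The paper replaces the linear policy with a GNN and the linear response with a power-flow model, but the training logic is the same: gradients of a violation surrogate flow back through the system model into the policy parameters, with no labeled optimal solutions required.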

If this is right

  • The GNN can act as a rapid post-processing step after the day-ahead forecasting pipeline.
  • It scales to networks where repeated MINLP solves are computationally infeasible.
  • Operators obtain actionable setpoints without requiring optimality certificates.
  • The same training procedure can be repeated periodically as new forecast data arrives.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could be tested on other large-scale control tasks where approximate rather than optimal solutions are acceptable.
  • Combining the GNN with online measurements might further reduce violations beyond the forecast-only setting.
  • The learned policy might reveal simple heuristic rules that operators could apply by hand.

Load-bearing premise

That a policy trained only to reduce violations on historical forecasts will remain useful and sufficient when applied to real-time operating conditions.

What would settle it

Running the trained model on a held-out set of actual real-time French grid snapshots: the claim fails if the number of voltage violations is not lower than under the baseline forecasting pipeline, and stands if it is clearly lower.

Figures

Figures reproduced from arXiv: 2604.03087 by Balthazar Donon, Geoffroy Jamgotchian, Hugo Kulesza, Louis Wehenkel.

Figure 1. Illustration of the various levers for action.
Figure 2. Histogram of voltage violation counts per context.
Figure 3. Histogram of normalized voltages over all buses.
Original abstract

A growing portion of operators workload is dedicated to Tertiary Voltage Control (TVC), namely the regulation of voltages by means of adjusting a series of setpoints and connection status. TVC may be framed as a Mixed Integer Non Linear Program, but state-of-the-art optimization methods scale poorly to large systems, making them impractical for real-scale and real-time decision support. Observing that TVC does not require any optimality guarantee, we frame it as an Amortized Optimization problem, addressed by the self-supervised training of a Graph Neural Network (GNN) to minimize voltage violations. As a first step, we consider the specific use case of post-processing the forecasting pipeline used by the French TSO, where the trained GNN would serve as a TVC proxy. After being trained on one year of full-scale HV-EHV French power grid day-ahead forecasts, our model manages to significantly reduce the average number of voltage violations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper frames Tertiary Voltage Control (TVC) as an amortized optimization problem and trains a self-supervised Graph Neural Network (GNN) on one year of day-ahead forecasts for the full-scale HV-EHV French power grid to minimize voltage violations, claiming that the resulting model significantly reduces the average number of voltage violations.

Significance. If the empirical results hold with proper validation, the work could enable scalable real-time TVC proxies for large transmission systems where traditional MINLP solvers are intractable, leveraging the graph structure of power networks without requiring optimality guarantees or labeled optimal solutions.

major comments (2)
  1. [Abstract] Abstract: the central claim that the model 'manages to significantly reduce the average number of voltage violations' is stated without any quantitative metrics, baselines, error bars, ablation studies, or statistical tests, rendering the magnitude and robustness of the improvement unverifiable.
  2. [Results] Results/Methods: no validation is provided that the learned policy generalizes from day-ahead forecast data to real-time operating points that include forecast errors, load/generation deviations, or topology changes, which is load-bearing for the practical TVC use case.
minor comments (1)
  1. [Abstract] Abstract: the description of the training objective as 'self-supervised' would benefit from an explicit equation or loss formulation even at a high level to clarify how voltage-violation count is differentiated.
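The minor comment can be made concrete: the quantity the paper reports (a violation count) is non-differentiable, so self-supervised training needs a smooth surrogate. One plausible form, assumed here rather than taken from the paper, is a per-bus hinge on the [0.95, 1.05] p.u. band:

```python
import numpy as np

V_MIN, V_MAX = 0.95, 1.05

def violation_count(v):
    # Non-differentiable quantity reported in the results.
    return int(np.sum((v < V_MIN) | (v > V_MAX)))

def violation_loss(v, margin=0.0):
    # Differentiable hinge surrogate usable as a training objective;
    # its subgradient is nonzero exactly at the violating buses.
    return float(np.sum(np.maximum(v - (V_MAX - margin), 0.0)
                        + np.maximum((V_MIN + margin) - v, 0.0)))

v = np.array([0.93, 0.97, 1.00, 1.06, 1.04])
print(violation_count(v), round(violation_loss(v), 3))  # → 2 0.03
```

A count of 2 maps to a loss of 0.03 (the total depth of the two excursions); the `margin` parameter is a common extra knob for penalizing near-violations, not something the abstract mentions.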

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on our manuscript. We address each major comment point by point below, indicating the revisions we will make to strengthen the paper.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that the model 'manages to significantly reduce the average number of voltage violations' is stated without any quantitative metrics, baselines, error bars, ablation studies, or statistical tests, rendering the magnitude and robustness of the improvement unverifiable.

    Authors: We agree that the abstract would be strengthened by including quantitative support for the central claim. In the revised version, we will update the abstract to report the specific average reduction in voltage violations (including the numerical value and percentage improvement relative to the unprocessed forecast baseline), along with a brief reference to the error bars and comparison setup from the results section. This change will make the magnitude of the improvement verifiable without altering the paper's length significantly. revision: yes

  2. Referee: [Results] Results/Methods: no validation is provided that the learned policy generalizes from day-ahead forecast data to real-time operating points that include forecast errors, load/generation deviations, or topology changes, which is load-bearing for the practical TVC use case.

    Authors: The work is explicitly scoped to the use case of post-processing day-ahead forecasts within the French TSO's forecasting pipeline, as stated in the abstract and introduction. Training and evaluation are performed exclusively on historical day-ahead forecast data to demonstrate the amortized self-supervised approach for that setting. We do not claim or evaluate generalization to real-time operating points that include additional forecast errors, load/generation deviations, or topology changes, as those would constitute a distinct operational scenario requiring separate datasets and protocols. We will revise the discussion and conclusion sections to explicitly delineate this scope, acknowledge the limitation for broader real-time TVC applications, and identify generalization testing as valuable future work. revision: partial

Circularity Check

0 steps flagged

No significant circularity in self-supervised training pipeline

full rationale

The paper frames TVC as amortized optimization and trains a GNN self-supervised on external one-year day-ahead forecast data to minimize a differentiable proxy for voltage violations. The reported reduction is an empirical performance metric on that data distribution, not a quantity that reduces by construction to the training inputs or to any self-citation. No equations redefine the output as the input, no fitted parameter is relabeled as a prediction, and the central claim rests on standard ML generalization rather than a load-bearing self-citation chain or imported uniqueness theorem. The derivation is therefore self-contained against the provided forecast data.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The claim rests on the standard graph representation of power grids and the assumption that a learned policy minimizing violations is practically useful; no new physical entities are introduced.

free parameters (1)
  • GNN architecture and training hyperparameters
    Model weights and optimizer settings are fitted during self-supervised training on the forecast dataset.
axioms (1)
  • domain assumption: the power grid can be represented as an undirected graph with buses as nodes and lines as edges for message passing.
    Invoked when applying GNN to the French transmission network topology.
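The axiom can be illustrated with a toy bus/line graph and one unweighted mean-aggregation step, the primitive that message-passing GNNs build on. The four-bus ring topology and the per-bus feature values are invented for illustration; real models add learned weights and many more features:

```python
import numpy as np

# Hypothetical miniature network: 4 buses in a ring, 4 lines
# (the real French HV-EHV grid has thousands of buses).
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0]])
x = np.array([[1.00], [0.98], [1.03], [0.97]])  # per-bus feature, e.g. voltage

def message_passing_step(x, edges):
    # Mean aggregation over undirected neighbours: each bus's new
    # state is the average of its neighbours' states.
    agg = np.zeros_like(x)
    deg = np.zeros((x.shape[0], 1))
    for i, j in edges:
        agg[i] += x[j]; agg[j] += x[i]
        deg[i] += 1;    deg[j] += 1
    return agg / np.maximum(deg, 1)

h = message_passing_step(x, edges)
print(h.ravel())  # bus 0 averages its neighbours 1 and 3: (0.98 + 0.97) / 2
```

Because the same aggregation is applied at every bus, the operation is defined by the topology alone, which is what lets one trained model run on any grid that satisfies the graph-representation assumption.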

pith-pipeline@v0.9.0 · 5467 in / 1261 out tokens · 48391 ms · 2026-05-13T19:09:21.851448+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

  1. [1]

    PowSyBl (Power System Blocks), a Power System Toolbox

    Committers of PowSyBl, “PowSyBl (Power System Blocks), a Power System Toolbox.” [Online]. Available: https://github.com/powsybl/

  2. [2]

    Essays on the ACOPF Problem: Formulations, Approximations, and Applications in the Electricity Markets

    A. R. Castillo, “Essays on the ACOPF Problem: Formulations, Approximations, and Applications in the Electricity Markets,” Ph.D. dissertation, Johns Hopkins University, 2016

  3. [3]

    A Particle Swarm Optimization for Reactive Power and Voltage Control in Electric Power Systems

    Y. Fukuyama and H. Yoshida, “A Particle Swarm Optimization for Reactive Power and Voltage Control in Electric Power Systems,” in Congress on Evolutionary Computation, 2001

  4. [4]

    Deep Learning

    I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, 1st ed. MIT Press, Cambridge, 2016

  5. [5]

    Approximation Capabilities of Multilayer Feedforward Networks

    K. Hornik, “Approximation Capabilities of Multilayer Feedforward Networks,” Neural Networks, 1991

  6. [6]

    Deep Reinforcement Learning for Electric Transmission Voltage Control

    B. L. Thayer and T. J. Overbye, “Deep Reinforcement Learning for Electric Transmission Voltage Control,” in IEEE Electric Power and Energy Conference, 2020

  7. [7]

    Deep Reinforcement Learning for Long-Term Voltage Stability Control

    H. Hagmar, L. A. Tuan, and R. Eriksson, “Deep Reinforcement Learning for Long-Term Voltage Stability Control,” arXiv: 2207.04240 [eess.SY], 2022

  8. [8]

    Design and Tests of Reinforcement-Learning-Based Optimal Power Flow Solution Generator

    H. Zhen, H. Zhai, W. Ma, L. Zhao, Y. Weng, Y. Xu, J. Shi, and X. He, “Design and Tests of Reinforcement-Learning-Based Optimal Power Flow Solution Generator,” Energy Reports, 2022

  9. [9]

    Reinforcement Learning: An Introduction

    R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. The MIT Press, 2018

  10. [10]

    DeepOPF: A Feasibility-Optimized Deep Neural Network Approach for AC Optimal Power Flow Problems

    X. Pan, M. Chen, T. Zhao, and S. H. Low, “DeepOPF: A Feasibility-Optimized Deep Neural Network Approach for AC Optimal Power Flow Problems,” arXiv: 2007.01002 [eess.SY], 2022

  11. [11]

    Tutorial on Amortized Optimization

    B. Amos, “Tutorial on Amortized Optimization,” arXiv: 2202.00665 [cs.LG], 2022

  12. [12]

    Deep Statistical Solvers & Power Systems Applications

    B. Donon, “Deep Statistical Solvers & Power Systems Applications,” Ph.D. dissertation, Université Paris-Saclay, 2022

  13. [13]

    Rte7000

    RTE, “Rte7000,” 2021. [Online]. Available: https://huggingface.co/datasets/OpenSynth/RTE700

  14. [14]

    The Graph Neural Network Model

    F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The Graph Neural Network Model,” IEEE Transactions on Neural Networks, 2008

  15. [15]

    Modeling Relational Data with Graph Convolutional Networks

    M. Schlichtkrull, T. N. Kipf, P. Bloem, R. Van Den Berg, I. Titov, and M. Welling, “Modeling Relational Data with Graph Convolutional Networks,” in European Semantic Web Conference. Springer, 2018

  16. [16]

    A Review of Graph Neural Networks and Their Applications in Power Systems

    W. Liao, B. Bak-Jensen, J. R. Pillai, Y. Wang, and Y. Wang, “A Review of Graph Neural Networks and Their Applications in Power Systems,” arXiv: 2101.10025 [cs.LG], 2021

  17. [17]

    Proximal Policy Optimization with Graph Neural Networks for Optimal Power Flow

    Á. López-Cardona, G. Bernárdez, P. Barlet-Ros, and A. Cabellos-Aparicio, “Proximal Policy Optimization with Graph Neural Networks for Optimal Power Flow,” arXiv: 2212.12470 [cs.AI], 2025

  18. [18]

    Unsupervised Optimal Power Flow Using Graph Neural Networks

    D. Owerko, F. Gama, and A. Ribeiro, “Unsupervised Optimal Power Flow Using Graph Neural Networks,” arXiv: 2210.09277 [eess.SY], 2022

  19. [19]

    Deep Reinforcement Learning for Optimal Power Flow with Renewables Using Graph Information

    J. Li, R. Zhang, H. Wang, Z. Liu, H. Lai, and Y. Zhang, “Deep Reinforcement Learning for Optimal Power Flow with Renewables Using Graph Information,” arXiv: 2112.11461 [cs.LG], 2022

  20. [20]

    Warm-Starting AC Optimal Power Flow with Graph Neural Networks

    F. Diehl, “Warm-Starting AC Optimal Power Flow with Graph Neural Networks,” in NeurIPS Workshop on Tackling Climate Change with Machine Learning, 2019

  21. [21]

    Topology-Aware Reinforcement Learning for Tertiary Voltage Control

    B. Donon, F. Cubélier, E. Karangelos, L. Wehenkel, L. Crochepierre, C. Pache, L. Saludjian, and P. Panciatici, “Topology-Aware Reinforcement Learning for Tertiary Voltage Control,” Electric Power Systems Research, 2024

  22. [22]

    Neural Ordinary Differential Equations

    T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. Duvenaud, “Neural Ordinary Differential Equations,” in NeurIPS, 2018

  23. [23]

    PyPowSyBl, a Python API for PowSyBl Toolbox

    Committers of PyPowSyBl, “PyPowSyBl, a Python API for PowSyBl Toolbox.” [Online]. Available: http://github.com/powsybl/pypowsybl

  24. [24]

    JAX: composable transformations of Python+NumPy

    J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang, “JAX: composable transformations of Python+NumPy,” 2018

  25. [25]

    Flax: A neural network library and ecosystem for JAX

    J. Heek, A. Levskaya, A. Oliver, M. Ritter, B. Rondepierre, A. Steiner, and M. van Zee, “Flax: A neural network library and ecosystem for JAX,” 2024

  26. [26]

    On Neural Differential Equations

    P. Kidger, “On Neural Differential Equations,” Ph.D. dissertation, University of Oxford, 2021

  27. [27]

    Adam: A Method for Stochastic Optimization

    D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” in International Conference on Learning Representations (ICLR), 2015

  28. [28]

    OpenLoadFlow, a Loadflow for PowSyBl Toolbox

    Committers of OpenLoadFlow, “OpenLoadFlow, a Loadflow for PowSyBl Toolbox.” [Online]. Available: https://github.com/powsybl/powsybl-open-loadflow