Grid2op: A testbed for power grid control
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Hierarchical Reinforcement Learning with Runtime Safety Shielding for Power Grid Operation
A hierarchical RL policy paired with a runtime safety shield using forward simulation achieves longer survival, lower line loading, and zero-shot generalization on Grid2Op benchmarks including stress tests and unseen large grids.