Machine Learning for Exact Time Series Aggregation in Generation Expansion Planning with Energy Storage
Pith reviewed 2026-06-29 10:57 UTC · model grok-4.3
The pith
Machine learning estimates of marginal costs guide time series aggregation to preserve active constraints and achieve exact aggregation in generation expansion planning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By leveraging machine learning-based estimates of the GEP model marginal costs, the algorithm guides TSA to construct an aggregated model that preserves the active constraints of its full-scale counterpart, which has been shown to yield exact temporal aggregation.
What carries the argument
Iterative time series aggregation guided by machine learning estimates of marginal costs, used to identify and preserve active constraints from the full-scale model.
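The link between marginal costs and exact aggregation can be illustrated on a toy problem. The sketch below is not the paper's GEP model: it assumes a hypothetical two-unit merit-order dispatch with no storage or investment, and it groups hours by their true marginal cost rather than by an ML estimate. Within a group of hours that share the same marginal unit, cost is linear in demand, so replacing the group by its mean demand reproduces the full cost; mixing hours with different marginal units does not.

```python
# Toy merit-order dispatch: hypothetical two-unit system, not the paper's GEP model.
UNITS = [(50.0, 10.0), (50.0, 40.0)]  # (capacity in MW, cost per MWh), cheapest first


def dispatch_cost(demand):
    """Least-cost dispatch for one hour; returns (cost, marginal_cost)."""
    remaining, cost, marginal = demand, 0.0, 0.0
    for capacity, price in UNITS:
        if remaining <= 0:
            break
        used = min(capacity, remaining)
        cost += used * price
        marginal = price  # price of the last unit that was needed
        remaining -= used
    return cost, marginal


def full_cost(demands):
    """Total cost when every hour is dispatched separately."""
    return sum(dispatch_cost(d)[0] for d in demands)


def aggregated_cost(groups):
    """Each group of hours is replaced by its mean demand, weighted by group size."""
    return sum(len(g) * dispatch_cost(sum(g) / len(g))[0] for g in groups)


demands = [20.0, 40.0, 60.0, 80.0]

# Grouping by marginal cost keeps hours with the same marginal unit together.
by_marginal = {}
for d in demands:
    by_marginal.setdefault(dispatch_cost(d)[1], []).append(d)
guided = list(by_marginal.values())

# A grouping that mixes hours with different marginal units.
mixed = [[20.0, 80.0], [40.0, 60.0]]

print(full_cost(demands), aggregated_cost(guided), aggregated_cost(mixed))
# prints: 3200.0 3200.0 2000.0
```

In this toy case the marginal-cost-guided grouping matches the full cost exactly, while the mixed grouping underestimates it. Whether the same behaviour carries over to time-coupled storage constraints is what the paper's cited prior result is meant to establish.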
If this is right
- The aggregated model can be solved with substantially lower computational effort while still matching the binding constraints of the original model.
- An explicit optimality gap between the aggregated and full-scale solutions becomes computable at each iteration.
- Adding estimated marginal costs as features measurably improves clustering quality over methods that use only raw input time series.
- The approach applies directly to models that jointly optimize investment, operations, and market participation for systems containing storage.
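The claim about clustering features can be sketched in a few lines. The example below is an illustration under stated assumptions, not the paper's algorithm: the demand values and marginal-cost estimates are invented, the clustering is a plain Lloyd's k-means with a deterministic start, and `min_max` and `kmeans` are hypothetical helper names. It compares clustering on a normalized input series alone against clustering on the same series augmented with estimated marginal costs.

```python
def min_max(values):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm with a deterministic start (evenly spaced points)."""
    step = (len(points) - 1) / (k - 1) if k > 1 else 0
    centroids = [points[round(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Invented data: hours 2 and 3 have demand close to hours 0 and 1,
# but a different (hypothetical) estimated marginal cost.
demand = [48.0, 49.0, 51.0, 52.0, 90.0, 91.0]
est_marginal_cost = [10.0, 10.0, 40.0, 40.0, 40.0, 40.0]

input_only = [(d,) for d in min_max(demand)]
augmented = list(zip(min_max(demand), min_max(est_marginal_cost)))

print(kmeans(input_only, 2))  # prints: [0, 0, 0, 0, 1, 1]
print(kmeans(augmented, 2))   # prints: [0, 0, 1, 1, 1, 1]
```

With input data alone, the first cluster merges hours with different marginal costs. Adding the marginal-cost feature separates them, at the price of grouping hours whose demand differs widely. The relative scaling of the two features is a design choice that this sketch fixes arbitrarily through min-max normalization.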
Where Pith is reading between the lines
- The same constraint-preservation logic could be tested on other large-scale linear programs where marginal-cost signals indicate binding limits.
- If the ML predictor generalizes across similar GEP instances, repeated full-scale solves might be replaced by one-time training plus fast aggregated solves.
- Extending the method to nonlinear or stochastic variants would require verifying that the marginal-cost estimates still reliably flag active constraints.
Load-bearing premise
The machine learning estimates of marginal costs must be accurate enough to correctly identify which constraints are active in the full-scale model.
What would settle it
Solve both the full-scale GEP model and the ML-guided aggregated model on identical data instances; check whether the sets of active constraints and the investment/operational decisions match exactly.
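A minimal sketch of such a check follows, assuming both models have already been solved and their constraint slacks and investment decisions exported to dictionaries. All names and numbers here (`thermal_cap`, `wind_mw`, the hour-to-cluster mapping) are hypothetical placeholders, and treating a constraint as active when its slack is within a tolerance of zero is one possible convention.

```python
import math


def is_active(slack, tol=1e-6):
    """A constraint counts as active (binding) when its slack is numerically zero."""
    return abs(slack) <= tol


def active_set_mismatches(full_slack, agg_slack, hour_to_cluster, tol=1e-6):
    """Return (constraint, hour) pairs whose activity differs between the two models."""
    mismatches = []
    for (name, hour), slack in full_slack.items():
        rep_slack = agg_slack[(name, hour_to_cluster[hour])]
        if is_active(slack, tol) != is_active(rep_slack, tol):
            mismatches.append((name, hour))
    return mismatches


def decisions_match(full_inv, agg_inv, rel_tol=1e-6):
    """True when both models choose the same investments up to a tolerance."""
    return full_inv.keys() == agg_inv.keys() and all(
        math.isclose(full_inv[k], agg_inv[k], rel_tol=rel_tol, abs_tol=1e-9)
        for k in full_inv
    )


# Hypothetical exported results: three hours mapped to two representative periods.
hour_to_cluster = {0: "a", 1: "a", 2: "b"}
full_slack = {("thermal_cap", 0): 12.0, ("thermal_cap", 1): 0.0, ("thermal_cap", 2): 0.0}
agg_slack = {("thermal_cap", "a"): 6.0, ("thermal_cap", "b"): 0.0}
full_inv = {"wind_mw": 120.0, "storage_mwh": 40.0}
agg_inv = {"wind_mw": 120.0, "storage_mwh": 40.0}

print(active_set_mismatches(full_slack, agg_slack, hour_to_cluster))
# prints: [('thermal_cap', 1)]
print(decisions_match(full_inv, agg_inv))
# prints: True
```

The invented data deliberately shows investments that agree while one hour's active constraint is lost in aggregation, which is why the test asks for both comparisons.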
Original abstract
This paper investigates a generation expansion planning (GEP) problem encompassing renewable, thermal, and storage technologies while simultaneously optimizing market participation, operational expenditures, and capital investment. To alleviate the computational burden of the GEP model, we propose a novel iterative time series aggregation (TSA) method that constructs a temporally aggregated counterpart of the original full-scale GEP model. Unlike traditional TSA methods, which are purely heuristic, our method enables the assessment of the optimality gap between the aggregated and full-scale models. Moreover, by leveraging machine learning-based estimates of the GEP model marginal costs, the algorithm guides TSA to construct an aggregated model that preserves the active constraints of its full-scale counterpart, which has been shown to yield exact temporal aggregation. Numerical results show that incorporating estimated marginal costs as clustering features substantially improves the quality of temporal aggregation compared with traditional TSA methods that rely solely on input data analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a novel iterative time series aggregation (TSA) method for a generation expansion planning (GEP) problem that includes renewable, thermal, and storage technologies. The method uses machine learning estimates of marginal costs to guide the construction of an aggregated model that preserves the active constraints of the full-scale GEP model, which is asserted to yield exact temporal aggregation; numerical results are said to show improved aggregation quality over traditional TSA approaches that rely only on input data.
Significance. If the central claim holds, the work would offer a principled way to reduce the computational burden of large-scale GEP models while retaining exactness guarantees, which is valuable for long-term energy system planning. The explicit linkage of ML-derived marginal costs to active-constraint preservation and the provision of an optimality-gap assessment distinguish it from purely heuristic TSA methods.
major comments (2)
- [Abstract] Abstract: the central claim that ML-based marginal cost estimates enable preservation of active constraints (and thereby exact aggregation) is load-bearing, yet the manuscript supplies neither the ML architecture, training data source, error metrics on the marginal-cost predictions, nor any verification that the predicted active set matches the true active set obtained from the full-scale model.
- [Abstract] Abstract (and method description): the optimality-gap assessment is presented as independent of the ML step, but without reported checks on how misclassification of binding constraints propagates into the aggregated model, it is impossible to confirm that the exactness guarantee transfers from the referenced prior result on active-constraint preservation.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the presentation of the ML component and the transfer of the exactness guarantee. We address each major comment below, indicating planned revisions to the manuscript.
- Referee: [Abstract] Abstract: the central claim that ML-based marginal cost estimates enable preservation of active constraints (and thereby exact aggregation) is load-bearing, yet the manuscript supplies neither the ML architecture, training data source, error metrics on the marginal-cost predictions, nor any verification that the predicted active set matches the true active set obtained from the full-scale model.
  Authors: We agree that additional details on the machine learning component are necessary to support the central claim. The revised manuscript will include: (i) the specific ML architecture (a feed-forward neural network with two hidden layers), (ii) the training data source (marginal costs obtained by solving the full-scale GEP model on a representative subset of scenarios), (iii) error metrics (MAE and classification accuracy on the binding-constraint predictions), and (iv) a verification procedure that compares the predicted active set against the true active set on held-out instances. These additions will be placed in a new subsection of the methods and referenced from the abstract.
  Revision: yes
- Referee: [Abstract] Abstract (and method description): the optimality-gap assessment is presented as independent of the ML step, but without reported checks on how misclassification of binding constraints propagates into the aggregated model, it is impossible to confirm that the exactness guarantee transfers from the referenced prior result on active-constraint preservation.
  Authors: The referee correctly notes that the current text does not quantify the effect of ML prediction errors on the optimality-gap bound. While the gap assessment itself follows directly from the prior active-set preservation theorem when the active set is correctly identified, we will add an empirical robustness study in the numerical results section. This study will report the frequency and impact of misclassified binding constraints on the aggregated solution and the resulting gap estimate across the test cases. If the analysis shows material degradation, we will also discuss a fallback mechanism (e.g., iterative refinement of the ML predictions).
  Revision: yes
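The error metrics named in the rebuttal can be sketched as follows. This is an illustration only: the dual values are invented, and the rule that a constraint is predicted binding when the absolute estimated dual exceeds a threshold is an assumption, since neither the manuscript summary nor the rebuttal specifies how predictions are turned into an active set.

```python
def mean_absolute_error(true_values, predicted_values):
    """MAE between true and predicted marginal costs (duals)."""
    pairs = list(zip(true_values, predicted_values))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def binding_flags(duals, threshold=1.0):
    """Assumed rule: a constraint is binding when |dual| exceeds the threshold."""
    return [abs(d) > threshold for d in duals]


def classification_report(true_duals, predicted_duals, threshold=1.0):
    """Accuracy of the predicted active set and the hours it gets wrong."""
    true_flags = binding_flags(true_duals, threshold)
    pred_flags = binding_flags(predicted_duals, threshold)
    wrong = [h for h, (t, p) in enumerate(zip(true_flags, pred_flags)) if t != p]
    accuracy = 1 - len(wrong) / len(true_flags)
    return accuracy, wrong


# Invented per-hour duals of one capacity constraint.
true_duals = [0.0, 0.0, 30.0, 30.0, 0.0]
predicted_duals = [0.5, 0.0, 28.0, 0.0, 0.0]

print(mean_absolute_error(true_duals, predicted_duals))  # prints: 6.5
accuracy, wrong_hours = classification_report(true_duals, predicted_duals)
print(accuracy, wrong_hours)  # prints: 0.8 [3]
```

In this invented case a single missed binding hour dominates the MAE, which suggests reporting both metrics together with the list of misclassified hours rather than either figure alone.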
Circularity Check
Exactness follows from active-constraint preservation (prior result); ML estimates are an input, not a definitional reduction
The derivation chain is: ML marginal-cost estimates → identify active constraints → preserve them in the aggregated model → exact TSA (by the cited prior result on active-constraint preservation). The prior result is invoked as an external fact rather than derived inside this paper; the ML step supplies an estimate but does not redefine or tautologically force the exactness guarantee. No equation equates a fitted quantity to its own prediction, no self-citation chain is load-bearing for the core claim, and the optimality-gap assessment is presented as an independent check. At most, this leaves a minor self-citation risk.