A Generalized Synthetic Control Method for Baseline Estimation in Demand Response Services
Pith reviewed 2026-05-10 04:38 UTC · model grok-4.3
The pith
A generalized synthetic control method transforms baseline estimation in demand response into a dynamic counterfactual prediction problem.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We develop a generalized SCM framework that transforms baseline estimation into a dynamic counterfactual prediction problem by augmenting the donor representation with exogenous features, lagged treated load, and selected lagged donor signals. This enriched representation allows the estimator to capture autoregressive dependence, delayed donor-response patterns, and error-correction effects beyond the scope of standard SCM. The framework further accommodates nonlinear predictors when linear weighting is inadequate, with the greatest benefit arising in limited-data settings. Experiments on the Ausgrid smart-meter dataset show consistent improvements over classical SCM and strong benchmark methods.
What carries the argument
The augmented donor representation in the generalized synthetic control method, which incorporates lagged treated load and donor signals to model temporal dynamics in load data.
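As a rough illustration of what such an augmented representation could look like, the sketch below assembles one feature row per time step from contemporaneous donor loads, exogenous features, lagged treated load, and lagged donor loads. The function name, argument layout, and default lags are assumptions made for this example; the paper's own lag-selection procedure is not reproduced here, so every donor simply receives the same lags.

```python
def build_augmented_rows(treated, donors, exogenous,
                         treated_lags=(1,), donor_lags=(1,)):
    """Build (rows, targets) for a dynamic counterfactual regression.

    treated:   list of T load values for the treated customer
    donors:    list of J donor series, each of length T
    exogenous: list of T feature lists (for example temperature)
    Hypothetical helper; the lag choices are illustrative, not the paper's.
    """
    max_lag = max(list(treated_lags) + list(donor_lags) + [0])
    rows, targets = [], []
    # Start at max_lag so that every requested lag is available.
    for t in range(max_lag, len(treated)):
        row = [series[t] for series in donors]          # contemporaneous donors
        row += list(exogenous[t])                       # exogenous features
        row += [treated[t - k] for k in treated_lags]   # lagged treated load
        row += [series[t - k]                           # lagged donor signals
                for series in donors for k in donor_lags]
        rows.append(row)
        targets.append(treated[t])
    return rows, targets
```

Dropping the last three blocks of each row recovers the static donor-only representation of classical SCM, which makes the contrast between the two estimators explicit.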
If this is right
- The estimator captures autoregressive dependence in load patterns.
- It accounts for delayed donor-response patterns and error-correction effects.
- Performance gains are largest in settings with limited data.
- Nonlinear predictors can be used when linear weighting is insufficient.
- The method consistently outperforms classical SCM and benchmark methods on real-world smart-meter data from Ausgrid.
Where Pith is reading between the lines
- If the dynamic augmentation generalizes, it could improve counterfactual estimation in other time-series applications like policy evaluation.
- The method might be combined with modern machine learning techniques to further enhance predictive accuracy in energy systems.
- Testing on datasets from different regions could reveal how well the lagged signal selection holds across varying grid conditions.
Load-bearing premise
The assumption that adding lagged treated load and selected lagged donor signals captures the relevant temporal structure without introducing bias or overfitting while keeping the donor pool valid.
What would settle it
A finding that the generalized method fails to reduce prediction error relative to classical SCM on a new held-out set of smart-meter data, or that the lagged terms cause overfitting under cross-validation.
Original abstract
Baseline estimation is critical to Demand Response (DR) settlement in electricity markets, yet existing machine learning methods remain limited in predictive performance, while methodologies from causal inference and counterfactual prediction are still underutilized in this domain. We introduce a Generalized Synthetic Control Method that builds on the classical Synthetic Control Method (SCM) from econometrics. While SCM provides a powerful framework for counterfactual estimation, classical SCM remains a static estimator: it fits the treated unit as a combination of contemporaneous donor units and therefore ignores predictable temporal structure in the residual error. We develop a generalized SCM framework that transforms baseline estimation into a dynamic counterfactual prediction problem by augmenting the donor representation with exogenous features, lagged treated load, and selected lagged donor signals. This enriched representation allows the estimator to capture autoregressive dependence, delayed donor-response patterns, and error-correction effects beyond the scope of standard SCM. The framework further accommodates nonlinear predictors when linear weighting is inadequate, with the greatest benefit arising in limited-data settings. Experiments on the Ausgrid smart-meter dataset show consistent improvements over classical SCM and strong benchmark methods, with the dominant performance gains driven by dynamic augmentation.
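The static fit that the abstract attributes to classical SCM can be sketched as follows. This is a minimal illustration, assuming the usual classical SCM constraint that donor weights are nonnegative and sum to one, and using projected gradient descent as a stand-in solver; the solver used in the paper is not stated here, and both function names are hypothetical.

```python
def project_to_simplex(values):
    """Euclidean projection of a vector onto the probability simplex."""
    ordered = sorted(values, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(ordered, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate      # keep the threshold of the last valid j
    return [max(v - theta, 0.0) for v in values]


def fit_scm_weights(treated, donors, iterations=500):
    """Fit simplex-constrained donor weights on pre-treatment data.

    treated: list of T pre-treatment loads of the treated unit
    donors:  list of J donor series, each of length T
    """
    n_donors = len(donors)
    weights = [1.0 / n_donors] * n_donors
    # The sum of squared donor values upper-bounds the gradient's
    # Lipschitz constant, so 1/bound is a safe (if conservative) step.
    bound = sum(x * x for series in donors for x in series)
    step = 1.0 / bound if bound > 0 else 1.0
    for _ in range(iterations):
        residuals = [
            sum(w * series[t] for w, series in zip(weights, donors)) - treated[t]
            for t in range(len(treated))
        ]
        gradient = [sum(r * x for r, x in zip(residuals, series))
                    for series in donors]
        weights = project_to_simplex(
            [w - step * g for w, g in zip(weights, gradient)]
        )
    return weights
```

Because the weights depend only on contemporaneous donor values, any autocorrelation left in the residuals goes unused, which is the limitation the generalized method targets.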
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a Generalized Synthetic Control Method (GSCM) for baseline estimation in demand response (DR) services. It extends classical SCM by augmenting the donor representation with exogenous features, lagged treated load, and lagged donor signals to model dynamic temporal structure (autoregressive effects, delayed responses, error correction). The framework also permits nonlinear predictors when linear weighting is insufficient. Experiments on the Ausgrid smart-meter dataset report consistent improvements over classical SCM and benchmark methods, with gains attributed primarily to the dynamic augmentation, especially in limited-data regimes.
Significance. If the dynamic augmentation can be shown to preserve unbiased counterfactual estimation, the work would usefully bridge causal inference methods with practical DR settlement needs in electricity markets. The focus on limited-data performance and explicit comparison to SCM is a strength; reproducible code or parameter-free derivations would further increase its value.
major comments (2)
- [§4] §4 (Dynamic SCM formulation): the central claim that augmenting with lagged treated load yields an unbiased counterfactual requires explicit specification of how these lags are constructed during the post-treatment window. If realized (treated) load values are inserted directly, the estimator risks learning from post-treatment spillovers (load shifting or anticipation), violating the classical SCM pre-treatment-only fitting principle. A concrete procedure (recursive prediction, synthetic lag substitution, or pre-treatment-only restriction) must be stated and validated.
- [§5] §5 (Experiments on Ausgrid): the reported performance gains lack error bars, statistical significance tests, or ablation isolating the contribution of each augmentation component (exogenous features vs. lagged treated load vs. lagged donors). Without these, it is impossible to confirm that the dynamic terms drive the improvement rather than overfitting or dataset-specific artifacts.
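The paired comparison requested in the second major comment could be computed along the following lines. This is a sketch under stated assumptions: the per-customer (or per-fold) error lists are hypothetical inputs, and only the test statistic, degrees of freedom, and standard error are returned, since the Python standard library has no t-distribution CDF. A p-value could be obtained separately, for example with `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic for two methods evaluated on the same cases.

    errors_a, errors_b: equal-length lists of errors (for example MAE
    per customer) for method A and method B. Hypothetical helper.
    Returns (t_statistic, degrees_of_freedom, standard_error).
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error lists must have equal length")
    differences = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_diff = statistics.fmean(differences)
    # Sample standard deviation of the differences (n - 1 denominator).
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    if standard_error == 0:
        raise ValueError("differences are constant; t is undefined")
    return mean_diff / standard_error, n - 1, standard_error
```

Running the same comparison for each ablated variant (without exogenous features, without lagged treated load, without lagged donors) would give the component-wise evidence the referee asks for.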
minor comments (2)
- [Abstract, §3] The abstract and §3 would benefit from one or two key equations showing the augmented optimization objective and the prediction step, rather than purely verbal description.
- [§4] Notation for the donor pool and lag selection procedure should be defined consistently before use in the experimental tables.
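For orientation, the kind of equations the first minor comment asks for might take the following form. The notation is an assumption introduced here, not the paper's: $y_{0,t}$ is the treated load, $y_{j,t}$ the load of donor $j$, $x_t$ the exogenous features, $\mathcal{T}_0$ the pre-treatment period, $\Delta^{J}$ the probability simplex, $p$ the number of treated-load lags, $\mathcal{S}$ the selected set of donor-lag pairs, and $\mathcal{F}$ the (linear or nonlinear) predictor class.

```latex
% Classical SCM: static, simplex-constrained fit on pre-treatment data
\hat{w} = \arg\min_{w \in \Delta^{J}}
  \sum_{t \in \mathcal{T}_0} \Bigl( y_{0,t} - \sum_{j=1}^{J} w_j \, y_{j,t} \Bigr)^{2},
\qquad
\hat{y}_{0,t} = \sum_{j=1}^{J} \hat{w}_j \, y_{j,t}

% Generalized SCM (sketch): augmented input vector z_t
z_t = \bigl( y_{1,t}, \dots, y_{J,t},\; x_t,\;
             y_{0,t-1}, \dots, y_{0,t-p},\;
             \{ y_{j,t-\ell} \}_{(j,\ell) \in \mathcal{S}} \bigr)

\hat{f} = \arg\min_{f \in \mathcal{F}}
  \sum_{t \in \mathcal{T}_0} \bigl( y_{0,t} - f(z_t) \bigr)^{2},
\qquad
\hat{y}_{0,t} = \hat{f}(z_t)
```

Restricting $z_t$ to the contemporaneous donor loads and $\mathcal{F}$ to simplex-weighted sums recovers the classical estimator as a special case.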
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. These have helped us strengthen the methodological clarity and experimental rigor of the paper. We address each major comment below and have revised the manuscript accordingly.
- Referee: [§4] §4 (Dynamic SCM formulation): the central claim that augmenting with lagged treated load yields an unbiased counterfactual requires explicit specification of how these lags are constructed during the post-treatment window. If realized (treated) load values are inserted directly, the estimator risks learning from post-treatment spillovers (load shifting or anticipation), violating the classical SCM pre-treatment-only fitting principle. A concrete procedure (recursive prediction, synthetic lag substitution, or pre-treatment-only restriction) must be stated and validated.
  Authors: We agree that an explicit description of lag construction in the post-treatment window is essential to preserve the unbiasedness property of the counterfactual estimator. In the revised §4 we now specify that lagged treated load enters the model only during the pre-treatment fitting stage. For post-treatment counterfactual generation we employ recursive prediction: at each future time step t, the lagged treated-load input is replaced by the model's own predicted counterfactual value from step t-1. This is formalized with updated equations and pseudocode in the revised section. We have also added a small simulation study confirming that the recursive procedure introduces no detectable bias relative to the classical SCM under the standard no-spillover assumption.
  Revision: yes
- Referee: [§5] §5 (Experiments on Ausgrid): the reported performance gains lack error bars, statistical significance tests, or ablation isolating the contribution of each augmentation component (exogenous features vs. lagged treated load vs. lagged donors). Without these, it is impossible to confirm that the dynamic terms drive the improvement rather than overfitting or dataset-specific artifacts.
  Authors: We accept that the original experimental reporting was insufficiently rigorous. In the revised §5 and a new appendix we now report mean performance with standard-error bars computed over 10 random seeds and 5-fold cross-validation. We include paired t-tests (with p-values) comparing GSCM against each baseline. We further present a full ablation table that isolates the marginal contribution of (i) exogenous features, (ii) lagged treated load, and (iii) lagged donor signals. The ablation confirms that the lagged-treated-load term accounts for the largest share of the observed gains, especially in the limited-donor-data regime, while the other components provide smaller but complementary improvements. The revised code repository has also been updated to reproduce all new tables and figures.
  Revision: yes
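The recursive prediction described in the first response can be sketched as follows, assuming a linear model with a single treated-load lag whose coefficients have already been fitted on pre-treatment data. The function name, the one-lag restriction, and the coefficient arguments are illustrative assumptions, not the authors' implementation.

```python
def recursive_counterfactual(history, donors_post, coef_donors,
                             coef_lag, intercept=0.0):
    """Roll a fitted one-lag linear model forward over the event window.

    history:     pre-treatment treated load (at least one value)
    donors_post: list of donor-load vectors, one per event time step
    coef_donors: fitted coefficient per donor (hypothetical inputs)
    coef_lag:    fitted coefficient on the lagged treated load
    """
    if not history:
        raise ValueError("need at least one pre-treatment observation")
    previous = history[-1]          # last load observed before the event
    predictions = []
    for donor_values in donors_post:
        estimate = (intercept
                    + coef_lag * previous
                    + sum(c * d for c, d in zip(coef_donors, donor_values)))
        predictions.append(estimate)
        # Feed the prediction back in; realized load during the event is
        # never used, so treatment effects cannot leak into the baseline.
        previous = estimate
    return predictions
```

One consequence of this design is that prediction errors can accumulate over long event windows, which is why a validation of the recursive procedure such as the simulation study mentioned above matters.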
Circularity Check
No significant circularity in the generalized SCM framework
The paper proposes extending classical SCM by augmenting donor representations with exogenous features, lagged treated load, and lagged donor signals to create a dynamic counterfactual estimator. This is a direct methodological extension rather than any redefinition of the baseline quantity in terms of itself or a fitted parameter renamed as a prediction. No equations are shown that reduce the output to the inputs by construction, no uniqueness theorems are imported from self-citations, and no ansatz is smuggled via prior work. The derivation chain is self-contained as an independent augmentation of an existing causal inference tool, with performance claims supported by experiments on external data rather than tautological fits.