Machine Learning and Deep Learning Models for Short Term Electricity Price Forecasting in Australia's National Electricity Market
Pith reviewed 2026-05-08 06:19 UTC · model grok-4.3
The pith
Tree-based models outperform LSTM and SVR for electricity price forecasting in South Australia's volatile market, though every model's mean absolute percentage error on price exceeds 90 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under a unified benchmark with identical lag features, rolling statistics, cyclic temporal encodings, and an 85/15 chronological split, tree-based models including GBRT outperform LSTM and SVR on price prediction, with R-squared up to 0.88. Even so, every model's mean absolute percentage error on price exceeds 90 percent, and over 65 percent of GBRT predictions carry relative errors above 10 percent. Demand prediction is far easier: AWMLSTM and GBRT reach an R-squared of 0.96 with mean absolute percentage error below 32 percent, and 74.37 percent of GBRT samples fall within 5 percent error.
What carries the argument
The unified benchmark framework that applies the same data preprocessing, feature engineering with lag features, rolling statistics, cyclic temporal encodings, and an 85/15 chronological train-test split across AWMLSTM, CatBoost, GBRT, LSTM, LightGBM, and SVR.
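The shared pipeline described above can be illustrated with a minimal, dependency-free sketch. It is not the paper's code: the function names, the lag set, the window length, and the toy price series are all assumptions chosen for illustration. It shows lag features, a rolling mean computed only from past values, a sine/cosine encoding of the hour of day, and an 85/15 split that preserves time order.

```python
import math

def build_features(prices, hours, lags=(1, 2), window=3):
    """Return (X, y) rows that use only information available before time t."""
    start = max(max(lags), window)
    X, y = [], []
    for t in range(start, len(prices)):
        lag_feats = [prices[t - k] for k in lags]
        past = prices[t - window:t]          # excludes prices[t], so no leakage
        roll_mean = sum(past) / window
        angle = 2 * math.pi * hours[t] / 24  # hour of day mapped onto a circle
        X.append(lag_feats + [roll_mean, math.sin(angle), math.cos(angle)])
        y.append(prices[t])
    return X, y

def chronological_split(X, y, train_frac=0.85):
    """Keep the earliest rows for training and the latest rows for testing."""
    cut = round(len(X) * train_frac)
    return X[:cut], y[:cut], X[cut:], y[cut:]

# Hypothetical hourly prices, including a negative interval and a spike.
prices = [50.0, 55.0, -10.0, 300.0, 45.0, 40.0, 60.0, 65.0, 70.0, 30.0,
          20.0, 25.0, 80.0, 90.0, 35.0, 42.0, 48.0, 52.0, 58.0, 61.0,
          63.0, 47.0, 44.0]
hours = [i % 24 for i in range(len(prices))]

X, y = build_features(prices, hours)
X_train, y_train, X_test, y_test = chronological_split(X, y)
```

Because every feature at time t is built from values strictly before t and the split never shuffles rows, the test rows are always later than the training rows, which is the property the benchmark relies on.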
If this is right
- Tree-based models should be prioritized over LSTM and SVR for price forecasting in similar high-volatility electricity markets.
- Hybrid models combining trees with transformers may improve capture of extreme price events.
- Data augmentation for spikes and post-prediction error correction techniques could reduce the observed high relative errors.
- Demand forecasting benefits substantially more from the same features and models, achieving lower errors across the board.
Where Pith is reading between the lines
- The persistent high errors suggest that external signals such as real-time renewable generation or weather data may be needed beyond historical lags to explain remaining volatility.
- Applying the same benchmark to other NEM regions with varying renewable shares would reveal how well the tree-model advantage generalizes.
- Custom loss functions that penalize negative prices and spikes differently during training could address the imbalance the current setup leaves untouched.
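One way to picture the custom-loss idea in the last point is a squared-error loss whose weight depends on the regime of the actual price. The sketch below is an assumption-laden illustration, not something the paper proposes: the function name, the spike threshold, and both weights are hypothetical values.

```python
def regime_weighted_mse(y_true, y_pred, spike_threshold=300.0,
                        spike_weight=5.0, negative_weight=3.0):
    """Weighted mean squared error that emphasizes spikes and negative prices."""
    total, weight_sum = 0.0, 0.0
    for actual, predicted in zip(y_true, y_pred):
        if actual >= spike_threshold:
            w = spike_weight       # hypothetical extra penalty for price spikes
        elif actual < 0:
            w = negative_weight    # hypothetical extra penalty for negative prices
        else:
            w = 1.0
        total += w * (actual - predicted) ** 2
        weight_sum += w
    return total / weight_sum

# Toy example: one ordinary price, one negative price, one spike.
loss = regime_weighted_mse([50.0, -20.0, 1000.0], [40.0, -10.0, 900.0])
```

On the toy data the unweighted mean squared error is 3400, while the weighted version is 5600, because the error on the spike counts five times as much. A gradient-boosting library would additionally need the gradient and Hessian of such a loss.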
Load-bearing premise
The chosen lag features, rolling statistics, cyclic encodings, and chronological split adequately handle non-stationarity, negative prices, and structural changes such as the shift to five-minute settlement, without introducing bias or missing future regime shifts.
What would settle it
Retraining and testing the same models on data split around the five-minute settlement transition date to check whether the performance ranking of tree-based models over LSTM and SVR remains stable or reverses.
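The check described above amounts to scoring each model separately on either side of the transition. A minimal sketch follows, with hypothetical function names and toy data; the cutover date used here is a placeholder and should be replaced with the actual transition date for the dataset at hand.

```python
from datetime import date

def r_squared(y_true, y_pred):
    """Coefficient of determination for one set of predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

def r_squared_by_regime(timestamps, y_true, y_pred, transition):
    """Score observations before and on/after `transition` separately."""
    scores = {}
    for name, keep in (("pre", lambda t: t < transition),
                       ("post", lambda t: t >= transition)):
        idx = [i for i, t in enumerate(timestamps) if keep(t)]
        scores[name] = r_squared([y_true[i] for i in idx],
                                 [y_pred[i] for i in idx])
    return scores

# Placeholder cutover date and toy values, for illustration only.
transition = date(2021, 10, 1)
timestamps = [date(2021, 9, 29), date(2021, 9, 30),
              date(2021, 10, 1), date(2021, 10, 2)]
scores = r_squared_by_regime(timestamps, [10.0, 20.0, 30.0, 50.0],
                             [12.0, 18.0, 30.0, 50.0], transition)
```

Running this for every model and comparing the "pre" and "post" rankings would show whether the tree-model advantage survives the regime change.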
Original abstract
Short term electricity price forecast is essential in competitive power markets, yet electricity price series exhibit high volatility, irregularity, and non-stationarity. This phenomenon is pronounced in the South Australian region of the National Electricity Market, where high renewable penetration drives price volatility and frequent negative price intervals, while structural changes such as the transition to five-minute settlement further complicate forecast. To address these challenges, this study develops a unified benchmark framework. Under identical data preprocessing, feature engineering with lag features, rolling statistics, cyclic temporal encodings, and so on, and an 85% to 15% chronological train test split, six algorithms are systematically compared, including AWMLSTM, CatBoost, GBRT, LSTM, LightGBM, and SVR. The results show that for price prediction, tree-based models, especially GBRT with an R squared value of 0.88, generally outperform LSTM and SVR. However, all models achieve a mean absolute percentage error above 90%, and more than 65% of GBRT predictions have relative errors above 10%, which highlights the inherent difficulty of price forecast. For demand prediction, all models perform substantially better than in price prediction. AWMLSTM and GBRT achieve an R2 value of 0.96 with mean absolute percentage error below 32%, and GBRT has 74.37% of samples within 5% error, while LSTM and SVR perform less accurately in both tasks. Future improvements should focus on hybrid models such as tree plus transformers, data augmentation for extreme events, and error correction to better capture price spikes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a unified benchmark for short-term electricity price and demand forecasting in South Australia's NEM using six ML/DL models (AWMLSTM, CatBoost, GBRT, LSTM, LightGBM, SVR). With lag features, rolling statistics, cyclic encodings, and an 85/15 chronological split, it reports that tree-based models, particularly GBRT (R²=0.88 for price), outperform LSTM and SVR, but all models exhibit MAPE >90% for prices, underscoring forecasting difficulty; demand forecasting achieves higher accuracy (R²=0.96, MAPE<32%).
Significance. If the empirical comparisons hold under rigorous validation, this provides a valuable reference point for the challenges of price prediction in volatile, renewable-heavy markets with negative prices and market rule changes. The explicit reporting of poor MAPE and relative error distributions is a strength, as is the side-by-side evaluation of tree ensembles versus recurrent networks under identical preprocessing.
major comments (3)
- Methodology (data split and feature engineering): The single 85/15 chronological split and uniform application of lag/rolling/cyclic features across the entire series do not include regime indicators or pre/post-transition analysis for the five-minute settlement structural change highlighted in the abstract. This is load-bearing for the central claim of GBRT's R²=0.88 superiority, as tree models may overfit pre-shift patterns while post-shift volatility affects other models differently.
- Results and discussion: Although multiple metrics (R², MAPE, error distributions) are used, the manuscript lacks statistical significance tests (e.g., Diebold-Mariano) for model performance differences and details on hyperparameter optimization procedures, which are necessary to substantiate the outperformance claims given the high price volatility.
- Abstract and introduction: The handling of negative prices is mentioned as a challenge but not detailed in the preprocessing or model inputs; this could impact the MAPE calculations and relative error assessments for price forecasting.
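For readers unfamiliar with the Diebold-Mariano test named in the second comment, here is a minimal sketch for one-step-ahead forecasts under squared-error loss. The function name and the toy error series are assumptions. The simple variance estimate used here ignores autocorrelation in the loss differential, so multi-step horizons would need a long-run (HAC) variance instead.

```python
import math

def diebold_mariano(errors_a, errors_b):
    """One-step-ahead DM test under squared-error loss.

    Returns (statistic, two-sided p-value from the normal approximation).
    A negative statistic means model A has the lower average loss.
    """
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value

# Toy forecast errors for two hypothetical models.
errors_a = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0]
errors_b = [3.0, -2.0, 4.0, -3.0, 2.0, -4.0]
stat, p_value = diebold_mariano(errors_a, errors_b)
```

With real data the inputs would be the test-set residuals of two models on the same timestamps, for example GBRT against LSTM.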
minor comments (3)
- Abstract: The phrase 'and so on' in the feature description is imprecise; a complete list of engineered features should be provided for reproducibility.
- Results: The percentage of samples within 5% error for demand (74.37% for GBRT) is useful but should be accompanied by similar breakdowns for price predictions beyond the >10% relative error note.
- Conclusion: Suggestions for future work (hybrid models, data augmentation) are appropriate but could reference specific prior work on transformer-based time series or spike detection in electricity markets.
Simulated Author's Rebuttal
We thank the referee for their insightful and constructive comments, which have helped us improve the clarity and robustness of our manuscript. We address each major comment in detail below and indicate the revisions made to the manuscript.
- Referee: Methodology (data split and feature engineering): The single 85/15 chronological split and uniform application of lag/rolling/cyclic features across the entire series do not include regime indicators or pre/post-transition analysis for the five-minute settlement structural change highlighted in the abstract. This is load-bearing for the central claim of GBRT's R²=0.88 superiority, as tree models may overfit pre-shift patterns while post-shift volatility affects other models differently.
Authors: We agree that explicitly accounting for the five-minute settlement transition is important given its mention in the abstract. The chronological 85/15 split was employed to ensure temporal causality and prevent information leakage from future to past, which is a standard practice in time-series forecasting. Nevertheless, to address the referee's concern, we have introduced a binary regime indicator feature distinguishing pre- and post-transition periods in the revised feature engineering pipeline. Furthermore, we have added a supplementary analysis splitting the test set into pre- and post-transition subsets, where GBRT continues to demonstrate superior performance relative to the other models. This supports that the reported superiority is not solely due to overfitting pre-shift patterns.
revision: yes
- Referee: Results and discussion: Although multiple metrics (R², MAPE, error distributions) are used, the manuscript lacks statistical significance tests (e.g., Diebold-Mariano) for model performance differences and details on hyperparameter optimization procedures, which are necessary to substantiate the outperformance claims given the high price volatility.
Authors: We concur that rigorous statistical validation is essential, particularly in the presence of high volatility. In the revised manuscript, we have detailed the hyperparameter optimization procedure, which utilized a grid search over a predefined parameter space combined with rolling time-series cross-validation on the training set to select optimal hyperparameters for each model. Additionally, we have incorporated Diebold-Mariano tests to assess the statistical significance of performance differences between GBRT and the other models. The results, now presented in a new table, indicate that the improvements are statistically significant for the majority of comparisons. These additions bolster the credibility of our outperformance claims.
revision: yes
- Referee: Abstract and introduction: The handling of negative prices is mentioned as a challenge but not detailed in the preprocessing or model inputs; this could impact the MAPE calculations and relative error assessments for price forecasting.
Authors: We appreciate this observation. Negative prices are a key characteristic of the South Australian market and are preserved without any shifting or absolute transformation in the target variable to maintain their economic significance. The same applies to input features derived from prices. Regarding MAPE, we employ the conventional formula based on absolute percentage errors, which remains well-defined for negative values but can indeed be sensitive to near-zero prices. We have expanded the preprocessing subsection in the revised manuscript to explicitly describe this approach and discuss its implications for interpreting the high MAPE values observed.
revision: yes
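The sensitivity of MAPE to near-zero prices mentioned in the last response is easy to see numerically. The sketch below uses the conventional formula with made-up values; every prediction is off by exactly 10 units, so only the size of the actual price changes between cases.

```python
def mape(y_true, y_pred):
    """Conventional MAPE in percent; undefined when an actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)

# Ordinary price levels: errors of 10 percent and 5 percent.
typical = mape([100.0, 200.0], [90.0, 210.0])

# Same absolute error of 10, but one actual price is near zero
# and one is negative.
near_zero = mape([100.0, 0.5, -20.0], [90.0, 10.5, -10.0])
```

Here `typical` is 7.5 percent, while `near_zero` is roughly 687 percent, driven almost entirely by the single interval priced at 0.5. This is consistent with a model reaching a high R-squared while still reporting MAPE above 90 percent on a series with many near-zero and negative intervals.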
Circularity Check
Empirical benchmark with held-out chronological split shows no circularity
The paper conducts a standard empirical comparison of six ML/DL models (GBRT, CatBoost, LightGBM, LSTM, AWMLSTM, SVR) for electricity price and demand forecasting. It applies fixed preprocessing, lag/rolling/cyclic features, and an 85/15 chronological train-test split, then reports direct test-set metrics (R²=0.88 for GBRT on price, MAPE>90% for all, etc.). The paper contains no equations, derivations, fitted parameters renamed as predictions, self-citations for uniqueness theorems, or ansatzes; performance numbers are computed after training on external held-out data and do not reduce to the inputs by construction. The derivation chain consists of self-contained experimental results.
Axiom & Free-Parameter Ledger
free parameters (1)
- Model hyperparameters (learning rates, tree depths, LSTM units, etc.)
axioms (2)
- Domain assumption: Chronological 85/15 split prevents leakage and simulates operational forecasting conditions
- Domain assumption: Lag features, rolling statistics, and cyclic encodings capture the relevant temporal structure
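The hyperparameters listed as free parameters have to be tuned without breaking the chronological assumption, and the simulated rebuttal mentions rolling time-series cross-validation for this purpose. A minimal sketch of one common variant, an expanding training window with fixed-size validation blocks, is given below. The function name and fold sizes are assumptions, and the authors' exact scheme is not specified in the source.

```python
def expanding_window_splits(n_samples, n_folds, val_size):
    """Yield (train_indices, val_indices) with validation always after training."""
    first_val_start = n_samples - n_folds * val_size
    if first_val_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        val_start = first_val_start + fold * val_size
        yield range(0, val_start), range(val_start, val_start + val_size)

# Toy example: 10 training-set rows, 3 folds, 2 validation rows per fold.
splits = [(list(train), list(val))
          for train, val in expanding_window_splits(10, 3, 2)]
```

Each candidate hyperparameter setting would be fitted on every training range and scored on the matching validation range, and the setting with the best average score would then be refitted on the full training portion before touching the 15 percent test set.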