EnergyMamba: An Uncertainty-Aware Graph-Enhanced Selective State Space Model for Energy Consumption Prediction
Pith reviewed 2026-06-28 19:08 UTC · model grok-4.3
The pith
Injecting grid topology into a selective state space model and pairing it with adaptive conformal quantile regression improves energy consumption forecasts and their uncertainty intervals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
EnergyMamba claims that two components together produce more accurate point predictions and better-calibrated uncertainty intervals under distribution shifts than prior approaches. The first is a Graph-Enhanced Selective State Space Model, which learns spatial context from grid topology and injects it into temporal dynamics. The second is an Adaptive Sequential Conformalized Quantile Regression module, which uses locally adaptive normalization and online feedback.
What carries the argument
Graph-Enhanced Selective State Space Model (GE-Mamba) that injects spatial context learned from the grid topology into the temporal dynamics of a selective state space model.
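The abstract does not spell out how the spatial context enters the state space model, so the following is only a minimal sketch of one plausible reading. It assumes the context is a single step of normalized-adjacency aggregation that is added to the input driving a diagonal selective recurrence. Every name here (`graph_enhanced_scan`, `w_g`, `w_delta`, `w_b`, `w_c`, `a_log`) and the toy ring graph are hypothetical and not taken from the paper.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softplus(x):
    return np.logaddexp(0.0, x)

def graph_enhanced_scan(x, adj, a_log, w_delta, w_b, w_c, w_g):
    """Selective scan over x of shape (T, N, D) for N graph nodes.

    Assumption (not from the paper): neighbor-aggregated features are added
    to the input before the input-dependent parameters delta, B, C are formed.
    """
    T, N, D = x.shape
    S = a_log.shape[1]                  # state size per channel
    a_norm = normalize_adjacency(adj)   # (N, N)
    A = -np.exp(a_log)                  # (D, S), negative so the state decays
    h = np.zeros((N, D, S))
    y = np.empty_like(x)
    for t in range(T):
        ctx = a_norm @ x[t] @ w_g       # (N, D) spatial context from neighbors
        u = x[t] + ctx                  # inject context into the temporal input
        delta = softplus(u @ w_delta)   # (N, D) positive step sizes
        B = u @ w_b                     # (N, S) input projection
        C = u @ w_c                     # (N, S) output projection
        a_bar = np.exp(delta[:, :, None] * A[None, :, :])  # discretized decay
        b_bar = delta[:, :, None] * B[:, None, :]
        h = a_bar * h + b_bar * u[:, :, None]
        y[t] = (h * C[:, None, :]).sum(axis=-1)
    return y

# Toy usage on a 4-node ring with random weights.
rng = np.random.default_rng(0)
T, N, D, S = 6, 4, 3, 2
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]], dtype=float)
x = rng.normal(size=(T, N, D))
params = dict(
    a_log=rng.normal(size=(D, S)),
    w_delta=0.1 * rng.normal(size=(D, D)),
    w_b=0.1 * rng.normal(size=(D, S)),
    w_c=0.1 * rng.normal(size=(D, S)),
    w_g=0.1 * rng.normal(size=(D, D)),
)
y = graph_enhanced_scan(x, ring, **params)
print(y.shape)  # (6, 4, 3)
```

In this sketch, setting `w_g` to zero removes the graph term and each node is processed by an independent selective recurrence, which is the kind of fallback comparison a spatial ablation would need.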
If this is right
- The full framework reaches roughly 5 percent higher prediction accuracy than the 15 state-of-the-art baselines it is compared against.
- Uncertainty quantification improves by roughly 6 percent over the same baselines, with the adaptive conformal module recalibrating intervals under distribution shifts.
- Online feedback in the quantile regression module allows dynamic interval adjustment without retraining.
- The framework applies to four large-scale datasets spanning Florida, New York, and California.
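The online-feedback idea in the list above can be illustrated with a well-known stand-in: the adaptive conformal inference update of Gibbs and Candès applied to conformalized quantile regression scores. This is a sketch under that assumption and is not claimed to be the paper's AS-CQR. The function names, the `scale` argument (a placeholder for a locally adaptive normalizer, assumed positive), and the synthetic data are all hypothetical.

```python
import math
import random

def empirical_quantile(values, level):
    """Conformal quantile: the ceil(level * (n + 1))-th smallest score."""
    if level >= 1.0 or not values:
        return math.inf          # widest possible interval
    if level <= 0.0:
        return -math.inf         # empty interval
    ordered = sorted(values)
    k = math.ceil(level * (len(ordered) + 1)) - 1
    return math.inf if k >= len(ordered) else ordered[k]

def adaptive_cqr(q_lo, q_hi, y, scale, target_alpha=0.1, gamma=0.05):
    """Sequentially widen or narrow quantile bands using miss feedback.

    q_lo, q_hi: base quantile forecasts; scale: positive local normalizers.
    No model is retrained; only the working miscoverage level is adjusted.
    """
    alpha_t = target_alpha
    scores, intervals, misses = [], [], []
    for lo, hi, obs, s in zip(q_lo, q_hi, y, scale):
        margin = empirical_quantile(scores, 1.0 - alpha_t) * s
        lower, upper = lo - margin, hi + margin
        miss = 0 if lower <= obs <= upper else 1
        # Feedback step: a miss lowers alpha_t (wider intervals next time),
        # a hit raises it slightly (narrower intervals next time).
        alpha_t += gamma * (target_alpha - miss)
        # Normalized CQR score, stored only after the interval was issued.
        scores.append(max(lo - obs, obs - hi) / s)
        intervals.append((lower, upper))
        misses.append(miss)
    return intervals, misses

# Synthetic check: a fixed band, with noise that triples halfway through.
rng = random.Random(0)
steps = 2000
y = [rng.gauss(0.0, 1.0 if t < steps // 2 else 3.0) for t in range(steps)]
intervals, misses = adaptive_cqr([-1.0] * steps, [1.0] * steps, y, [1.0] * steps)
print(sum(misses) / len(misses))  # long-run miss rate, expected near target_alpha
```

The feedback step keeps the long-run miss rate close to `target_alpha` even when the noise level changes, which is the property the summary refers to as dynamic interval adjustment without retraining.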
Where Pith is reading between the lines
- The same graph-injection pattern could transfer to other spatiotemporal forecasting problems such as traffic flow or building load where topology is known.
- The online feedback loop might reduce the need for periodic full retraining when new data arrives continuously.
- If the spatial component is removed, performance should fall back to that of a plain selective state space model on the same data.
Load-bearing premise
The grid topology supplies spatial dependencies that meaningfully improve the temporal modeling inside the state space model, and the adaptive conformal module can recalibrate intervals without introducing new calibration failures.
What would settle it
Run the same four datasets but replace the real grid topology with a random graph of the same size and measure whether the accuracy and uncertainty gains disappear.
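A minimal sketch of how such a control graph could be generated, assuming the real topology is available as an undirected edge list. The helper name `random_graph_like` and the example edges are hypothetical. Only the node and edge counts are matched here; a degree-preserving rewiring would be a stricter control.

```python
import random
from itertools import combinations

def random_graph_like(num_nodes, edges, seed=0):
    """Random simple undirected graph with the same node and edge counts."""
    rng = random.Random(seed)
    # Count each undirected edge once, regardless of orientation.
    n_edges = len({tuple(sorted(e)) for e in edges})
    candidates = list(combinations(range(num_nodes), 2))  # no self-loops
    return sorted(rng.sample(candidates, n_edges))

# Hypothetical 6-node topology with 5 lines.
real_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
control_edges = random_graph_like(6, real_edges, seed=1)
print(len(control_edges))  # 5
```

Training the same model once on `real_edges` and once on `control_edges` isolates whether the gains come from the actual topology or merely from having some graph.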
Original abstract
Energy consumption prediction is essential for efficient grid management, demand-side optimization, and sustainable energy planning. Although advanced machine learning methods have been employed for better prediction performance, existing works have two key limitations: (1) they usually formulate this task as a purely time-series prediction problem without explicitly modeling the spatial dependencies among different regions, and (2) they fail to provide reliable predictions with uncertainty estimates under abnormal situations such as extreme weather events. To advance existing research, we propose EnergyMamba, an uncertainty-aware spatiotemporal learning framework for accurate and reliable energy consumption prediction, which comprises two key components: (i) a novel Graph-Enhanced Selective State Space Model (GE-Mamba) that injects spatial context learned from the grid topology into the temporal dynamics, enabling coupled spatiotemporal modeling, and (ii) an Adaptive Sequential Conformalized Quantile Regression (AS-CQR) module, which includes locally adaptive normalization and an online feedback mechanism to dynamically calibrate prediction intervals under potential distribution shifts. We evaluate EnergyMamba on four large-scale real-world datasets from Florida, New York, and California. Results show EnergyMamba achieves around 5% improvement in prediction accuracy and 6% improvement in uncertainty quantification over 15 state-of-the-art baselines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes EnergyMamba, an uncertainty-aware spatiotemporal framework for energy consumption prediction. It consists of a Graph-Enhanced Selective State Space Model (GE-Mamba) that injects spatial context learned from the grid topology into the temporal dynamics of a selective state space model, and an Adaptive Sequential Conformalized Quantile Regression (AS-CQR) module with locally adaptive normalization and an online feedback mechanism to calibrate prediction intervals under distribution shifts. The framework is evaluated on four large-scale real-world datasets from Florida, New York, and California, where it reportedly achieves around 5% improvement in prediction accuracy and 6% improvement in uncertainty quantification over 15 state-of-the-art baselines.
Significance. If the empirical results hold under rigorous validation, the work would advance spatiotemporal modeling for energy systems by coupling graph-based spatial dependencies with selective state space models and by providing adaptive uncertainty estimates suitable for extreme events. The focus on real-world large-scale datasets from multiple regions supports potential applicability to grid management and demand optimization.
major comments (1)
- Abstract: the abstract states empirical improvements but supplies no information on experimental protocol, baseline selection criteria, statistical testing, error bars, data splits, or handling of potential confounds such as temporal leakage; without these details it is impossible to verify the central claim of ~5% accuracy and ~6% uncertainty improvements over 15 baselines.
minor comments (1)
- The description of how the grid topology is converted into a graph and the precise injection mechanism into the selective state space model would benefit from additional clarification to assess whether spatial dependencies are meaningfully captured.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback. We agree that the abstract should be revised to provide sufficient context on the experimental protocol so that the reported improvements can be more readily assessed. Below we respond point-by-point to the major comment.
- Referee: Abstract: the abstract states empirical improvements but supplies no information on experimental protocol, baseline selection criteria, statistical testing, error bars, data splits, or handling of potential confounds such as temporal leakage; without these details it is impossible to verify the central claim of ~5% accuracy and ~6% uncertainty improvements over 15 baselines.
  Authors: We acknowledge that the current abstract is concise and does not enumerate these methodological details. The full experimental protocol is described in Section 4: we employ chronological train/validation/test splits on each of the four datasets to avoid temporal leakage; the 15 baselines were selected to cover recent state-of-the-art methods across time-series, graph, and uncertainty-aware categories; statistical significance is assessed with paired t-tests (p < 0.05) and results are reported with standard deviations over five independent runs; the AS-CQR module explicitly addresses distribution shifts via online feedback. To address the referee’s concern, we will revise the abstract to include a brief statement on the evaluation protocol, the use of temporal splits, and the statistical testing performed.
  Revision: yes
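The two protocol elements named in the response, chronological splits and a paired t-test over five runs, can be sketched as follows. The split fractions, function names, and error values are hypothetical, since the response gives none of these details.

```python
import math
from statistics import mean, stdev

def chronological_split(series, train_frac=0.7, val_frac=0.1):
    """Split in time order so no future observation leaks into training."""
    n = len(series)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]

def paired_t_statistic(errors_a, errors_b):
    """t statistic for paired per-run errors of two models (df = runs - 1)."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

train, val, test = chronological_split(list(range(100)))

# Hypothetical test errors from five independent runs of two models.
model_errors = [1.00, 1.10, 0.90, 1.00, 1.20]
baseline_errors = [1.20, 1.30, 1.10, 1.25, 1.35]
t_stat = paired_t_statistic(model_errors, baseline_errors)
# With 5 runs (4 degrees of freedom), |t| > 2.776 corresponds to p < 0.05, two-sided.
print(t_stat)
```

Pairing matters here because each run of the model is compared with the baseline run on the same data split, so run-to-run variation that both share cancels out in the differences.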
Circularity Check
No significant circularity
The paper's central claims consist of empirical performance gains (approximately 5% accuracy and 6% uncertainty quantification) measured against 15 external state-of-the-art baselines on four independent real-world datasets. The abstract describes two architectural components (GE-Mamba and AS-CQR) whose value is asserted via these external comparisons rather than any internal derivation that reduces a reported metric to a fitted parameter or self-referential definition. No equations, self-citations, or ansatzes are supplied that would allow a load-bearing step to collapse by construction to the inputs. This is the normal case of an empirical ML paper whose results remain falsifiable against held-out data and independent baselines.
Venue (from the paper's running header): KDD ’26, August 09–13, 2026, Jeju Island, Republic of Korea