An End-to-end Building Load Forecasting Framework with Patch-based Information Fusion Network and Error-weighted Adaptive Loss
Pith reviewed 2026-05-10 12:34 UTC · model grok-4.3
The pith
A patch-based fusion network and error-weighted loss improve building load forecasts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that their end-to-end framework produces more accurate and robust building load predictions than existing methods, particularly under extreme conditions. The framework combines four parts: local-outlier-factor (LOF) anomaly correction; SVM-SHAP feature selection; a patch-based information fusion network that processes local blocks with shared GRU layers and residual connections, then dynamically weights them through a customized gate; and an error-weighted adaptive loss that scales penalties according to real-time error distributions.
What carries the argument
The patch-based information fusion network (PIF-Net) that divides the input series into patches, extracts temporal features via shared GRU units with residual connections, and fuses the hidden states with a gating module to weight patch importance, together with the error-weighted adaptive loss (EWAL) that combines rational quadratic and logarithmic terms to adjust penalties based on current error distribution.
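To make the mechanism concrete, here is a minimal Python sketch of the three steps named above: patching, a shared recurrent encoder, and gated fusion. It is not the authors' implementation. The scalar hidden state, the residual form (adding the patch mean), the plain softmax gate, and every name and weight value (`make_patches`, `encode_patch`, `gated_fusion`, `gru_w`) are assumptions chosen for brevity; the paper's customized gate is not specified in this summary.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_patches(series, patch_len, stride):
    """Split a 1-D series into fixed-length patches (overlapping if stride < patch_len)."""
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]


def encode_patch(patch, w):
    """Run a scalar-state GRU over one patch; the same weights w serve every patch."""
    h = 0.0
    for x in patch:
        z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])  # update gate
        r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])  # reset gate
        cand = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])
        h = (1.0 - z) * h + z * cand
    # Assumed residual form: add the patch mean as a skip path around the GRU.
    return h + sum(patch) / len(patch)


def gated_fusion(states, v, c):
    """Softmax gate over per-patch states; returns (weights, fused state)."""
    scores = [v * s + c for s in states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = sum(wt * s for wt, s in zip(weights, states))
    return weights, fused


# Hypothetical toy inputs and untrained weights, for illustration only.
series = [0.2, 0.4, 0.1, 0.9, 1.3, 0.8, 0.3, 0.2]
gru_w = {"wz": 0.5, "uz": 0.1, "bz": 0.0, "wr": 0.3, "ur": 0.2, "br": 0.0,
         "wh": 0.8, "uh": 0.4, "bh": 0.0}
patches = make_patches(series, patch_len=4, stride=2)
states = [encode_patch(p, gru_w) for p in patches]
weights, fused = gated_fusion(states, v=1.0, c=0.0)
```

Because `encode_patch` reuses the same `gru_w` for every patch, the parameter count does not grow with the number of patches, which is the usual motivation for sharing the encoder.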
If this is right
- The full framework yields higher prediction accuracy than baseline models on building load data.
- The adaptive loss component specifically improves robustness when loads reach extreme values.
- Preprocessing reduces the effect of anomalies and redundant variables before forecasting begins.
- End-to-end training allows the network to learn from cleaned and selected inputs without separate stages.
- The resulting forecasts can support more reliable demand-response decisions in energy systems.
Where Pith is reading between the lines
- The same patching and gating approach could be tested on other volatile time series such as grid-level demand or renewable generation.
- Replacing the GRU backbone with other recurrent or attention layers might reveal whether the fusion module improves performance independently of the base sequence model.
- Running the framework on additional buildings with different climate and usage profiles would check whether the reported gains hold beyond the original test sets.
Load-bearing premise
That the accuracy gains come from the specific combination of preprocessing steps, patch fusion architecture, and adaptive loss rather than from hyperparameter tuning that any standard model could also receive.
What would settle it
A head-to-head test on the same building datasets where a conventional forecasting model, after identical hyperparameter search, matches or exceeds the proposed framework's error metrics during periods of extreme load.
Original abstract
Accurate building load forecasting plays a critical role in facilitating demand response aggregation and optimizing energy management. However, the complex temporal dependencies and high volatility of building loads limit the improvement of prediction accuracy. To this end, we propose a novel end-to-end building load forecasting framework. Specifically, the framework can be divided into two main stages. In the two-stage data preprocessing module enhanced by interpretable feature selection, we utilize the Local Outlier Factor (LOF) algorithm to accurately detect and correct anomalies in the original building load series. Furthermore, we employ SVM-SHAP feature analysis to quantify the impact of environmental variables, filtering out critical feature combinations to mitigate redundancy. In the building load forecasting module, we propose the patch-based information fusion network (PIF-Net). This model applies patching technology to process input series into local blocks, extracting temporal features through a shared Gated Recurrent Unit (GRU) network with residual connections. Subsequently, an information fusion module based on a customized gating mechanism integrates the ensemble hidden states to weight the importance of different temporal patches dynamically. Additionally, the framework is trained using a novel Error-weighted Adaptive Loss (EWAL) function. By combining a rational quadratic function and logarithmic loss to dynamically adjust penalty weights based on real-time prediction error distributions, EWAL significantly enhances the model's robustness under extreme load conditions. Finally, extensive experiments demonstrate the effectiveness and superiority of our proposed framework.
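The LOF step in the abstract can be illustrated with a small pure-Python sketch that follows the standard definition (k-distance, reachability distance, local reachability density). Several choices here are assumptions, not taken from the paper: scoring on raw load values alone, the threshold of 1.5, and correcting a flagged point with the median of nearby unflagged readings. The helper names `lof_scores` and `correct_anomalies` and the toy series are hypothetical.

```python
import statistics


def lof_scores(values, k):
    """Local Outlier Factor for 1-D points (scores near 1 are inliers)."""
    n = len(values)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(values)")
    neighbors, k_dist = [], []
    for i in range(n):
        dists = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        kd = dists[k - 1][0]
        k_dist.append(kd)
        # The k-distance neighbourhood keeps ties at the k-th distance.
        neighbors.append([j for d, j in dists if d <= kd])
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], abs(values[i] - values[j])) for j in neighbors[i]]
        # Small constant avoids division by zero when points coincide.
        lrd.append(1.0 / (sum(reach) / len(reach) + 1e-12))
    return [sum(lrd[j] for j in neighbors[i]) / len(neighbors[i]) / lrd[i]
            for i in range(n)]


def correct_anomalies(values, k=3, threshold=1.5, window=2):
    """Flag points with LOF > threshold and replace them (assumed rule: local median)."""
    scores = lof_scores(values, k)
    flagged = [s > threshold for s in scores]
    cleaned = list(values)
    for i, bad in enumerate(flagged):
        if bad:
            lo, hi = max(0, i - window), min(len(values), i + window + 1)
            good = [values[j] for j in range(lo, hi) if not flagged[j]]
            if good:  # leave the point unchanged if no clean neighbour exists
                cleaned[i] = statistics.median(good)
    return cleaned, flagged


# Hypothetical hourly load readings with one obvious spike at index 4.
load = [10.0, 10.5, 9.8, 10.2, 50.0, 10.1, 9.9, 10.4, 10.3, 9.7]
cleaned, flagged = correct_anomalies(load)
```

A production pipeline would more likely call an established implementation and score multivariate windows rather than single values; the sketch only shows what the score measures.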
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a two-stage end-to-end building load forecasting framework. The first stage applies Local Outlier Factor (LOF) for anomaly detection/correction and SVM-SHAP for interpretable feature selection on environmental variables. The second stage introduces Patch-based Information Fusion Network (PIF-Net), which patches the input series, processes patches with a shared residual GRU, and fuses hidden states via a custom gating mechanism; training uses Error-weighted Adaptive Loss (EWAL) that combines rational quadratic and log losses to dynamically weight errors. The central claim is that extensive experiments demonstrate the framework's superiority over baselines and that EWAL specifically improves robustness under extreme load conditions.
Significance. If the experimental controls hold, the work could offer a practical advance for volatile building-load prediction by combining interpretable preprocessing with patch-wise temporal modeling and adaptive loss. The EWAL formulation and the explicit use of SHAP for feature filtering are potentially reusable ideas for other high-volatility time-series tasks in energy systems.
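The summary does not give the EWAL formula, so the following Python sketch is only one hypothetical way to combine a rational quadratic term with a logarithmic penalty and should not be read as the authors' loss. The assumptions are: the "current error distribution" is summarised by the batch mean absolute error, the per-sample weight is `2 - k(e)` with `k` the rational quadratic kernel, and the base penalty is `log(1 + |e|)`. The names `rational_quadratic` and `error_weighted_log_loss` and the toy values are invented for illustration.

```python
import math


def rational_quadratic(r, scale, alpha):
    """Rational quadratic kernel: (1 + r^2 / (2 * alpha * scale^2)) ** (-alpha)."""
    return (1.0 + r * r / (2.0 * alpha * scale * scale)) ** (-alpha)


def error_weighted_log_loss(y_true, y_pred, alpha=1.0, eps=1e-8):
    """Hypothetical error-weighted loss; returns (mean loss, per-sample weights)."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    # Assumed summary of the batch error distribution: mean absolute error.
    scale = sum(errors) / len(errors) + eps
    weights, total = [], 0.0
    for e in errors:
        # Weight lies in [1, 2) and grows as e becomes large relative to scale.
        w = 2.0 - rational_quadratic(e, scale, alpha)
        weights.append(w)
        total += w * math.log1p(e)  # logarithmic base penalty
    return total / len(errors), weights


# Hypothetical batch: three small errors and one large miss on a load peak.
y_true = [1.0, 2.0, 3.0, 10.0]
y_pred = [1.1, 1.9, 3.2, 6.0]
loss, weights = error_weighted_log_loss(y_true, y_pred)
```

In this form the weights are recomputed from each batch, so the same absolute error is penalised more heavily once typical errors shrink during training.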
major comments (2)
- [§5 (Experiments)] In §5 and Table X (main results), the reported superiority of PIF-Net + EWAL over GRU/Transformer/Informer baselines is not yet load-bearing because the manuscript does not state that every baseline received identical LOF + SVM-SHAP preprocessing and the same hyperparameter-search budget. Without this control, the headline gains cannot be attributed to the patch fusion or EWAL rather than to the two-stage preprocessing pipeline.
- [§4.3 (EWAL)] In §4.3 and the extreme-load ablation, the claim that EWAL 'significantly enhances robustness under extreme load conditions' requires a dedicated quantitative breakdown (e.g., MAE on the top 5% error quantile or on days with load spikes > 2σ) rather than only aggregate metrics; the current description leaves open whether the improvement is statistically significant or merely an artifact of the rational-quadratic component.
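The breakdown requested in the second comment could be computed along the following lines. This is a sketch under stated assumptions: "load spikes > 2σ" is read as true load above the series mean plus two population standard deviations, the error tail is the largest 5% of absolute errors (at least one sample), and the function names and toy data are hypothetical.

```python
import statistics


def mae(y_true, y_pred):
    """Mean absolute error over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae_on_load_spikes(y_true, y_pred, n_sigma=2.0):
    """MAE restricted to samples whose true load exceeds mean + n_sigma * std."""
    mean = sum(y_true) / len(y_true)
    std = statistics.pstdev(y_true)
    idx = [i for i, t in enumerate(y_true) if t > mean + n_sigma * std]
    if not idx:
        return None  # no spikes in this series
    return mae([y_true[i] for i in idx], [y_pred[i] for i in idx])


def mae_on_error_tail(y_true, y_pred, top_fraction=0.05):
    """Mean of the largest top_fraction of absolute errors (at least one sample)."""
    errors = sorted((abs(t - p) for t, p in zip(y_true, y_pred)), reverse=True)
    n_tail = max(1, int(top_fraction * len(errors)))
    return sum(errors[:n_tail]) / n_tail


# Hypothetical series: 19 ordinary readings and one spike that the model underpredicts.
y_true = [10.0] * 19 + [30.0]
y_pred = [10.5] * 19 + [24.0]
overall = mae(y_true, y_pred)
spike = mae_on_load_spikes(y_true, y_pred)
tail = mae_on_error_tail(y_true, y_pred)
```

On this toy series the aggregate MAE is 0.775 while the spike and tail MAE are both 6.0, which shows how an aggregate metric can hide poor behaviour at the extremes.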
minor comments (2)
- [§4.2] The notation for the gating mechanism in PIF-Net (Eq. (7)–(9)) uses several ad-hoc symbols (α, β, γ) without a consolidated table; a single symbol table would improve readability.
- [Figure 3] The model architecture diagram would benefit from explicit labels on the residual connections and the information-fusion block, matching the textual description.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help us improve the clarity and rigor of our work. We address each major comment below and will revise the manuscript to incorporate the requested clarifications and additional analyses.
- Referee: §5 (Experiments) and Table X (main results): the reported superiority of PIF-Net + EWAL over GRU/Transformer/Informer baselines is not yet load-bearing because the manuscript does not state that every baseline received identical LOF + SVM-SHAP preprocessing and the same hyperparameter-search budget. Without this control, the headline gains cannot be attributed to the patch fusion or EWAL rather than to the two-stage preprocessing pipeline.
  Authors: We confirm that all baseline models (GRU, Transformer, Informer) were trained with exactly the same LOF-based anomaly detection/correction and SVM-SHAP feature selection pipeline, as well as the identical hyperparameter search budget and validation protocol described in Section 5. This ensures the performance differences are attributable to the PIF-Net architecture and EWAL rather than preprocessing. We acknowledge that this equivalence was not stated explicitly in the original text. In the revised manuscript we will add a dedicated paragraph in §5.1 and a footnote to Table X clarifying that every baseline received identical preprocessing and search resources.
  Revision: yes
- Referee: §4.3 (EWAL) and the extreme-load ablation: the claim that EWAL 'significantly enhances robustness under extreme load conditions' requires a dedicated quantitative breakdown (e.g., MAE on the top 5% error quantile or on days with load spikes > 2σ) rather than only aggregate metrics; the current description leaves open whether the improvement is statistically significant or merely an artifact of the rational-quadratic component.
  Authors: We agree that aggregate metrics alone are insufficient to substantiate the robustness claim. In the revised version we will add a new subsection (or expanded Table in §5.3) reporting MAE on the top 5% error quantile, on days with load spikes > 2σ, and on the top 1% tail, together with paired statistical significance tests (Wilcoxon signed-rank) comparing EWAL against the rational-quadratic-only and log-only ablations. These results will be computed on the same test splits used for the main tables, directly addressing whether the gains are statistically meaningful or an artifact of any single loss component.
  Revision: yes
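For reference, the paired Wilcoxon signed-rank test mentioned in the rebuttal can be sketched in pure Python as follows. This version uses the normal approximation with a tie correction and no continuity correction, which is rough for small samples; an established statistics library would normally be preferred. The function name and the per-day MAE values are hypothetical and do not come from the paper.

```python
import math


def wilcoxon_signed_rank(a, b):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped and tied |differences| share their average rank.
    Returns (W, p) with W = min(W+, W-).
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (w - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return w, p


# Hypothetical per-day MAE for the full loss and for one ablation on the same test days.
mae_full = [0.8, 1.1, 0.9, 1.4, 1.0, 1.2, 0.7, 1.3, 0.95, 1.05]
mae_ablation = [1.0, 1.5, 1.0, 2.1, 1.6, 1.5, 1.2, 2.2, 1.75, 1.15]
w_stat, p_value = wilcoxon_signed_rank(mae_full, mae_ablation)
```

Pairing by test day matters here: the test compares the two variants on identical inputs, so day-to-day difficulty cancels out of the differences.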
Circularity Check
No circularity: explicit model components and empirical validation are self-contained
The paper proposes an end-to-end framework consisting of explicit, independently defined stages: LOF-based anomaly correction, SVM-SHAP feature filtering, PIF-Net (patching + shared residual GRU + custom gating fusion), and EWAL loss (rational quadratic + log combination). These are presented as constructed architectural choices rather than derived quantities that reduce to their inputs by definition or self-citation. No equations are shown that equate a 'prediction' to a fitted parameter, no uniqueness theorem is invoked from prior self-work, and the superiority claim rests on external experimental comparison rather than internal tautology. The derivation chain therefore remains non-circular and externally falsifiable via the reported benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- various hyperparameters in PIF-Net and EWAL
axioms (2)
- domain assumption: Building load data contains anomalies that can be accurately detected and corrected using the Local Outlier Factor algorithm
- domain assumption: Environmental variables have quantifiable impacts on load that can be filtered using SVM-SHAP without losing critical information
invented entities (2)
- Patch-based Information Fusion Network (PIF-Net): no independent evidence
- Error-weighted Adaptive Loss (EWAL): no independent evidence