arxiv: 2409.05884 · v2 · submitted 2024-08-26 · 💻 cs.CY · cs.LG

Integrating the Expected Future in Load Forecasts with Contextually Enhanced Transformer Models

Pith reviewed 2026-05-23 22:09 UTC · model grok-4.3

classification 💻 cs.CY cs.LG
keywords energy forecasting · transformer models · contextual information · timetable data · load forecasting · sequence-to-sequence · railway energy · building energy

The pith

Contextually enhanced transformer models cut energy load forecast errors by incorporating future timetable and occupancy data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper frames energy load forecasting as a sequence-to-sequence task that combines historical patterns with forward-looking contextual inputs such as timetables and planned occupancy. It introduces transformer architectures modified to ingest this dynamic planning information directly. In a nationwide railway energy case study the addition of timetable data produces an average 26.6 percent drop in mean absolute error; a separate building-energy study using office occupancy schedules yields a 56.3 percent average reduction. The method is shown to outperform other current sequence models on the same data.

Core claim

The paper treats load forecasting as a combined forecasting-regression problem solved with sequence-to-sequence transformers, so the models integrate both past observations and explicit future contextual signals (timetables, occupancy plans). This reduces mean absolute error by an average of 26.6 percent on railway energy consumption and 56.3 percent on building energy consumption, and it also lowers the frequency of large errors.
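
To make the headline percentages concrete, here is a minimal sketch of how a relative MAE reduction is commonly computed. The function names and the toy numbers are illustrative assumptions, not code or data from the paper, and this page does not state how the paper averages across models and seeds.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between observations and forecasts."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_mae_reduction(y_true, pred_without_context, pred_with_context):
    """Fractional MAE drop from the context-free to the context-aware forecast."""
    mae_base = mean_absolute_error(y_true, pred_without_context)
    mae_ctx = mean_absolute_error(y_true, pred_with_context)
    return (mae_base - mae_ctx) / mae_base


# Toy data, invented for illustration only.
load = [10.0, 12.0, 15.0, 11.0]
forecast_no_context = [12.0, 10.0, 13.0, 13.0]    # every step is off by 2.0
forecast_with_context = [11.0, 11.0, 14.0, 12.0]  # every step is off by 1.0

reduction = relative_mae_reduction(load, forecast_no_context, forecast_with_context)
print(f"relative MAE reduction: {reduction:.1%}")  # prints "relative MAE reduction: 50.0%"
```

A reported reduction of 26.6 percent would correspond to `reduction` being about 0.266 under this definition.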

What carries the argument

Contextually-enhanced transformer models that accept both historical time series and dynamic forward-looking planning sequences as joint inputs to a sequence-to-sequence architecture.
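
One plausible way to lay out such joint inputs is sketched below. This is an assumption for illustration: the page does not describe how the paper embeds or attends to context, and `build_seq2seq_sample` along with all variable names is hypothetical.

```python
def build_seq2seq_sample(load, context, t, history_len, horizon):
    """Assemble one sample at forecast origin t (hypothetical layout).

    load:    list of floats, one observed load value per time step
    context: list of feature lists per time step (e.g. planned train counts),
             assumed known for future steps because it comes from a plan
    """
    if t - history_len < 0 or t + horizon > len(load):
        raise ValueError("window does not fit inside the series")
    # Encoder side: observed load together with the context that accompanied it.
    encoder_input = [[load[i]] + list(context[i]) for i in range(t - history_len, t)]
    # Decoder side: future load is unknown, so only the planned context is supplied.
    decoder_input = [list(context[i]) for i in range(t, t + horizon)]
    # Target: the future load the model is trained to predict.
    target = [load[i] for i in range(t, t + horizon)]
    return encoder_input, decoder_input, target


# Toy series, invented for illustration only.
load = [5.0, 6.0, 9.0, 8.0, 4.0, 7.0]
planned_trains = [[1], [2], [4], [3], [0], [2]]
enc, dec, tgt = build_seq2seq_sample(load, planned_trains, t=4, history_len=3, horizon=2)
print(enc)  # [[6.0, 2], [9.0, 4], [8.0, 3]]
print(dec)  # [[0], [2]]
print(tgt)  # [4.0, 7.0]
```

The point of the layout is that the decoder receives planned context for exactly the steps it must forecast, while the load itself appears only on the historical side.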

If this is right

  • Large forecast errors become less frequent, easing operational strain on power grids.
  • Intra-day trading costs tied to forecast inaccuracy decline.
  • The same architecture applies to other domains that possess advance schedules, such as building or industrial load.
  • Performance gains hold across multiple state-of-the-art baseline models.
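
The notion of "large forecast errors" in the first bullet can be made measurable by counting steps whose absolute percentage error exceeds a threshold, in the spirit of the Figure 15 caption ("outlier counts ... plotted against the MAPE threshold"). The sketch below is an assumption about how such a count might be defined; the thresholds and data are invented.

```python
def absolute_percentage_errors(y_true, y_pred):
    """Per-step |error| / |observation|, skipping zero observations."""
    return [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred) if t != 0]


def count_outliers(y_true, y_pred, threshold):
    """Number of steps whose absolute percentage error exceeds the threshold."""
    return sum(1 for e in absolute_percentage_errors(y_true, y_pred) if e > threshold)


# Toy data, invented for illustration only.
observed = [100.0, 200.0, 50.0, 80.0]
forecast = [110.0, 150.0, 49.0, 120.0]
# Per-step absolute percentage errors: 0.10, 0.25, 0.02, 0.50

counts = {thr: count_outliers(observed, forecast, thr) for thr in (0.05, 0.2, 0.4)}
print(counts)  # {0.05: 3, 0.2: 2, 0.4: 1}
```

Comparing such counts for forecasts with and without future context would show whether the tail of the error distribution shrinks, not just the mean.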

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Utilities could reduce reserve margins if the contextual inputs remain accurate farther into the future.
  • The approach may extend to traffic or water-demand forecasting where similar planning documents exist.
  • Real-time updates to timetable data could be streamed into the model to produce rolling corrections.

Load-bearing premise

Reliable, high-resolution contextual planning data such as timetables or occupancy schedules will be available at the forecast horizon and can be fed into the model without introducing new errors.

What would settle it

Applying the same contextually-enhanced transformer to a fresh railway energy dataset where timetable data is deliberately withheld or corrupted and checking whether the mean absolute error reduction disappears or reverses.
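
A rough sketch of such a test harness follows. Everything in it is hypothetical: `toy_forecaster` is a stand-in for a trained model, the data are invented, and a real experiment would evaluate (or retrain) the actual transformer on each context variant.

```python
import random


def mae(y_true, y_pred):
    """Mean absolute error over paired observations and forecasts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def withhold(context):
    """Replace every context value with zero, i.e. give the model no plan."""
    return [0.0 for _ in context]


def corrupt(context, rng):
    """Shuffle the plan in time so values stay plausible but are misaligned."""
    shuffled = list(context)
    rng.shuffle(shuffled)
    return shuffled


def ablation_report(forecast_fn, context, y_true, seed=0):
    """MAE of forecast_fn under intact, withheld, and corrupted context."""
    rng = random.Random(seed)
    variants = {
        "intact": list(context),
        "withheld": withhold(context),
        "corrupted": corrupt(context, rng),
    }
    return {name: mae(y_true, forecast_fn(ctx)) for name, ctx in variants.items()}


def toy_forecaster(context):
    """Stand-in model: a base load plus a fixed amount per planned train."""
    return [20.0 + 3.0 * c for c in context]


# Toy data, invented for illustration only.
timetable = [0.0, 2.0, 5.0, 3.0, 1.0, 4.0]
observed = [20.5, 25.5, 35.5, 28.5, 23.5, 31.5]

report = ablation_report(toy_forecaster, timetable, observed)
print(report["intact"], report["withheld"])  # 0.5 7.5
```

If the error under withheld or corrupted context is no worse than under intact context, the claimed benefit of the timetable would not be supported.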

Figures

Figures reproduced from arXiv: 2409.05884 by Giovanni Sansavini, Leandro von Krannichfeldt, Michael F. Howland, Olga Fink, Raffael Theiler.

Figure 1: Illustration of the proposed load forecasting framework with contextually enhanced trans…
Figure 2: Normalized Mean Absolute Error (NMAE) in normalized megawatts with and without…
Figure 3: Comparison of the robustness of contextually enhanced transformer models: Crossformer…
Figure 4: Model Performance Case Study: Swiss National Holiday (August 1, 2023). This detailed study focuses on the Swiss National Holiday event in the Railway dataset. For the individual contextually enhanced transformer Crossformer (CF), Spacetimeformer (STF) and Timeseries Transformer (TST) we show scatter plots relating forecasted values to ground truth for the entire test set. In the first row we show the mode…
Figure 5: Model Performance case study for the Building Energy dataset. We overlay the building energy profile with the day-ahead forecasts (48 time steps) of contextually enhanced transformer models (Crossformer (CF), Spacetimeformer (STF) and Timeseries Transformer (TST)). We plot forecasts with and without future contextual information (-FCI). To highlight the impact of different future context sources, we separat…
Figure 6: Comparison of the performance of contextually enhanced transformer models: Cross…
Figure 7: Comparison of the forecasting performance of contextually enhanced Crossformer (CF),…
Figure 8: Comparison of the forecasting performance of contextually enhanced Crossformer (CF),…
Figure 9: Comparison of the forecasting performance of contextually enhanced Crossformer (CF),…
Figure 10: Comparison of the forecasting performance of contextually enhanced Crossformer (CF),…
Figure 11: All additional random seeds for the Figure…
Figure 12: May 2023 (May 1st holiday). A typical load profile overlaid with the next day forecasts…
Figure 13: July 2023 (start of summer break). A typical load profile overlaid with the next day…
Figure 14: August 2023 (national holiday). A typical load profile overlaid with the next day fore…
Figure 15: Outlier counts by forecasting model plotted against the MAPE threshold for the full…
Figure 16: Outlier severity distributions by MAPE for the…
Figure 17: Outlier severity distributions by MAPE for the…
Figure 18: Data distribution shift of selected features during the COVID-19 period.
Original abstract

Accurate and reliable energy forecasting is essential for power grid operators who strive to minimize extreme forecasting errors that pose significant operational challenges and incur high intra-day trading costs. Incorporating planning information -- such as anticipated user behavior, scheduled events or timetables -- provides substantial contextual information to enhance forecast accuracy and reduce the occurrence of large forecasting errors. Existing approaches, however, lack the flexibility to effectively integrate both dynamic, forward-looking contextual inputs and historical data. In this work, we conceptualize forecasting as a combined forecasting-regression task, formulated as a sequence-to-sequence prediction problem, and introduce contextually-enhanced transformer models designed to leverage all contextual information effectively. We demonstrate the effectiveness of our approach through a primary case study on nationwide railway energy consumption forecasting, where integrating contextual information into transformer models, particularly timetable data, resulted in a significant average mean absolute error reduction of 26.6%. An auxiliary case study on building energy forecasting, leveraging planned office occupancy data, further illustrates the generalizability of our method, showing an average reduction of 56.3% in mean absolute error. Compared to other state-of-the-art methods, our approach consistently outperforms existing models, underscoring the value of context-aware deep learning techniques in energy forecasting applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper conceptualizes energy load forecasting as a sequence-to-sequence task and introduces contextually-enhanced transformer models that integrate historical time series with forward-looking contextual inputs such as timetables and planned occupancy schedules. It reports that this integration yields average MAE reductions of 26.6% on nationwide railway energy consumption forecasting and 56.3% on an auxiliary building energy forecasting case study, with consistent outperformance versus other state-of-the-art methods.

Significance. If the empirical gains hold under realistic conditions, the work would demonstrate a practical way to reduce large forecasting errors that drive intra-day trading costs for grid operators and building managers. The dual-domain case studies provide initial evidence of generalizability for context-aware deep learning in energy applications.

major comments (2)
  1. Abstract: the headline MAE reductions (26.6% and 56.3%) are stated without any description of experimental design, baseline implementations, data splits, cross-validation procedure, or statistical testing, preventing verification that the gains are not artifacts of particular hyperparameter choices or data selection.
  2. Abstract: the method feeds planned timetable and occupancy data directly into the seq2seq formulation as known future context. No robustness experiments are described that inject realistic noise, missing values, or schedule deviations into these contextual inputs, leaving the central performance claim dependent on an untested assumption of perfect future context availability.
minor comments (1)
  1. Abstract: the phrase 'contextually-enhanced transformer models' is used without a concise statement of the architectural modification (e.g., how context tokens are embedded or attended to).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation of results.

Point-by-point responses
  1. Referee: Abstract: the headline MAE reductions (26.6% and 56.3%) are stated without any description of experimental design, baseline implementations, data splits, cross-validation procedure, or statistical testing, preventing verification that the gains are not artifacts of particular hyperparameter choices or data selection.

    Authors: We agree that the abstract would benefit from additional context on the experimental methodology. The full manuscript provides these details in Section 4 (Datasets and Experimental Setup), including real-world data sources for railway and building cases, chronological train/test splits, baseline models (e.g., LSTM, vanilla Transformer, other SOTA), and evaluation via multiple runs for robustness. We will revise the abstract to briefly note the use of standard time-series splits, comparisons to established baselines, and multi-run evaluation.
    revision: yes

  2. Referee: Abstract: the method feeds planned timetable and occupancy data directly into the seq2seq formulation as known future context. No robustness experiments are described that inject realistic noise, missing values, or schedule deviations into these contextual inputs, leaving the central performance claim dependent on an untested assumption of perfect future context availability.

    Authors: The approach is designed around the realistic availability of planned contextual data (timetables and occupancy schedules) as known future inputs in operational settings. The manuscript does not currently include explicit robustness experiments with injected noise or deviations. We will add a new analysis subsection evaluating performance under simulated imperfections (e.g., missing values and schedule deviations) to quantify sensitivity and support the claims.
    revision: yes
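
The kinds of imperfection named in this exchange (noise, missing values, schedule deviations) could be simulated along the following lines. This is a sketch under stated assumptions, not the analysis the authors promise: the function names, the Gaussian noise model, and the fixed-delay model of a schedule deviation are all hypothetical choices.

```python
import random


def add_noise(context, sigma, rng):
    """Additive Gaussian noise, clipped at zero because counts cannot be negative."""
    return [max(0.0, c + rng.gauss(0.0, sigma)) for c in context]


def drop_values(context, missing_rate, rng, fill=None):
    """Mark a random fraction of entries as missing (None unless a fill is given)."""
    return [fill if rng.random() < missing_rate else c for c in context]


def shift_schedule(context, delay_steps):
    """Delay the whole plan by delay_steps, padding the start with the first value."""
    if delay_steps <= 0 or not context:
        return list(context)
    delay = min(delay_steps, len(context))
    return [context[0]] * delay + list(context[:len(context) - delay])


# Toy plan, invented for illustration only.
rng = random.Random(0)
plan = [1.0, 2.0, 3.0, 4.0]

noisy = add_noise(plan, sigma=0.5, rng=rng)
gappy = drop_values(plan, missing_rate=0.25, rng=rng)
delayed = shift_schedule(plan, delay_steps=1)
print(delayed)  # [1.0, 1.0, 2.0, 3.0]
```

Sweeping `sigma`, `missing_rate`, and `delay_steps` while tracking forecast error would give a sensitivity curve for each type of imperfection.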

Circularity Check

0 steps flagged

No circularity; empirical MAE gains from context integration are not reduced to fitted inputs by construction

full rationale

The paper frames its contribution as an empirical demonstration via two case studies (railway energy and building occupancy) comparing contextually-enhanced transformers against baselines, reporting average MAE reductions of 26.6% and 56.3%. No equations, derivations, or self-citations are presented that define a quantity in terms of itself or rename a fitted parameter as a prediction. The central claims rest on direct experimental comparisons rather than any load-bearing mathematical reduction or uniqueness theorem imported from prior author work. This is the expected non-finding for an applied ML forecasting paper whose value is measured by out-of-sample performance metrics.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Abstract provides no equations or implementation details, so the ledger is limited to high-level modeling assumptions required for the claimed gains.

free parameters (1)
  • transformer hyperparameters and training settings
    Standard deep-learning models contain many tunable parameters whose values are fitted on the training data and directly affect reported MAE reductions.
axioms (1)
  • domain assumption: Contextual planning data (timetables, occupancy) is accurate and available at forecast time
    The method's performance claims rest on the premise that such forward-looking inputs exist and can be fed into the model without additional error.

pith-pipeline@v0.9.0 · 5761 in / 1223 out tokens · 23512 ms · 2026-05-23T22:09:36.774425+00:00 · methodology

Reference graph

Works this paper leans on

101 extracted references · 101 canonical work pages

  1. [1]

    https://company.sbb.ch/de/sbb-als- geschaeftspartner/leistungen-evu/energie/bahn-haushaltsstrom.html

    Bahn- und Haushaltsstrom | SBB. https://company.sbb.ch/de/sbb-als- geschaeftspartner/leistungen-evu/energie/bahn-haushaltsstrom.html

  2. [2]

    https://ec.europa.eu/commission/presscorner/detail/en/ip 07 110

    Blackout of November 2006: Important lessons to be drawn. https://ec.europa.eu/commission/presscorner/detail/en/ip 07 110

  3. [3]

    https://www.iea.org/reports/unlocking-the-potential-of-distributed-energy- resources/executive-summary

    Executive summary – Unlocking the Potential of Distributed Energy Resources – Analysis. https://www.iea.org/reports/unlocking-the-potential-of-distributed-energy- resources/executive-summary

  4. [4]

    Understanding and Managing the Unknown: The Nature of Uncertainty in Plan- ning

    John Abbott. Understanding and Managing the Unknown: The Nature of Uncertainty in Plan- ning. Journal of Planning Education and Research, 24(3):237–251, March 2005. ISSN 0739- 456X. doi: 10.1177/0739456X04267710

  5. [5]

    TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting, March 2024

    Md Atik Ahamed and Qiang Cheng. TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting, March 2024

  6. [6]

    A survey comparing centralized and decentralized electricity markets

    Victor Ahlqvist, P ¨ar Holmberg, and Thomas Tanger ˚as. A survey comparing centralized and decentralized electricity markets. Energy Strategy Reviews, 40:100812, March 2022. ISSN 2211-467X. doi: 10.1016/j.esr.2022.100812. 1The timestamp information is not included due to the explicit time series modeling. Also note that in the building energy case study, ...

  7. [7]

    Load Forecasting Techniques for Power System: Research Challenges and Survey

    Naqash Ahmad, Yazeed Ghadi, Muhammad Adnan, and Mansoor Ali. Load Forecasting Techniques for Power System: Research Challenges and Survey. IEEE Access, 10:71054– 71090, 2022. ISSN 2169-3536. doi: 10.1109/ACCESS.2022.3187839. URL https: //ieeexplore.ieee.org/document/9812604

  8. [8]

    A review on renewable energy and electricity requirement forecasting models for smart grid and buildings

    Tanveer Ahmad, Hongcai Zhang, and Biao Yan. A review on renewable energy and electricity requirement forecasting models for smart grid and buildings. Sustainable Cities and Society, 55:102052, April 2020. ISSN 2210-6707. doi: 10.1016/j.scs.2020.102052

  9. [9]

    Nielsen, Aakash Tripathi, Shamoon Siddiqui, Ravi P

    Sabeen Ahmed, Ian E. Nielsen, Aakash Tripathi, Shamoon Siddiqui, Ravi P. Ramachandran, and Ghulam Rasool. Transformers in Time-Series Analysis: A Tutorial. Circuits, Systems, and Signal Processing, 42(12):7433–7466, December 2023. ISSN 1531-5878. doi: 10.1007/ s00034-023-02454-8

  10. [10]

    Day- ahead industrial load forecasting for electric RTG cranes

    Feras ALASALI, Stephen HABEN, Victor BECERRA, and William HOLDERBAUM. Day- ahead industrial load forecasting for electric RTG cranes. Journal of Modern Power Sys- tems and Clean Energy , 6(2):223–234, March 2018. ISSN 2196-5420. doi: 10.1007/ s40565-018-0394-4

  11. [11]

    Smart-Meter Big Data for Load Forecasting: An Alternative Approach to Clustering

    Negin Alemazkoor, Mazdak Tootkaboni, Roshanak Nateghi, and Arghavan Louhghalam. Smart-Meter Big Data for Load Forecasting: An Alternative Approach to Clustering. IEEE Access, 10:8377–8387, 2022. ISSN 2169-3536. doi: 10.1109/ACCESS.2022.3142680

  12. [12]

    Manar Amayri, Stephane Ploix, Hussain Kazmi, Quoc-Dung Ngo, and E. L. Abed E. L. Safadi. Estimating Occupancy from Measurements and Knowledge Using the Bayesian Network for Energy Management. Journal of Sensors, 2019:1–12, April 2019. ISSN 1687-725X, 1687-

  13. [13]

    URL https://www.hindawi.com/journals/ js/2019/7129872/

    doi: 10.1155/2019/7129872. URL https://www.hindawi.com/journals/ js/2019/7129872/

  14. [14]

    En- ergy Standard for Buildings Except Low-Rise Residential Buildings

    American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE). En- ergy Standard for Buildings Except Low-Rise Residential Buildings. Atlanta, GA, 2013 edition,

  15. [15]

    ANSI/ASHRAE/IES Standard 90.1-2013

  16. [16]

    Bottom-up forecasting: Applica- tions and limitations in load forecasting using smart-meter data

    Harsh Anand, Roshanak Nateghi, and Negin Alemazkoor. Bottom-up forecasting: Applica- tions and limitations in load forecasting using smart-meter data. Data-Centric Engineering, 4: e14, January 2023. ISSN 2632-6736. doi: 10.1017/dce.2023.10

  17. [17]

    Artificial intelligence techniques for enabling Big Data services in distribution networks: A review

    Sara Barja-Martinez, M `onica Arag ¨u´es-Pe˜nalba, ´Ingrid Munn ´e-Collado, Pau Lloret-Gallego, Eduard Bullich-Massagu ´e, and Roberto Villafafila-Robles. Artificial intelligence techniques for enabling Big Data services in distribution networks: A review. Renewable and Sustain- able Energy Reviews, 150:111459, October 2021. ISSN 1364-0321. doi: 10.1016/j...

  18. [18]

    Spatio- Temporal Wind Speed Forecasting using Graph Networks and Novel Transformer Architec- tures

    Lars Ødegaard Bentsen, Narada Dilp Warakagoda, Roy Stenbro, and Paal Engelstad. Spatio- Temporal Wind Speed Forecasting using Graph Networks and Novel Transformer Architec- tures. Applied Energy, 333:120565, March 2023. ISSN 03062619. doi: 10.1016/j.apenergy. 2022.120565

  19. [19]

    K. Berk, A. Hoffmann, and A. M ¨uller. Probabilistic forecasting of industrial electricity load with regime switching behavior. International Journal of Forecasting, 34(2):147–162, April

  20. [20]

    doi: 10.1016/j.ijforecast.2017.09.006

    ISSN 0169-2070. doi: 10.1016/j.ijforecast.2017.09.006

  21. [21]

    Regression Transformer enables concurrent sequence regres- sion and generation for molecular language modelling

    Jannis Born and Matteo Manica. Regression Transformer enables concurrent sequence regres- sion and generation for molecular language modelling. Nature Machine Intelligence , 5(4): 432–444, April 2023. ISSN 2522-5839. doi: 10.1038/s42256-023-00639-z

  22. [22]

    Prognosen Des Leistungsbedarfs Volatiler Energieversorgungsnetze Am Beispiel Elektrischer Bahnen

    Julius Bosch. Prognosen Des Leistungsbedarfs Volatiler Energieversorgungsnetze Am Beispiel Elektrischer Bahnen. Deutscher Industrieverlag, 2017

  23. [23]

    Short-term industrial reactive power forecasting

    Antonio Bracale, Guido Carpinelli, Pasquale De Falco, and Tao Hong. Short-term industrial reactive power forecasting. International Journal of Electrical Power & Energy Systems, 107: 177–185, May 2019. ISSN 0142-0615. doi: 10.1016/j.ijepes.2018.11.022. 20 Under review as a journal paper

  24. [24]

    Modernizing Distribution System Restoration to Achieve Grid Resiliency Against Extreme Weather Events: An Integrated Solution

    Chen Chen, Jianhui Wang, and Dan Ton. Modernizing Distribution System Restoration to Achieve Grid Resiliency Against Extreme Weather Events: An Integrated Solution. Proceed- ings of the IEEE, 105(7):1267–1288, July 2017. ISSN 1558-2256. doi: 10.1109/JPROC.2017. 2684780

  25. [25]

    An agent-based stochastic Occupancy Sim- ulator

    Yixing Chen, Tianzhen Hong, and Xuan Luo. An agent-based stochastic Occupancy Sim- ulator. Building Simulation , 11(1):37–49, February 2018. ISSN 1996-3599, 1996-8744. doi: 10.1007/s12273-017-0379-7. URL http://link.springer.com/10.1007/ s12273-017-0379-7

  26. [26]

    Exogenous Data for Load Forecasting: A Review:

    Ram ´on Christen, Luca Mazzola, Alexander Denzler, and Edy Portmann. Exogenous Data for Load Forecasting: A Review:. In Proceedings of the 12th International Joint Conference on Computational Intelligence, pp. 489–500, Budapest, Hungary, 2020. SCITEPRESS - Science and Technology Publications. ISBN 978-989-758-475-6. doi: 10.5220/0010213204890500

  27. [27]

    Mathur, Rajat Sen, and Rose Yu

    Abhimanyu Das, Weihao Kong, Andrew Leach, Shaan K. Mathur, Rajat Sen, and Rose Yu. Long-term Forecasting with TiDE: Time-series Dense Encoder. Transactions on Machine Learning Research, May 2023. ISSN 2835-8856

  28. [28]

    A decoder-only foundation model for time-series forecasting, February 2024

    Abhimanyu Das, Weihao Kong, Rajat Sen, and Yichen Zhou. A decoder-only foundation model for time-series forecasting, February 2024

  29. [29]

    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, May 2019

    Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, May 2019

  30. [30]

    How Decarbonization, Digitalization and Decentralization are changing key power infrastruc- tures

    Maria Luisa Di Silvestre, Salvatore Favuzza, Eleonora Riva Sanseverino, and Gaetano Zizzo. How Decarbonization, Digitalization and Decentralization are changing key power infrastruc- tures. Renewable and Sustainable Energy Reviews , 93:483–498, October 2018. ISSN 1364-

  31. [31]

    doi: 10.1016/j.rser.2018.05.068

  32. [32]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, June 2021

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, June 2021

  33. [33]

    Cocarascu, F

    Philipp Dufter, Martin Schmitt, and Hinrich Sch ¨utze. Position Information in Transformers: An Overview. Computational Linguistics, 48(3):733–763, September 2022. ISSN 0891-2017. doi: 10.1162/coli a 00445

  34. [35]

    Review of virtual power plant operations: Resource coordination and multidimensional interaction

    Hongchao Gao, Tai Jin, Cheng Feng, Chuyi Li, Qixin Chen, and Chongqing Kang. Review of virtual power plant operations: Resource coordination and multidimensional interaction. Applied Energy, 357:122284, March 2024. ISSN 0306-2619. doi: 10.1016/j.apenergy.2023. 122284

  35. [36]

    Long-Range Transformers for Dynamic Spatiotem- poral Forecasting, May 2022

    Jake Grigsby, Zhe Wang, and Yanjun Qi. Long-Range Transformers for Dynamic Spatiotem- poral Forecasting, May 2022

  36. [37]

    Freight Train Scheduling in Railway Sys- tems

    Rebecca Haehn, Erika ´Abrah´am, and Nils Nießen. Freight Train Scheduling in Railway Sys- tems. In Holger Hermanns (ed.), Measurement, Modelling and Evaluation of Computing Sys- tems, pp. 225–241, Cham, 2020. Springer International Publishing. ISBN 978-3-030-43024-5. doi: 10.1007/978-3-030-43024-5 14

  37. [38]

    Railway crew scheduling: Models, methods and applications

    Julia Heil, Kirsten Hoffmann, and Udo Buscher. Railway crew scheduling: Models, methods and applications. European Journal of Operational Research , 283(2):405–425, June 2020. ISSN 0377-2217. doi: 10.1016/j.ejor.2019.06.016

  38. [39]

    Transformation in the US distributed energy resource market | Wood Mackenzie

    Ben Hertz-Shargel. Transformation in the US distributed energy resource market | Wood Mackenzie. https://www.woodmac.com/news/opinion/transformation-distributed- energy-resource-market/, June 2023. 21 Under review as a journal paper

  39. [40]

    Probabilistic electric load forecasting: A tutorial review

    Tao Hong and Shu Fan. Probabilistic electric load forecasting: A tutorial review. Inter- national Journal of Forecasting , 32(3):914–938, July 2016. ISSN 0169-2070. doi: 10. 1016/j.ijforecast.2015.11.011. URL https://www.sciencedirect.com/science/ article/pii/S0169207015001508

  40. [41]

    En- ergy Forecasting: A Review and Outlook

    Tao Hong, Pierre Pinson, Yi Wang, Rafał Weron, Dazhi Yang, and Hamidreza Zareipour. En- ergy Forecasting: A Review and Outlook. IEEE Open Access Journal of Power and Energy , 7:376–388, 2020. ISSN 2687-7910. doi: 10.1109/OAJPE.2020.3029979

  41. [42]

    Samiul Haque Sunny

    Eklas Hossain, Imtiaj Khan, Fuad Un-Noor, Sarder Shazali Sikander, and Md. Samiul Haque Sunny. Application of Big Data and Machine Learning in Smart Grid, and Associated Security Concerns: A Review. IEEE Access, 7:13960–13988, 2019. ISSN 2169-3536. doi: 10.1109/ ACCESS.2019.2894819

  42. [43]

    Buildings - Energy System, 11 2023

    International Energy Association. Buildings - Energy System, 11 2023. URL https:// www.iea.org/energy-system/buildings

  43. [44]

    Chemformer: A pre- trained transformer for computational chemistry

    Ross Irwin, Spyridon Dimitriadis, Jiazhen He, and Esben Jannik Bjerrum. Chemformer: A pre- trained transformer for computational chemistry. Machine Learning: Science and Technology, 3(1):015022, January 2022. ISSN 2632-2153. doi: 10.1088/2632-2153/ac3ffb

  44. [45]

    Distributed intelligence: A critical piece of the microgrid puzzle

    Dan Jacobson and Larry Dickerman. Distributed intelligence: A critical piece of the microgrid puzzle. The Electricity Journal, 32(5):10–13, June 2019. ISSN 1040-6190. doi: 10.1016/j.tej. 2019.05.001

  45. [46]

    Energy Efficient Integration of Renewable Energy Sources in the Smart Grid for Demand Side Management

    Nadeem Javaid, Ghulam Hafeez, Sohail Iqbal, Nabil Alrajeh, Mohamad Souheil Alabed, and Mohsen Guizani. Energy Efficient Integration of Renewable Energy Sources in the Smart Grid for Demand Side Management. IEEE Access, 6:77077–77096, 2018. ISSN 2169-3536. doi: 10.1109/ACCESS.2018.2866461

  46. [47]

    End-to-end Symbolic Regression with Transformers

    Pierre-alexandre Kamienny, St ´ephane d’Ascoli, Guillaume Lample, and Francois Charton. End-to-end Symbolic Regression with Transformers. Advances in Neural Information Pro- cessing Systems, 35:10269–10281, December 2022

  47. [48]

    Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei

    Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling Laws for Neural Language Models, January 2020

  48. [49]

    Large Scale Production Planning in the Stainless Steel Industry

    Janne Karelahti, Pekka Vainiom¨aki, and Tapio Westerlund. Large Scale Production Planning in the Stainless Steel Industry. Industrial & Engineering Chemistry Research, 50(9):4893–4906, May 2011. ISSN 0888-5885. doi: 10.1021/ie101376b

  49. [50]

    Ten questions concerning data-driven mod- elling and forecasting of operational energy demand at building and urban scale

    Hussain Kazmi, Chun Fu, and Clayton Miller. Ten questions concerning data-driven mod- elling and forecasting of operational energy demand at building and urban scale. Building and Environment, 239:110407, July 2023. ISSN 0360-1323. doi: 10.1016/j.buildenv.2023.110407

  50. [51]

    Electrical load forecasting models: A critical systematic review

    Corentin Kuster, Yacine Rezgui, and Monjur Mourshed. Electrical load forecasting models: A critical systematic review. Sustainable Cities and Society, 35:257–270, November 2017. ISSN 2210-6707. doi: 10.1016/j.scs.2017.08.009

  51. [52]

    Masset, R

    Stephanie Lenhart and Kathleen Ara ´ujo. Microgrid decision-making by public power utilities in the United States: A critical assessment of adoption and technological profiles. Renewable and Sustainable Energy Reviews, 139:110692, April 2021. ISSN 1364-0321. doi: 10.1016/j. rser.2020.110692

  52. [53]

    OpenStudio-Occupant-Variability-Gem v1.0, 2020

    Han Li, Xuan Luo, and Tianzhen Hong. OpenStudio-Occupant-Variability-Gem v1.0, 2020. URL https://www.osti.gov/doecode/biblio/38618

  53. [54]

    A synthetic building operation dataset

    Han Li, Zhe Wang, and Tianzhen Hong. A synthetic building operation dataset. Scientific Data, 8(1):213, August 2021. ISSN 2052-4463. doi: 10.1038/s41597-021-00989-6. URL https://www.nature.com/articles/s41597-021-00989-6

  54. [55]

    Arik, Nicolas Loeff, and Tomas Pfister

    Bryan Lim, Sercan O. Arik, Nicolas Loeff, and Tomas Pfister. Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting, September 2020. 22 Under review as a journal paper

  55. [56]

    Arık, Nicolas Loeff, and Tomas Pfister

    Bryan Lim, Sercan ¨O. Arık, Nicolas Loeff, and Tomas Pfister. Temporal Fusion Transformers for interpretable multi-horizon time series forecasting. International Journal of Forecasting, 37(4):1748–1764, October 2021. ISSN 0169-2070. doi: 10.1016/j.ijforecast.2021.03.012

  56. [57]

    iTransformer: Inverted Transformers Are Effective for Time Series Forecasting, October 2023

    Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, and Mingsheng Long. iTransformer: Inverted Transformers Are Effective for Time Series Forecasting, October 2023

  57. [58]

    L ´opez and P

    G. L ´opez and P. Arboleya. Short-term wind speed forecasting over complex terrain using linear regression models and multivariable LSTM and NARX networks in the Andes Mountains, Ecuador. Renewable Energy, 183:351–368, 2022. ISSN 0960-1481. doi: 10.1016/j.renene. 2021.10.070

  58. [59]

    US Distributed Energy Resource market to almost double by 2027 | Wood Mackenzie

    Wood Mackenzie. US Distributed Energy Resource market to almost double by 2027 | Wood Mackenzie. https://www.woodmac.com/press-releases/us-distributed-energy-resource-market-to-almost-double-by-2027/, June 2023

  59. [60]

    US microgrid market develops at rapid pace, with capacity reaching 10 GW in 2022 | Wood Mackenzie

    Wood Mackenzie. US microgrid market develops at rapid pace, with capacity reaching 10 GW in 2022 | Wood Mackenzie. https://www.woodmac.com/press-releases/us-microgrid-market-develops-at-rapid-pace–with-capacity-reaching-10-gw-in-2022/, February 2023

  60. [61]

    A Review of the Measures to Enhance Power Systems Resilience

    Maedeh Mahzarnia, Mohsen Parsa Moghaddam, Payam Teimourzadeh Baboli, and Pierluigi Siano. A Review of the Measures to Enhance Power Systems Resilience. IEEE Systems Journal, 14(3):4059–4070, September 2020. ISSN 1937-9234. doi: 10.1109/JSYST.2020.2965993

  61. [62]

    A fuzzy inference model for short-term load forecasting

    Rustum Mamlook, Omar Badran, and Emad Abdulhadi. A fuzzy inference model for short-term load forecasting. Energy Policy, 37(4):1239–1248, April 2009. ISSN 0301-4215. doi: 10.1016/j.enpol.2008.10.051

  62. [63]

    Comparison of machine learning methods for photovoltaic power forecasting based on numerical weather prediction

    Dávid Markovics and Martin János Mayer. Comparison of machine learning methods for photovoltaic power forecasting based on numerical weather prediction. Renewable and Sustainable Energy Reviews, 161:112364, June 2022. ISSN 1364-0321. doi: 10.1016/j.rser.2022.112364

  63. [64]

    Deep Learning in Energy Modeling: Application in Smart Buildings With Distributed Energy Generation

    Seyed Azad Nabavi, Naser Hossein Motlagh, Martha Arbayani Zaidan, Alireza Aslani, and Behnam Zakeri. Deep Learning in Energy Modeling: Application in Smart Buildings With Distributed Energy Generation. IEEE Access, 9:125439–125461, 2021. ISSN 2169-3536. doi: 10.1109/ACCESS.2021.3110960

  64. [65]

    Building-level occupancy data to improve ARIMA-based electricity use forecasts

    Guy R. Newsham and Benjamin J. Birt. Building-level occupancy data to improve ARIMA-based electricity use forecasts. In Proceedings of the 2nd ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Building, pp. 13–18, Zurich Switzerland, November 2010. ACM. ISBN 9781450304580. doi: 10.1145/1878431.1878435. URL https://dl.acm.org/doi/10.11...

  65. [66]

    A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

    Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. In The Eleventh International Conference on Learning Representations, September 2022

  66. [67]

    Prediction of aggregated power curtailment of smart grid demand response of a large number of building air-conditioners

    Chuzo Ninagawa, Shinji Kondo, and Junji Morikawa. Prediction of aggregated power curtailment of smart grid demand response of a large number of building air-conditioners. In 2016 International Conference on Industrial Informatics and Computer Systems (CIICS), pp. 1–4, March 2016. doi: 10.1109/ICCSII.2016.7462441

  67. [68]

    Renewables 2022: Analysis and Forecast to 2027

    OECD. Renewables 2022: Analysis and Forecast to 2027. Organisation for Economic Co-operation and Development, Paris, 2022

  68. [69]

    Transformers can optimally learn regression mixture models, November 2023

    Reese Pathak, Rajat Sen, Weihao Kong, and Abhimanyu Das. Transformers can optimally learn regression mixture models, November 2023

  69. [70]

    A comparative assessment of deep learning models for day-ahead load forecasting: Investigating key accuracy drivers

    Sotiris Pelekis, Ioannis-Konstantinos Seisopoulos, Evangelos Spiliotis, Theodosios Pountridis, Evangelos Karakolis, Spiros Mouzakitis, and Dimitris Askounis. A comparative assessment of deep learning models for day-ahead load forecasting: Investigating key accuracy drivers. Sustainable Energy, Grids and Networks, 36:101171, December 2023. ISSN 2352-4677. ...

  70. [71]

    Aggregated demand-side energy flexibility: A comprehensive review on characterization, forecasting and market prospects

    Freddy Plaum, Roya Ahmadiahangar, Argo Rosin, and Jako Kilter. Aggregated demand-side energy flexibility: A comprehensive review on characterization, forecasting and market prospects. Energy Reports, 8:9344–9362, November 2022. ISSN 2352-4847. doi: 10.1016/j.egyr.2022.07.038

  71. [72]

    CatBoost: Unbiased boosting with categorical features

    Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, and Andrey Gulin. CatBoost: Unbiased boosting with categorical features. In Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc., 2018

  72. [73]

    Short-term load forecasting based on CEEMDAN and Transformer

    Peng Ran, Kun Dong, Xu Liu, and Jing Wang. Short-term load forecasting based on CEEMDAN and Transformer. Electric Power Systems Research, 214:108885, January 2023. ISSN 0378-7796. doi: 10.1016/j.epsr.2022.108885

  73. [74]

    Domestic electricity use: A high-resolution energy demand model

    Ian Richardson, Murray Thomson, David Infield, and Conor Clifford. Domestic electricity use: A high-resolution energy demand model. Energy and Buildings, 42(10):1878–1887, October 2010. ISSN 0378-7788. doi: 10.1016/j.enbuild.2010.05.023

  75. [76]

    A data mining based load forecasting strategy for smart electrical grids

    Ahmed I. Saleh, Asmaa H. Rabie, and Khaled M. Abo-Al-Ez. A data mining based load forecasting strategy for smart electrical grids. Advanced Engineering Informatics, 30(3):422–448, August 2016. ISSN 1474-0346. doi: 10.1016/j.aei.2016.05.005

  76. [77]

    Photovoltaic power forecast based on satellite images considering effects of solar position

    Zhiyuan Si, Ming Yang, Yixiao Yu, and Tingting Ding. Photovoltaic power forecast based on satellite images considering effects of solar position. Applied Energy, 302:117514, November 2021. ISSN 0306-2619. doi: 10.1016/j.apenergy.2021.117514

  78. [79]

    The Performance of LSTM and BiLSTM in Forecasting Time Series

    Sima Siami-Namini, Neda Tavakoli, and Akbar Siami Namin. The Performance of LSTM and BiLSTM in Forecasting Time Series. In 2019 IEEE International Conference on Big Data (Big Data), pp. 3285–3292, Los Angeles, CA, USA, December 2019. IEEE. ISBN 978-1-72810-858-2. doi: 10.1109/BigData47090.2019.9005997

  79. [80]

    The future of forecasting for renewable energy

    Conor Sweeney, Ricardo J. Bessa, Jethro Browell, and Pierre Pinson. The future of forecasting for renewable energy. WIREs Energy and Environment, 9(2):e365, 2020. ISSN 2041-840X. doi: 10.1002/wene.365

  80. [81]

    Establishment of Enhanced Load Modeling by Correlating With Occupancy Information

    Yachen Tang, Shuaidong Zhao, Chee-Wooi Ten, Kuilin Zhang, and Thillainathan Logenthiran. Establishment of Enhanced Load Modeling by Correlating With Occupancy Information. IEEE Transactions on Smart Grid, 11(2):1702–1713, March 2020. ISSN 1949-3053, 1949-3061. doi: 10.1109/TSG.2019.2942581. URL https://ieeexplore.ieee.org/document/8844854/
