A Hybrid LSTM-Vision Transformer Architecture for Predicting HRRR Forecast Errors

Chris D. Thorncroft; David Aaron Evans; Jay C. Rothenberger; Kara J. Sulia; Nick P. Bassill

arXiv: 2606.19026 · v1 · pith:LFFAVRV3new · submitted 2026-06-17 · cs.LG · cs.AI · physics.ao-ph

Pith reviewed 2026-06-26 21:20 UTC · model grok-4.3

classification: cs.LG · cs.AI · physics.ao-ph

keywords: forecast error prediction · HRRR · LSTM · Vision Transformer · planetary boundary layer · precipitation forecast · mesonet profiler · hybrid architecture

The pith

A hybrid LSTM-Vision Transformer improves HRRR forecast error predictions by incorporating vertical atmospheric profiles from profilers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a hybrid LSTM-Vision Transformer that combines temporal learning from surface mesonet observations with vertical profiles to predict errors in HRRR forecasts of precipitation, 10 m wind speed, and 2 m temperature. Adding the profiler data raises skill over a pure LSTM baseline for all three variables, with the largest gains at shorter lead times and during enhanced PBL activity. The improvement reaches roughly twofold for precipitation error prediction and reduces degradation tied to convective processes. The work shows that vertically informed attention supplies a route to better error forecasts in high-resolution NWP.

Core claim

Incorporation of profiler-derived atmospheric structure improves forecast error prediction skill relative to the baseline LSTM architecture, with the largest gains occurring at shorter forecast lead times and during periods of enhanced PBL activity; for precipitation the LSTM-ViT framework achieves approximately a twofold increase in predictive skill while better capturing convectively driven error evolution.

What carries the argument

The hybrid LSTM-Vision Transformer that fuses temporal sequence learning from surface observations with vertically informed attention mechanisms applied to atmospheric profiles.

If this is right

Forecast error prediction skill increases most at short lead times when vertical structure is supplied.
Precipitation error forecasts show the largest relative gain and better track convective error sources.
Degradation during enhanced PBL activity is reduced across temperature, wind, and precipitation predictions.
The combined architecture supplies physically interpretable guidance on model bias for operational use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same profiler-augmented approach could be tested on other high-resolution NWP models that share similar PBL and convection error patterns.
Attention weights might be inspected post-training to identify which vertical levels most influence error predictions during convective events.
Extending the framework to additional surface variables or to regions with sparser profiler coverage would test whether the vertical information remains the dominant driver of gains.
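The attention-inspection idea above can be illustrated with a minimal sketch. It assumes a single query vector attending over one key vector per profiler level; the level heights, key values, and function names are hypothetical and are not taken from the paper. Attention weights are also only a rough indicator of influence, so a ranking like this would be a starting point for analysis rather than an explanation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over a list of keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

def rank_levels(weights, level_heights_m):
    """Return (height, weight) pairs sorted from most to least attended."""
    pairs = zip(level_heights_m, weights)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: four profiler levels with 2-dimensional key vectors.
level_heights_m = [100, 500, 1000, 2000]
keys = [[0.1, 0.0], [2.0, 1.0], [0.5, 0.5], [0.0, 0.1]]
query = [1.0, 1.0]

weights = attention_weights(query, keys)
ranking = rank_levels(weights, level_heights_m)
for height, weight in ranking:
    print(f"{height:>5} m  weight={weight:.3f}")
```

In a trained model the weights would come from the ViT encoder's attention layers, averaged over heads and over the convective cases of interest.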

Load-bearing premise

The observed skill gains result from the vertical attention mechanisms capturing PBL and convective processes rather than from added model capacity or dataset effects.

What would settle it

An experiment that matches total parameter count between the hybrid model and baseline LSTM but removes the profiler input, then measures whether the skill advantage disappears.
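As a rough illustration of how such a capacity-matched baseline could be sized, the sketch below counts LSTM parameters and searches for the hidden size that comes closest to a target budget. It assumes the textbook LSTM layout with four gates and one bias vector per gate (some libraries keep two bias vectors, which changes the count slightly); the function names and the example budget are hypothetical and do not come from the paper.

```python
def lstm_layer_params(input_size, hidden_size):
    """Trainable parameters in one LSTM layer.

    Four gates, each with input weights, recurrent weights, and one bias.
    """
    per_gate = hidden_size * (input_size + hidden_size) + hidden_size
    return 4 * per_gate

def stacked_lstm_params(input_size, hidden_size, num_layers):
    """Parameters in a stack of LSTM layers sharing one hidden size."""
    total = lstm_layer_params(input_size, hidden_size)
    # Layers after the first take the previous layer's hidden state as input.
    total += (num_layers - 1) * lstm_layer_params(hidden_size, hidden_size)
    return total

def match_hidden_size(target_params, input_size, num_layers, max_hidden=4096):
    """Hidden size whose stacked-LSTM parameter count is closest to the target."""
    return min(
        range(1, max_hidden + 1),
        key=lambda h: abs(stacked_lstm_params(input_size, h, num_layers) - target_params),
    )

# Hypothetical budget: suppose the hybrid model had one million parameters.
hybrid_budget = 1_000_000
matched = match_hidden_size(hybrid_budget, input_size=10, num_layers=2)
print(matched, stacked_lstm_params(10, matched, 2))
```

Training the widened LSTM on surface inputs only, and comparing it with the hybrid on the same test set, would indicate how much of the gain survives once capacity is held roughly equal.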

Figures

Figures reproduced from arXiv: 2606.19026 by Chris D. Thorncroft, David Aaron Evans, Jay C. Rothenberger, Kara J. Sulia, Nick P. Bassill.

**Figure 2.** This graphic illustrates the LSTM+ViT encoder–decoder workflow at a …

**Figure 3.** The structure of the Vision Transformer (ViT) encoder unit as imple…

**Figure 4.** Scatterplot of the precipitation error across the NYSM network and all …

**Figure 6.** Although the LSTM-ViT model more effectively captures both positive and …

**Figure 5.** Confusion matrix summarizing the precision of Hybrid predictions for …

**Figure 6.** New York State MAE overlaid by NCEI climate division (NCEI, 2015). …

**Figure 7.** From top to bottom, panels show aggregate RMSE in mmhr …

**Figure 8.** NYSM, MAE of LSTM-ViT precipitation-error predictions in mmhr …

**Figure 9.** Scatterplot of the wind error across the NYSM network and all forecast …

**Figure 10.** New York State MAE overlaid by NCEI climate division (NCEI, 2015). …

**Figure 11.** From top to bottom, panels show aggregate RMSE in m s …

**Figure 12.** NYSM, MAE of LSTM-ViT wind-error predictions in m s …

**Figure 13.** Scatterplot of the temperature error across the NYSM network and …

**Figure 14.** New York State MAE overlaid by NCEI climate division (NCEI, 2015). …

**Figure 15.** From top to bottom, panels show aggregate RMSE in …

**Figure 16.** NYSM, MAE of LSTM-ViT temperature-error predictions in …

Original abstract

Forecast errors in high-resolution numerical weather prediction (NWP) systems are often linked to unresolved planetary boundary layer (PBL) processes, convection, terrain-induced circulations, and other vertically structured atmospheric phenomena. Previous work demonstrated that Long Short-Term Memory (LSTM) networks can successfully predict forecast errors in the High-Resolution Rapid Refresh (HRRR) model using mesonet observations, but we believe performance degradation is linked to periods of complex vertical atmospheric evolution. To address this limitation, we develop a hybrid LSTM-Vision Transformer (LSTM-ViT) framework that combines temporal sequence learning from surface observations with atmospheric profiles from the New York State Mesonet profiler network. The LSTM-ViT framework is trained to predict HRRR hourly precipitation, 10 m wind speed, and 2 m temperature forecast errors at individual mesonet stations. Across all three predictors, incorporation of profiler-derived atmospheric structure improves forecast error prediction skill relative to the baseline LSTM architecture, with the largest gains occurring at shorter forecast lead times and during periods of enhanced PBL activity. Improvements are particularly pronounced for precipitation forecast error, where the LSTM-ViT framework achieves approximately a twofold increase in predictive skill relative to the baseline LSTM while better capturing convectively driven error evolution and reducing degradation associated with PBL processes. These results demonstrate that combining temporal sequence learning with vertically informed attention mechanisms provides a physically meaningful pathway for improving forecast error prediction in operational NWP systems. Our research offers forecasters enhanced guidance regarding model bias and forecast confidence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The hybrid LSTM-ViT beats the plain LSTM baseline on HRRR error prediction using vertical profiles, but the gains are not isolated from added capacity or input volume.

The paper puts forward a hybrid LSTM-ViT model that takes surface time series plus vertical profiler profiles from the New York State Mesonet to predict hourly HRRR forecast errors in precipitation, 10 m wind, and 2 m temperature. It reports that adding the profiler data lifts skill over a baseline LSTM, with the biggest lift on precipitation (roughly twofold) and during PBL-active periods.

What is actually new is the specific pairing of LSTM temporal modeling on surface observations with ViT attention over vertical profiles for this exact error-correction task. Prior LSTM work on HRRR errors is cited, so the extension is clear.

The work is grounded in real mesonet observations and targets a practical operational need in NWP post-processing. That focus is useful for forecasters who need bias guidance.

The soft spots are in the controls. The comparison is only to the baseline LSTM; there are no parameter counts, no capacity-matched LSTM variants, and no ablation that removes the ViT or the profiler inputs while keeping total capacity fixed. The link to PBL activity comes from post-hoc stratification rather than a pre-specified test. The abstract gives no error bars, significance tests, or train-test split details, so the quantitative claims rest on unverified numbers. Hyperparameter tuning on the evaluation dataset adds the usual circularity risk.

This paper is for groups already working on ML corrections inside operational weather models. A reader who needs a concrete architecture for vertical structure in time-series error prediction could extract the method. It is not a broad methodological advance.

I would send it to peer review once the authors add capacity-matched controls and basic statistical reporting; without those it is still worth referee time for the application domain.

Referee Report

3 major / 1 minor

Summary. The paper introduces a hybrid LSTM-Vision Transformer (LSTM-ViT) model that fuses temporal learning from surface mesonet observations with vertical atmospheric profiles from the New York State Mesonet profiler network to predict HRRR forecast errors in hourly precipitation, 10 m wind speed, and 2 m temperature. It reports that adding profiler-derived structure improves skill over a baseline LSTM across all three variables, with the largest gains at short lead times and during enhanced PBL activity; precipitation error prediction shows an approximately twofold skill increase while better capturing convective error evolution.

Significance. If the skill gains are shown to arise specifically from the vertically informed attention rather than capacity or data-volume effects, the work would offer a concrete, physically grounded route to reduce NWP error prediction degradation during complex PBL and convective regimes, with potential value for operational forecast guidance.

major comments (3)

[Results / experimental design] The central attribution of the reported twofold precipitation skill gain and reduced PBL degradation to the LSTM-ViT's vertically informed attention (abstract and results) rests on a comparison solely to an untuned baseline LSTM; no parameter counts, FLOPs, or capacity-matched controls (e.g., deeper LSTM or LSTM with duplicated surface inputs) are described, leaving open the possibility that gains reflect increased model expressivity or input richness rather than the ViT mechanism.
[Results] Post-hoc stratification on PBL-active periods (abstract) introduces selection dependence; without a pre-specified ablation that isolates the profiler profiles or ViT encoder on the full dataset, the mechanistic link between attention on vertical structure and the observed improvements cannot be isolated from dataset-specific effects.
[Abstract / Results] No error bars, statistical significance tests, or explicit train-test split details are provided for the quantitative claims (abstract), weakening the reliability of the reported skill increases.
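One way to attach an interval to a skill difference, as the last major comment requests, is a paired bootstrap over test samples. The sketch below is a minimal illustration and not the authors' procedure: the function names are hypothetical, MAE is an assumed choice of metric, and samples are resampled independently. Hourly forecast errors are usually autocorrelated, so a block bootstrap over days or events would likely be the more defensible variant.

```python
import random

def mean_abs_error(pred, truth):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

def paired_bootstrap_mae_diff(truth, pred_a, pred_b,
                              n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for MAE(pred_a) - MAE(pred_b).

    The same resampled indices are used for both models, so the
    comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(truth)
    diffs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        mae_a = sum(abs(pred_a[i] - truth[i]) for i in idx) / n
        mae_b = sum(abs(pred_b[i] - truth[i]) for i in idx) / n
        diffs.append(mae_a - mae_b)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical synthetic example: model B has smaller errors than model A.
data_rng = random.Random(1)
truth = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
pred_a = [t + data_rng.gauss(0.0, 1.0) for t in truth]
pred_b = [t + data_rng.gauss(0.0, 0.5) for t in truth]
lower, upper = paired_bootstrap_mae_diff(truth, pred_a, pred_b)
print(f"95% interval for MAE(A) - MAE(B): [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero would support a real difference between the two models on this test set, subject to the independence caveat above.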

minor comments (1)

[Methods] Hyperparameter tuning details and the exact definition of 'predictive skill' (e.g., which metric yields the twofold improvement) should be stated explicitly to allow reproduction.
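The ambiguity flagged here can be made concrete with a short sketch. It assumes a generic skill score of the form 1 - metric_model / metric_reference, which is a common convention but is not confirmed as the paper's definition; the RMSE values are invented for illustration only.

```python
import math

def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def skill_score(metric_model, metric_reference):
    """Generic skill score: 1 is perfect, 0 matches the reference, negative is worse."""
    return 1.0 - metric_model / metric_reference

# Invented numbers: a reference predictor, the baseline LSTM, and the hybrid.
rmse_reference = 2.0
rmse_lstm = 1.6
rmse_hybrid = 1.2

ss_lstm = skill_score(rmse_lstm, rmse_reference)
ss_hybrid = skill_score(rmse_hybrid, rmse_reference)
error_reduction = 1.0 - rmse_hybrid / rmse_lstm

print(f"skill score ratio: {ss_hybrid / ss_lstm:.2f}")
print(f"RMSE reduction vs LSTM: {error_reduction:.0%}")
```

With these made-up values the skill score doubles while RMSE falls by only a quarter, which is why the metric and the reference behind "twofold" need to be stated.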

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of experimental design and statistical rigor. We address each major comment below and will incorporate revisions to strengthen the manuscript.

Referee: [Results / experimental design] The central attribution of the reported twofold precipitation skill gain and reduced PBL degradation to the LSTM-ViT's vertically informed attention (abstract and results) rests on a comparison solely to an untuned baseline LSTM; no parameter counts, FLOPs, or capacity-matched controls (e.g., deeper LSTM or LSTM with duplicated surface inputs) are described, leaving open the possibility that gains reflect increased model expressivity or input richness rather than the ViT mechanism.

Authors: We agree that the current baseline comparison does not fully isolate the contribution of the vertically informed attention mechanism from potential effects of model capacity or input richness. In the revised manuscript we will report parameter counts and FLOPs for the LSTM-ViT and baseline LSTM, and we will add a capacity-matched control experiment (e.g., a deeper LSTM or an LSTM receiving duplicated surface inputs). These additions will allow a clearer attribution of skill gains to the ViT component.

Revision: yes

Referee: [Results] Post-hoc stratification on PBL-active periods (abstract) introduces selection dependence; without a pre-specified ablation that isolates the profiler profiles or ViT encoder on the full dataset, the mechanistic link between attention on vertical structure and the observed improvements cannot be isolated from dataset-specific effects.

Authors: We acknowledge that the PBL-active stratification was performed post-hoc. To address this, the revised manuscript will include a pre-specified ablation study performed on the full dataset that isolates the contribution of the profiler profiles and the ViT encoder. This will provide a more rigorous test of the mechanistic role of vertical structure.

Revision: yes

Referee: [Abstract / Results] No error bars, statistical significance tests, or explicit train-test split details are provided for the quantitative claims (abstract), weakening the reliability of the reported skill increases.

Authors: We agree that the absence of error bars, significance testing, and explicit train-test split information limits the strength of the quantitative claims. In the revision we will add error bars to all reported metrics, conduct appropriate statistical significance tests, and provide full details of the train-test split procedure.

Revision: yes
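For the train-test split details promised in this response, a chronological split with a buffer is one commonly used pattern for time series. The sketch below is an assumption about what such a procedure could look like, not a description of what the authors did; the function name, dates, and gap length are hypothetical.

```python
from datetime import datetime, timedelta

def chronological_split(timestamps, test_start, gap=timedelta(0)):
    """Split sample indices by time.

    Samples earlier than (test_start - gap) go to training and samples at or
    after test_start go to testing. Samples inside the gap are dropped so that
    input windows near the boundary do not overlap the test period.
    """
    train = [i for i, t in enumerate(timestamps) if t < test_start - gap]
    test = [i for i, t in enumerate(timestamps) if t >= test_start]
    return train, test

# Hypothetical hourly samples over ten hours.
start = datetime(2024, 1, 1, 0)
timestamps = [start + timedelta(hours=h) for h in range(10)]

train_idx, test_idx = chronological_split(
    timestamps,
    test_start=datetime(2024, 1, 1, 6),
    gap=timedelta(hours=2),
)
print(train_idx, test_idx)
```

Reporting the boundary dates and the gap alongside the metrics would let a reader judge how independent the test period is from training and tuning.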

Circularity Check

0 steps flagged

No significant circularity; empirical ML training and held-out evaluation

The paper trains LSTM and LSTM-ViT models on mesonet surface and profiler data to predict HRRR forecast errors for precipitation, wind, and temperature, then reports skill metrics on held-out data. The central results (skill gains, especially for precipitation at short leads and PBL-active periods) are obtained via standard supervised training and test-set evaluation rather than any derivation that reduces by construction to fitted parameters or self-citations. No equations, uniqueness theorems, or ansatzes are invoked that collapse the claimed improvements to the inputs; the comparison to baseline LSTM is an external empirical benchmark. Hyperparameter tuning on the dataset is standard practice and does not constitute circularity under the defined patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim depends on standard supervised learning assumptions plus one domain assumption about profiler data; no new physical entities are postulated and the number of explicit free parameters beyond routine hyperparameters is low.

free parameters (1)

LSTM and ViT architecture hyperparameters
Model depth, attention heads, hidden sizes, and training hyperparameters are selected and optimized on the training data to produce the reported skill gains.

axioms (1)

Domain assumption: Profiler vertical profiles from the New York State Mesonet accurately capture the atmospheric structure that drives HRRR forecast errors at surface stations.
Invoked when attributing skill gains to incorporation of profiler-derived structure and when linking gains to PBL activity.

Reference graph

Works this paper leans on

29 extracted references · 21 canonical work pages

[1] Bader, D., and R. Horton, 2023: New York State climate change projections methodology report. Technical report, New York State Climate Impacts Assessment, Columbia University, Lamont-Doherty Earth Observatory, Columbia Climate School. Prepared for the New York State Climate Impacts Assessment.

[2] Bishop, C. M., and H. Bishop, 2023: Deep Learning: Foundations and Concepts. 1st ed., Springer Cham, 649 pp., doi:10.1007/978-3-031-45468-4, 200 b/w illustrations, 400 illustrations in colour.

[3] Blaylock, B. K., J. D. Horel, and S. T. Liston, 2017: Cloud archiving and data mining of High-Resolution Rapid Refresh forecast model output. Computers & Geosciences, 109, 43-50, doi:10.1016/j.cageo.2017.08.005.

[4] Brotzge, J. A., and Coauthors, 2020: A technical overview of the New York State Mesonet standard network. Journal of Atmospheric and Oceanic Technology, 37, 1827-1845, doi:10.1175/JTECH-D-19-0220.1.

[5] Campbell, L. S., and W. J. Steenburgh, 2017: The OWLeS IOP2b lake-effect snowstorm: Mechanisms contributing to the Tug Hill precipitation maximum. Monthly Weather Review, 145, 2461-2478, doi:10.1175/MWR-D-16-0460.1.

[6] Clare, M. C. A., M. Sonnewald, R. Lguensat, J. Deshayes, and V. Balaji, 2022: Explainable artificial intelligence for Bayesian neural networks: Toward trustworthy predictions of ocean dynamics. Journal of Advances in Modeling Earth Systems, 14, e2022MS003162, doi:10.1029/2022MS003162.

[7] Dosovitskiy, A., and Coauthors, 2021: An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR), https://openreview.net/forum?id=YicbFdNTTy.

[8] Dowell, D. C., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part I: Motivation and system description. Weather and Forecasting, 37, 1371-1395, doi:10.1175/WAF-D-21-0151.1.

[9] Evans, D. A., K. J. Sulia, N. P. Bassill, C. D. Thorncroft, J. C. Rothenberger, and L. C. Gaudet, 2025: Predicting forecast error for the HRRR using LSTM neural networks: A comparative study using New York and Oklahoma State Mesonets. https://arxiv.org/abs/2512.14898, 2512.14898.

[10] Gagne, D. J., A. McGovern, S. E. Haupt, R. A. Sobash, J. K. Williams, and M. Xue, 2017: Storm-based probabilistic hail forecasting with machine learning applied to convection-allowing ensembles. Weather and Forecasting, 32, 1819-1840, doi:10.1175/WAF-D-17-0010.1.

[11] Gaudet, L. C., K. J. Sulia, R. D. Torn, and N. P. Bassill, 2024: Verification of the Global Forecast System, North American Mesoscale Forecast System, and High-Resolution Rapid Refresh model near-surface forecasts by use of the New York State Mesonet. Weather and Forecasting, 39, 369-386, doi:10.1175/WAF-D-23-0094.1.

[12] Hochreiter, S., and J. Schmidhuber, 1997: Long short-term memory. Neural Computation, 9, 1735-1780, doi:10.1162/neco.1997.9.8.1735.

[13] James, E. P., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part II: Forecast performance. Weather and Forecasting, 37 (8), 1397-1417, doi:10.1175/waf-d-21-0130.1.

[14] Lam, R., and Coauthors, 2023: Learning skillful medium-range global weather forecasting. Science, 382 (6677), 1416-1421, doi:10.1126/science.adi2336.

[15] Lang, S., and Coauthors, 2024: AIFS - ECMWF's data-driven forecasting system. arXiv, 2406.01465.

[16] Mahmood, R., and Coauthors, 2017: Mesonets: Mesoscale weather and climate observations for the United States. Bulletin of the American Meteorological Society, 98, 1349-1361, doi:10.1175/BAMS-D-15-00258.1.

[17] Mansfield, L., and H. Christensen, 2026: Epistemic and aleatoric uncertainty quantification in weather and climate models. Quarterly Journal of the Royal Meteorological Society, doi:10.1002/qj.70219.

[18] McGovern, A., K. L. Elmore, D. J. Gagne, S. E. Haupt, C. D. Karstens, R. Lagerquist, T. Smith, and J. K. Williams, 2017: Using artificial intelligence to improve real-time decision-making for high-impact weather. Bulletin of the American Meteorological Society, 98, 2073-2090, doi:10.1175/BAMS-D-16-0123.1.

[19] Molod, A., H. Salmun, and A. B. Marquardt Collow, 2019: Annual cycle of planetary boundary layer heights estimated from wind profiler network data. Journal of Geophysical Research: Atmospheres, 124 (12), 6207-6221, doi:10.1029/2018JD030102.

[20] National Centers for Environmental Prediction, 2024: High-Resolution Rapid Refresh (HRRR) model. https://rapidrefresh.noaa.gov/hrrr/, accessed: 1 Apr. 2025.

[21] NCEI, 2015: U.S. climate divisions. Accessed: 2023-08-03, https://www.ncei.noaa.gov/access/monitoring/dyk/us-climate-divisions.

[22] NOAA/NCEP MADIS, 2021: MADIS meteorological surface data providers. Accessed: 2025-12-09, https://madis.ncep.noaa.gov/mesonet_providers.shtml.

[23] Rasp, S., and Coauthors, 2024: WeatherBench 2: A benchmark for the next generation of data-driven global weather models. Journal of Advances in Modeling Earth Systems, 16 (6), e2023MS004019, doi:10.1029/2023MS004019.

[24] Shrestha, B., J. A. Brotzge, and J. Wang, 2022: Evaluation of the New York State Mesonet Profiler Network data. Atmospheric Measurement Techniques, 15, 6011-6033, doi:10.5194/amt-15-6011-2022.

[25] Shrestha, B., J. A. Brotzge, J. Wang, N. Bain, C. D. Thorncroft, E. Joseph, J. Freedman, and S. Perez, 2021: Overview and applications of the New York State Mesonet Profiler Network. Journal of Applied Meteorology and Climatology, 60, 1591-1611, doi:10.1175/JAMC-D-21-0104.1.

[26] Swain, M., J. C. Peña, R. Bornstein, and J. Gonzalez, 2025: Coastal and anthropogenic heat impacts on PBL processes during extreme summer thunderstorm precipitation in New York City. Urban Climate, 62, doi:10.1016/j.uclim.2025.102534.

[27] Tang, S., C. Li, P. Zhang, and R. Tang, 2023: SwinLSTM: Improving spatiotemporal prediction accuracy using Swin Transformer and LSTM. 13424-13433 pp., doi:10.1109/ICCV51070.2023.01239.

[28] Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, 2017: Attention is all you need. Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., Curran Associates, Inc., Vol. 30, https://proceedings.neurips.cc/paper_fil...

[29] Zhang, Y., 2025: Application of LSTM and Transformer hybrid model for electricity consumption forecasting. Journal of Energy Research and Reviews, 17 (6), 71-87, doi:10.9734/jenrr/2025/v17i6423.
