Understanding Key Features of Time Series Foundation Models from Epidemic Forecasting

Alireza Jafari; Aniruddha Adiga; Geoffrey C. Fox; Judy Fox; Madhav Marathe

arxiv: 2606.19560 · v1 · pith:PMOOIUUNnew · submitted 2026-06-17 · 💻 cs.LG

Pith reviewed 2026-06-26 20:54 UTC · model grok-4.3

classification 💻 cs.LG

keywords time series forecasting · epidemic modeling · influenza prediction · foundation models · mixture of experts · pretrained models · hospitalization data · spatial generalization

The pith

A mixture-of-experts model fusing multiple pretrained forecasters delivers the strongest performance on influenza epidemic time series.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates classical neural networks, numerical transformers, pretrained time series foundation models, and LLM-based methods on regional influenza-like illness and hospitalization data for one- to four-week-ahead forecasts under temporal and spatial generalization. It establishes that the mixture-of-experts fusion of several pretrained forecasters yields the best overall accuracy, showing that different pretraining sources supply complementary signals about epidemic spread. Pretraining yields its clearest benefits at longer horizons when the source domain aligns mechanistically with influenza dynamics, while LLM approaches lag behind numerical forecasters. Hospitalization data adds value as an auxiliary input or pretraining source in selected cases. These comparisons supply concrete guidance on architecture choice and data use for epidemic preparedness.

Core claim

Across influenza forecasting tasks, a mixture-of-experts model that fuses multiple pretrained forecasters achieves the strongest overall performance, indicating that heterogeneous pretrained representations provide complementary predictive information. Numerical transformer-based models produce reliable forecasts, while pretraining provides the largest gains at longer horizons, particularly when the pretraining domain is mechanistically aligned with influenza dynamics. LLM-based time series methods underperform relative to numerical forecasters. Examining hospitalization information as both an auxiliary covariate and a pretraining source clarifies when additional surveillance streams enhance the robustness of multi-horizon forecasting.

What carries the argument

mixture-of-experts model that fuses multiple pretrained forecasters to combine complementary representations from heterogeneous time series pretraining
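As a rough illustration of what such a fusion can look like, the sketch below combines per-expert 1- to 4-week-ahead forecasts with softmax gate weights. It is a minimal sketch, not the paper's architecture: the function names, the gate scores, and the example numbers are all hypothetical, and the paper's actual gating network is not described on this page.

```python
import math


def softmax(scores):
    """Turn unnormalized gate scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_forecasts(expert_forecasts, gate_scores):
    """Weighted sum of expert forecasts, one value per horizon.

    expert_forecasts: one list per expert, each with one value per horizon.
    gate_scores: one unnormalized score per expert (hypothetical gate output).
    """
    if len(expert_forecasts) != len(gate_scores):
        raise ValueError("need exactly one gate score per expert")
    weights = softmax(gate_scores)
    horizons = len(expert_forecasts[0])
    fused = [
        sum(w * forecast[h] for w, forecast in zip(weights, expert_forecasts))
        for h in range(horizons)
    ]
    return fused, weights


# Illustrative placeholder forecasts for weeks 1-4 from three experts.
experts = [
    [2.0, 2.2, 2.5, 2.9],
    [2.4, 2.6, 2.6, 2.7],
    [1.8, 2.0, 2.4, 3.1],
]
# Equal gate scores reduce the fusion to a plain average of the experts.
fused, weights = fuse_forecasts(experts, [0.0, 0.0, 0.0])
```

With equal gate scores the fusion is a simple mean; a learned gate would instead shift weight toward whichever expert it judges most useful for a given input.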

If this is right

Heterogeneous pretrained representations supply complementary predictive information that improves epidemic time series forecasts.
Pretraining gains are largest at longer horizons when the source domain aligns mechanistically with influenza dynamics.
Numerical transformer models remain reliable while LLM-based methods underperform on this class of structured count data.
Hospitalization signals improve robustness when used as auxiliary covariates or pretraining sources in selected multi-horizon settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Fusion strategies may extend to forecasting other seasonal respiratory diseases whose dynamics share mechanistic features with influenza.
Public-health model pipelines could shift toward ensembles of domain-aligned foundation models rather than single architectures.
The observed complementarity implies that future pretraining corpora should deliberately sample from multiple epidemic mechanisms to maximize transfer.

Load-bearing premise

The chosen influenza-like illness surveillance and hospitalization time series under the stated temporal and spatial generalization settings are representative enough to support general conclusions about model architectures for epidemic forecasting.

What would settle it

A single model or non-mixture architecture achieving consistently lower error than the mixture-of-experts across all tasks, horizons, and both data types on the same surveillance series would falsify the superiority of the fused approach.
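The falsification condition above can be written as a small check over an error table. This is a sketch under stated assumptions: the table layout, the model names, and every number in it are placeholders invented for illustration, not results from the paper.

```python
def settles_against_moe(errors, moe_name="moe"):
    """Return models whose error is strictly lower than the MoE's in every cell.

    errors: {model_name: {(data_type, horizon): error}}; lower is better.
    """
    moe_cells = errors[moe_name]
    winners = []
    for name, cells in errors.items():
        if name == moe_name:
            continue
        if all(cells[key] < moe_cells[key] for key in moe_cells):
            winners.append(name)
    return winners


# Placeholder error table (made-up values, two data types, two horizons).
errors = {
    "moe": {("ili", 1): 0.10, ("ili", 4): 0.20, ("hosp", 1): 0.12, ("hosp", 4): 0.25},
    "transformer": {("ili", 1): 0.11, ("ili", 4): 0.22, ("hosp", 1): 0.11, ("hosp", 4): 0.27},
    "llm": {("ili", 1): 0.15, ("ili", 4): 0.30, ("hosp", 1): 0.18, ("hosp", 4): 0.33},
}
challengers = settles_against_moe(errors)
```

In this toy table the transformer wins one cell but not all of them, so the list of challengers is empty and the fused approach would not be falsified by this criterion.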

Figures

Figures reproduced from arXiv: 2606.19560 by Alireza Jafari, Aniruddha Adiga, Geoffrey C. Fox, Judy Fox, Madhav Marathe.

**Figure 2.** Temporal evaluation on ILI over 1–4-week-ahead forecasting horizons. Normalized sum over regions, comparing model predictions with the observed values over the test data.

… despite claims of strong ILI performance in its paper [12]. In part, this seems tied to a mismatch in horizon design: TimeLLM emphasizes very long or non-standard horizons that are less aligned with CDC’s 1–4-week operational focus; when ev…

Original abstract

Seasonal influenza infects millions of people and causes substantial morbidity and mortality in the United States each year, making accurate short-term forecasting a core public-health need. Reliable forecasts of epidemic time series can inform vaccination timing, hospital staffing, and resource allocation, yet the comparative behavior of modern forecasting architectures on infectious-disease surveillance data remains insufficiently characterized. We address this gap through a systematic evaluation of regional influenza forecasting using influenza-like illness surveillance and influenza-associated hospitalization time series under both temporal and spatial generalization settings for 1-4-week-ahead prediction. We compare classical neural network architectures, numerical transformer-based models, pretrained time series foundation models, and LLM-based forecasting approaches. Across tasks, we demonstrate that a mixture-of-experts model that fuses multiple pretrained forecasters achieves the strongest overall performance, indicating that heterogeneous pretrained representations provide complementary predictive information. Our results further show that numerical transformer-based models produce reliable forecasts, while pretraining provides the largest gains at longer horizons, particularly when the pretraining domain is mechanistically aligned with influenza dynamics. In contrast, LLM-based time series methods underperform relative to numerical forecasters in this setting. Finally, we examine hospitalization information as both an auxiliary covariate and a pretraining source. Hospitalization signals provide complementary improvements in selected settings and clarify when additional surveillance streams enhance the robustness of multi-horizon forecasting. These findings provide actionable guidance on model selection, pretraining strategy, and auxiliary-signal use for influenza preparedness.
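The evaluation setup in the abstract (1- to 4-week-ahead targets under temporal and spatial generalization) can be sketched as follows. This is an illustrative sketch only: the context length of 8 weeks, the region names, and the split points are assumptions, since the paper's actual preprocessing is not given on this page.

```python
def make_windows(series, context=8, max_horizon=4):
    """Build (context_window, targets) pairs; targets are the next 1..max_horizon values."""
    pairs = []
    for t in range(context, len(series) - max_horizon + 1):
        pairs.append((series[t - context:t], series[t:t + max_horizon]))
    return pairs


def temporal_split(series_by_region, cutoff):
    """Temporal generalization: train on early weeks, test on later weeks of the same regions."""
    train = {region: values[:cutoff] for region, values in series_by_region.items()}
    test = {region: values[cutoff:] for region, values in series_by_region.items()}
    return train, test


def spatial_split(series_by_region, held_out):
    """Spatial generalization: train on some regions, test on regions never seen in training."""
    train = {r: v for r, v in series_by_region.items() if r not in held_out}
    test = {r: v for r, v in series_by_region.items() if r in held_out}
    return train, test


# Toy weekly series for three hypothetical regions (values are placeholders).
data = {
    "region_a": list(range(20)),
    "region_b": list(range(100, 120)),
    "region_c": list(range(200, 220)),
}
windows = make_windows(data["region_a"])
time_train, time_test = temporal_split(data, cutoff=15)
space_train, space_test = spatial_split(data, held_out={"region_c"})
```

The two splits test different things: the temporal split asks whether a model extrapolates to later seasons, while the spatial split asks whether it transfers to regions it was never trained on.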

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Solid empirical ranking of models on flu data but MoE complementarity claim needs ablations to stand.

The main thing to know is that this paper runs a systematic comparison of classical neural nets, numerical transformers, pretrained time series foundation models, and LLM-based methods on influenza-like illness and hospitalization time series, using both temporal and spatial generalization splits for 1–4-week-ahead forecasts. It reports that a mixture-of-experts model fusing multiple pretrained forecasters performs best overall, with pretraining gains largest at longer horizons when the pretraining domain aligns with epidemic dynamics, and LLM approaches lagging behind.

What the work does well is lay out a clear practical evaluation on real surveillance data and check auxiliary signals like hospitalization both as covariates and pretraining sources. The horizon-dependent pretraining effect and the relative ordering of approaches are legitimate new observations for this specific setting.

The soft spot is the central interpretation of the MoE result. The claim that it shows heterogeneous pretrained representations supply complementary information does not follow from the performance ordering alone. Without controls such as an MoE built from repeated copies of one model, a non-MoE ensemble of the same forecasters, or an ablation of the gating network, the gains could come from routing mechanics, ensemble size, or training details instead. The stress-test note is on target here.

Other concerns are minor: the evaluation assumes the chosen influenza series support broader conclusions about epidemic forecasting architectures, which is reasonable within the subfield but narrows the scope. Details on error bars and statistical tests are not visible in the abstract, though the overall empirical setup avoids circularity.

This paper is for researchers working on time series models for public health or domain-specific foundation model use. A reader needing model-selection guidance for short-horizon flu forecasting would find the rankings and pretraining observations useful.

It deserves peer review because the task is relevant and the experiments address a real gap, even if revisions would be needed to tighten the MoE claim.

Referee Report

1 major / 0 minor

Summary. The manuscript evaluates classical neural networks, numerical transformers, pretrained time series foundation models, and LLM-based forecasters on U.S. regional influenza-like illness and hospitalization time series for 1-4-week-ahead prediction under temporal and spatial generalization. It reports that a mixture-of-experts model fusing multiple pretrained forecasters attains the strongest overall performance and interprets this as evidence that heterogeneous pretrained representations supply complementary predictive information. Additional claims concern the benefits of pretraining (especially domain-aligned) at longer horizons, the relative weakness of LLM-based methods, and the value of hospitalization signals as covariates or pretraining sources.

Significance. If the performance ordering and the complementarity interpretation are substantiated by appropriate controls, the work would supply actionable model-selection guidance for epidemic forecasting and clarify when pretraining and auxiliary streams improve multi-horizon robustness. The absence of such controls currently limits the strength of the headline claim.

major comments (1)

[Abstract] The claim that the MoE fusing multiple pretrained forecasters 'indicates that heterogeneous pretrained representations provide complementary predictive information' is not isolated by the reported experiments. No ablations are described that would distinguish heterogeneity from (a) the MoE routing mechanism itself, (b) ensemble size, or (c) the training procedure; controls such as an MoE built from repeated copies of a single pretrained model or a non-MoE ensemble (simple averaging or stacking) of the same forecasters are required to support the interpretation.
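Two of the requested controls can be sketched in a few lines. This is an illustration under strong assumptions, not the referee's or the authors' protocol: the experts are treated as frozen, the forecasts and targets are placeholder numbers, and the helper names are hypothetical.

```python
def mae(pred, truth):
    """Mean absolute error across horizons."""
    return sum(abs(p - y) for p, y in zip(pred, truth)) / len(truth)


def mean_ensemble(forecasts):
    """Non-MoE control: unweighted average of the experts at each horizon."""
    n = len(forecasts)
    return [sum(f[h] for f in forecasts) / n for h in range(len(forecasts[0]))]


def convex_mix(forecasts, weights):
    """Gate-style combination with fixed non-negative weights that sum to 1."""
    return [
        sum(w * f[h] for w, f in zip(weights, forecasts))
        for h in range(len(forecasts[0]))
    ]


# Placeholder targets and expert forecasts for weeks 1-4.
truth = [2.0, 2.5, 3.0, 3.5]
expert_a = [2.2, 2.6, 2.8, 3.0]
expert_b = [1.8, 2.3, 3.1, 3.9]

controls = {
    "single_a": expert_a,
    "mean_ensemble": mean_ensemble([expert_a, expert_b]),
    # Repeated-copies control: three identical frozen experts under any convex gate.
    "repeated_copies": convex_mix([expert_a, expert_a, expert_a], [0.5, 0.3, 0.2]),
}
scores = {name: mae(pred, truth) for name, pred in controls.items()}
```

When the copies are frozen and identical, any convex gate returns the single expert's forecast, so this control only becomes informative if the copies are fine-tuned inside the MoE. The averaging control, by contrast, shows how much of a gain is available from combining heterogeneous forecasts without any routing at all.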

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address the major comment below.

Referee: [Abstract] The claim that the MoE fusing multiple pretrained forecasters 'indicates that heterogeneous pretrained representations provide complementary predictive information' is not isolated by the reported experiments. No ablations are described that would distinguish heterogeneity from (a) the MoE routing mechanism itself, (b) ensemble size, or (c) the training procedure; controls such as an MoE built from repeated copies of a single pretrained model or a non-MoE ensemble (simple averaging or stacking) of the same forecasters are required to support the interpretation.

Authors: We agree that the reported experiments do not include the specific ablations needed to isolate the contribution of heterogeneous pretrained representations from the MoE routing mechanism, ensemble size, or training procedure. While the MoE outperforms the individual pretrained models and other forecasters, this does not fully distinguish the sources of improvement. We will revise the abstract to qualify the interpretive claim and will add the suggested controls (MoE with repeated copies of one model and non-MoE ensembles such as averaging or stacking) in the revised manuscript.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical model comparisons rest on external benchmarks

The paper conducts a systematic empirical evaluation of forecasting architectures on influenza-like illness and hospitalization time series under temporal and spatial generalization. The headline result (MoE fusing pretrained forecasters) is presented as an observed performance ordering across tasks, with no equations, fitted parameters renamed as predictions, or derivations that reduce to inputs by construction. No self-citation chains, uniqueness theorems, or ansatzes are invoked to justify the central claims; all conclusions are tied to direct comparisons against held-out data. This is the standard non-circular case for an evaluation study.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no equations, derivations, or methodological sections from which to extract free parameters, axioms, or invented entities; ledger is therefore empty.

Reference graph

Works this paper leans on

55 extracted references · 29 canonical work pages

[1]

Estimating influenza disease burden from population-based surveillance data in the United States,

C. Reed, S. S. Chaves, P. Daily Kirley, R. Emerson, D. Aragon, E. B. Hancock, L. Butler, J. Baumbach, G. Hollick, N. M. Bennett et al., “Estimating influenza disease burden from population-based surveillance data in the United States,” PLOS ONE, vol. 10, no. 3, p. e0118369,
[2]

Available: https://doi.org/10.1371/journal.pone.0118369

[Online]. Available: https://doi.org/10.1371/journal.pone.0118369

work page doi:10.1371/journal.pone.0118369
[3]

A collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the United States,

N. G. Reich, L. C. Brooks, S. J. Fox, S. Kandula, C. J. McGowan, E. Moore, D. Osthus, E. L. Ray, A. Tushar, T. K. Yamana et al., “A collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the United States,” Proceedings of the National Academy of Sciences, vol. 116, no. 8, pp. 3146–3154, 2019. [Online]. Available: https://doi.org...

work page doi:10.1073/pnas.1812594116 2019
[4]

The united states covid-19 forecast hub dataset,

E. Y. Cramer, Y. Huang, Y. Wang, E. L. Ray, M. Cornell, J. Bracher, A. Brennen, A. J. Castro Rivadeneira, A. Gerding, K. House, D. Jayawardena, A. H. Kanji, A. Khandelwal, K. Le, J. Niemi, A. Stark, A. Shah, N. Wattanachit, M. W. Zorn, N. G. Reich, and US COVID-19 Forecast Hub Consortium, “The united states covid-19 forecast hub dataset,” Scientific Dat...

work page doi:10.1038/s41597-022-01517-w 2022
[5]

Deep learning foundation and pattern models: Challenges in hydrological time series,

J. He, Y.-J. Chen, A. Jafari, A. Idamekorala, and G. Fox, “Deep learning foundation and pattern models: Challenges in hydrological time series,” The International Journal of High Performance Computing Applications, vol. 40, no. 1, pp. 22–41, 2026. [Online]. Available: https://doi.org/10.1177/10943420251380008

work page doi:10.1177/10943420251380008 2026
[6]

Chronos: Learning the language of time series,

A. F. Ansari, L. Stella, C. Turkmen, X. Zhang, P. Mercado, H. Shen, O. Shchur, S. S. Rangapuram, S. P. Arango, S. Kapoor et al., “Chronos: Learning the language of time series,” 2024. [Online]. Available: https://arxiv.org/abs/2403.07815

Pith/arXiv arXiv 2024
[7]

GCNET: Graph-based prediction of stock price movement using graph convolutional network,

A. Jafari and S. Haratizadeh, “GCNET: Graph-based prediction of stock price movement using graph convolutional network,” Engineering Applications of Artificial Intelligence, vol. 116, p. 105452, 2022. [Online]. Available: https://doi.org/10.1016/j.engappai.2022.105452

work page doi:10.1016/j.engappai.2022.105452 2022
[8]

Time series foundation models and deep learning architectures for earthquake temporal and spatial nowcasting,

A. Jafari, G. Fox, J. B. Rundle, A. Donnellan, and L. G. Ludwig, “Time series foundation models and deep learning architectures for earthquake temporal and spatial nowcasting,” GeoHazards, vol. 5, no. 4, pp. 1247–1274, 2024. [Online]. Available: https://doi.org/10.3390/geohazards5040059

work page doi:10.3390/geohazards5040059 2024
[9]

NETpred: Network-based modeling and prediction of multiple connected market indices,

A. Jafari and S. Haratizadeh, “NETpred: Network-based modeling and prediction of multiple connected market indices,” 2022. [Online]. Available: https://arxiv.org/abs/2212.05916

arXiv 2022
[10]

COVID-Transformer: Interpretable COVID-19 detection using vision transformer for healthcare,

D. Shome, T. Kar, S. N. Mohanty, P. Tiwari, K. Muhammad, A. AlTameem, Y. Zhang, and A. K. J. Saudagar, “COVID-Transformer: Interpretable COVID-19 detection using vision transformer for healthcare,” International Journal of Environmental Research and Public Health, vol. 18, no. 21, p. 11086, 2021. [Online]. Available: https://doi.org/10.3390/ijerph182111086

work page doi:10.3390/ijerph182111086 2021
[11]

Interpreting county-level covid-19 infections using transformer and deep learning time series models,

M. K. Islam, Y. Liu, A. Erkelens, N. Daniello, A. Marathe, and J. Fox, “Interpreting county-level covid-19 infections using transformer and deep learning time series models,” in 2023 IEEE International Conference on Digital Health (ICDH). IEEE, 2023, pp. 266–277. [Online]. Available: https://doi.org/10.1109/ICDH60066.2023.00046

work page doi:10.1109/icdh60066.2023.00046 2023
[12]

TimesNet: Temporal 2D-variation modeling for general time series analysis,

H. Wu, T. Hu, Y. Liu, H. Zhou, J. Wang, and M. Long, “TimesNet: Temporal 2D-variation modeling for general time series analysis,” arXiv preprint arXiv:2210.02186, 2022. [Online]. Available: https://arxiv.org/abs/2210.02186

Pith/arXiv arXiv 2022
[13]

Time-LLM: Time series forecasting by reprogramming large language models,

M. Jin, S. Wang, L. Ma, Z. Chu, J. Y. Zhang, X. Shi, P.-Y. Chen, Y. Liang, Y.-F. Li, S. Pan, and Q. Wen, “Time-LLM: Time series forecasting by reprogramming large language models,” in International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=Unb5CVPtae

2024
[14]

Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016,

C. J. McGowan, M. Biggerstaff, M. Johansson, K. M. Apfeldorf, M. Ben-Nun, L. Brooks, M. Convertino, M. Erraguntla, D. C. Farrow, J. Freeze et al., “Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016,” Scientific Reports, vol. 9, no. 1, p. 683,

2015
[15]

Available: https://doi.org/10.1038/s41598-018-36361-9

[Online]. Available: https://doi.org/10.1038/s41598-018-36361-9

work page doi:10.1038/s41598-018-36361-9
[16]

Graph neural network for traffic forecasting: A survey,

W. Jiang and J. Luo, “Graph neural network for traffic forecasting: A survey,”Expert Systems with Applications, vol. 207, p. 117921, 2022. [Online]. Available: https://doi.org/10.1016/j.eswa.2022.117921

work page doi:10.1016/j.eswa.2022.117921 2022
[17]

The predictive skill of convolutional neural networks models for disease forecasting,

K. Lee, J. Ray, and C. Safta, “The predictive skill of convolutional neural networks models for disease forecasting,” PLOS ONE, vol. 16, no. 7, p. e0254319, 2021. [Online]. Available: https://doi.org/10.1371/journal.pone.0254319

work page doi:10.1371/journal.pone.0254319 2021
[18]

CausalGNN: Causal-based graph neural networks for spatio-temporal epidemic forecasting,

L. Wang, A. Adiga, J. Chen, A. Sadilek, S. Venkatramanan, and M. Marathe, “CausalGNN: Causal-based graph neural networks for spatio-temporal epidemic forecasting,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 11, pp. 12191–12199,
[19]

Available: https://doi.org/10.1609/aaai.v36i11.21479

[Online]. Available: https://doi.org/10.1609/aaai.v36i11.21479

work page doi:10.1609/aaai.v36i11.21479
[20]

Enhancing deep traffic forecasting models with dynamic regression,

V. Z. Zheng, S. Choi, and L. Sun, “Enhancing deep traffic forecasting models with dynamic regression,” 2023. [Online]. Available: https://arxiv.org/abs/2301.06650

arXiv 2023
[21]

A comparison of infectious disease forecasting methods across locations, diseases, and time,

S. Dixon, R. Keshavamurthy, D. H. Farber, A. Stevens, K. T. Pazdernik, and L. E. Charles, “A comparison of infectious disease forecasting methods across locations, diseases, and time,” Pathogens, vol. 11, no. 2, p. 185, 2022. [Online]. Available: https://doi.org/10.3390/pathogens11020185

2022
[22]

Applying infectious disease forecasting to public health: A path forward using influenza forecasting examples,

C. S. Lutz, M. P. Huynh, M. Schroeder, S. Anyatonwu, F. S. Dahlgren, G. Danyluk, D. Fernandez, S. K. Greene, N. Kipshidze, L. Liu et al., “Applying infectious disease forecasting to public health: A path forward using influenza forecasting examples,” BMC Public Health, vol. 19, p. 1659, 2019. [Online]. Available: https://doi.org/10.1186/s12889-019-7966-8

work page doi:10.1186/s12889-019-7966-8 2019
[23]

SEIR modeling of the COVID-19 and its dynamics,

S. He, Y. Peng, and K. Sun, “SEIR modeling of the COVID-19 and its dynamics,” Nonlinear Dynamics, vol. 101, pp. 1667–1680, 2020. [Online]. Available: https://doi.org/10.1007/s11071-020-05743-y

work page doi:10.1007/s11071-020-05743-y 2020
[24]

A simplicial epidemic model for COVID-19 spread analysis,

Y. Chen, Y. R. Gel, M. V. Marathe, and H. V. Poor, “A simplicial epidemic model for COVID-19 spread analysis,” Proceedings of the National Academy of Sciences, vol. 121, no. 1, p. e2313171120, 2024. [Online]. Available: https://doi.org/10.1073/pnas.2313171120

work page doi:10.1073/pnas.2313171120 2024
[25]

Informing university COVID-19 decisions using simple compartmental models,

B. Hurt, A. Adiga, M. Marathe, and C. L. Barrett, “Informing university COVID-19 decisions using simple compartmental models,” in 2021 Winter Simulation Conference (WSC), 2021, pp. 1–12. [Online]. Available: https://doi.org/10.1109/WSC52266.2021.9715467

work page doi:10.1109/wsc52266.2021.9715467 2021
[26]

Rational evaluation of various epidemic models based on the COVID-19 data of China,

W. Yang, D. Zhang, L. Peng, C. Zhuge, and L. Hong, “Rational evaluation of various epidemic models based on the COVID-19 data of China,” Epidemics, vol. 37, p. 100501, 2021. [Online]. Available: https://doi.org/10.1016/j.epidem.2021.100501

work page doi:10.1016/j.epidem.2021.100501 2021
[27]

An overview of forecast analysis with ARIMA models during the COVID-19 pandemic: Methodology and case study in Brazil,

R. Ospina, J. A. M. Gondim, V. Leiva, and C. Castro, “An overview of forecast analysis with ARIMA models during the COVID-19 pandemic: Methodology and case study in Brazil,” Mathematics, vol. 11, no. 14, p. 3069, 2023. [Online]. Available: https://doi.org/10.3390/math11143069

work page doi:10.3390/math11143069 2023
[28]

Prediction of global Omicron pandemic using ARIMA, MLR, and Prophet models,

D. Zhao, R. Zhang, H. Zhang, and S. He, “Prediction of global Omicron pandemic using ARIMA, MLR, and Prophet models,” Scientific Reports, vol. 12, p. 18138, 2022. [Online]. Available: https://doi.org/10.1038/s41598-022-23154-4

work page doi:10.1038/s41598-022-23154-4 2022
[29]

Phase-informed bayesian ensemble models improve performance of covid-19 forecasts,

A. Adiga, G. Kaur, L. Wang, B. Hurt, P. Porebski, S. Venkatramanan, B. Lewis, and M. V. Marathe, “Phase-informed bayesian ensemble models improve performance of covid-19 forecasts,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 13, pp. 15647–15653, 2024. [Online]. Available: https://doi.org/10.1609/aaai.v37i13.26855

work page doi:10.1609/aaai 2024
[30]

Cola-GNN: Cross-location attention based graph neural networks for long-term ILI prediction,

S. Deng, S. Wang, H. Rangwala, L. Wang, and Y. Ning, “Cola-GNN: Cross-location attention based graph neural networks for long-term ILI prediction,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 245–254. [Online]. Available: https://doi.org/10.1145/3340531.3411975

work page doi:10.1145/3340531.3411975 2020
[31]

EpiGNN: Exploring spatial transmission with graph neural network for regional epidemic forecasting,

F. Xie, Z. Zhang, L. Li, B. Zhou, and Y. Tan, “EpiGNN: Exploring spatial transmission with graph neural network for regional epidemic forecasting,” in Machine Learning and Knowledge Discovery in Databases: ECML PKDD 2022, ser. Lecture Notes in Computer Science, vol. 13718. Springer, 2023, pp. 469–485. [Online]. Available: https://doi.org/10.1007/978-3-031...

work page doi:10.1007/978-3-031-26422-1 2022
[32]

Epidemiology-aware deep learning for infectious disease dynamics prediction,

M. Liu, Y. Liu, and J. Liu, “Epidemiology-aware deep learning for infectious disease dynamics prediction,” in Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023, pp. 4084–4088. [Online]. Available: https://doi.org/10.1145/3583780.3615139

work page doi:10.1145/3583780.3615139 2023
[33]

RESEAT: Recurrent self-attention network for multi-regional influenza forecasting,

J. Moon, S. Jung, S. Park, and E. Hwang, “RESEAT: Recurrent self-attention network for multi-regional influenza forecasting,” IEEE Journal of Biomedical and Health Informatics, vol. 27, no. 5, pp. 2585–2596, 2023. [Online]. Available: https://doi.org/10.1109/JBHI.2023.3247687

work page doi:10.1109/jbhi 2023
[34]

Self-attention-based deep learning network for regional influenza forecasting,

S. Jung, J. Moon, S. Park, and E. Hwang, “Self-attention-based deep learning network for regional influenza forecasting,” IEEE Journal of Biomedical and Health Informatics, vol. 26, no. 2, pp. 922–933, 2022. [Online]. Available: https://doi.org/10.1109/JBHI.2021.3093897

work page doi:10.1109/jbhi.2021.3093897 2022
[35]

An empirical evaluation of generic convolutional and recurrent networks for sequence modeling,

S. Bai, J. Z. Kolter, and V. Koltun, “An empirical evaluation of generic convolutional and recurrent networks for sequence modeling,” 2018. [Online]. Available: https://arxiv.org/abs/1803.01271

Pith/arXiv arXiv 2018
[36]

Long-term forecasting with TiDE: Time-series dense encoder,

A. Das, W. Kong, A. Leach, S. Mathur, R. Sen, and R. Yu, “Long-term forecasting with TiDE: Time-series dense encoder,” Transactions on Machine Learning Research, 2023. [Online]. Available: https://openreview.net/forum?id=pCbC3aQB5W

2023
[37]

A time series is worth 64 words: Long-term forecasting with transformers,

Y. Nie, N. H. Nguyen, P. Sinthong, and J. Kalagnanam, “A time series is worth 64 words: Long-term forecasting with transformers,” in International Conference on Learning Representations, 2023. [Online]. Available: https://openreview.net/forum?id=Jbdc0vTOcol

2023
[38]

iTransformer: Inverted transformers are effective for time series forecasting,

Y. Liu, T. Hu, H. Zhang, H. Wu, S. Wang, L. Ma, and M. Long, “iTransformer: Inverted transformers are effective for time series forecasting,” in International Conference on Learning Representations,
[39]

Available: https://openreview.net/forum?id=JePfAI8fah

[Online]. Available: https://openreview.net/forum?id=JePfAI8fah
[40]

Temporal fusion transformers for interpretable multi-horizon time series forecasting,

B. Lim, S. O. Arik, N. Loeff, and T. Pfister, “Temporal fusion transformers for interpretable multi-horizon time series forecasting,” International Journal of Forecasting, vol. 37, no. 4, pp. 1748–1764,
[41]

Available: https://doi.org/10.1016/j.ijforecast.2021.03

[Online]. Available: https://doi.org/10.1016/j.ijforecast.2021.03.012

work page doi:10.1016/j.ijforecast.2021.03 2021
[42]

FluSight: Forecasts of flu hospital admissions,

Centers for Disease Control and Prevention, “FluSight: Forecasts of flu hospital admissions,” Online, 2023, accessed: 2026-06-

2023
[43]

Available: https://www.cdc.gov/flu-forecasting/data-vis/current-week.html

[Online]. Available: https://www.cdc.gov/flu-forecasting/data-vis/current-week.html
[44]

Monash time series forecasting archive,

R. Godahewa, C. Bergmeir, G. I. Webb, R. J. Hyndman, and P. Montero-Manso, “Monash time series forecasting archive,” in Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2021. [Online]. Available: https://openreview.net/forum?id=I01l7rc0jcb

2021
[45]

The M4 competition: 100,000 time series and 61 forecasting methods,

S. Makridakis, E. Spiliotis, and V. Assimakopoulos, “The M4 competition: 100,000 time series and 61 forecasting methods,” International Journal of Forecasting, vol. 36, no. 1, pp. 54–74, 2020. [Online]. Available: https://doi.org/10.1016/j.ijforecast.2019.04.014

work page doi:10.1016/j.ijforecast.2019.04.014 2020
[46]

Application of a normalized Nash–Sutcliffe efficiency to improve the accuracy of the Sobol’ sensitivity analysis of a hydrological model,

J. Nossent and W. Bauwens, “Application of a normalized Nash–Sutcliffe efficiency to improve the accuracy of the Sobol’ sensitivity analysis of a hydrological model,” in EGU General Assembly Conference Abstracts, vol. 14, 2012, p. 237. [Online]. Available: https://meetingorganizer.copernicus.org/EGU2012/EGU2012-237.pdf

2012
[47]

Position: Temporal measurement interval determines computational and model complexity in single-cell perturbation analysis,

A. Jafari, H. Shakeri, and H. Daneshmand, “Position: Temporal measurement interval determines computational and model complexity in single-cell perturbation analysis,” in Proceedings of the 43rd International Conference on Machine Learning, 2026, spotlight position paper. [Online]. Available: https://openreview.net/forum?id=lECKpTE1lW

2026
[48]

NeuralForecast: User-friendly state-of-the-art neural forecasting models,

K. G. Olivares, C. Challu, F. Garza, M. Mergenthaler Canseco, and A. Dubrawski, “NeuralForecast: User-friendly state-of-the-art neural forecasting models,” PyCon Salt Lake City, Utah, US, 2022. [Online]. Available: https://github.com/Nixtla/neuralforecast

2022
[49]

Statsmodels: Econometric and statistical modeling with Python,

S. Seabold and J. Perktold, “Statsmodels: Econometric and statistical modeling with Python,” in Proceedings of the 9th Python in Science Conference, Austin, TX, 2010, pp. 92–96. [Online]. Available: https://conference.scipy.org/proceedings/scipy2010/seabold.html

2010
[50]

Long short-term memory,

S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997. [Online]. Available: https://doi.org/10.1162/neco.1997.9.8.1735

work page doi:10.1162/neco.1997.9.8.1735 1997
[51]

G. E. P. Box, G. M. Jenkins, G. C. Reinsel, and G. M. Ljung, Time Series Analysis: Forecasting and Control, 5th ed. Hoboken, NJ: John Wiley & Sons, 2015. [Online]. Available: https://www.wiley.com/en-us/Time+Series+Analysis%3A+Forecasting+and+Control%2C+5th+Edition-p-9781118675021

2015
[52]

Attention is all you need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” inAdvances in Neural Information Processing Systems, vol. 30,
[53]

Available: https://proceedings.neurips.cc/paper files/ paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html

[Online]. Available: https://proceedings.neurips.cc/paper files/ paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html

2017
[54]

Informer: Beyond efficient transformer for long sequence time-series forecasting,

H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong, and W. Zhang, “Informer: Beyond efficient transformer for long sequence time-series forecasting,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 12, 2021, pp. 11 106–11 115. [Online]. Available: https://doi.org/10.1609/aaai.v35i12.17325

work page doi:10.1609/aaai.v35i12.17325 2021
[55]

TSMixer: An all-MLP architecture for time series forecasting,

S.-A. Chen, C.-L. Li, N. Yoder, S. O. Arik, and T. Pfister, “TSMixer: An all-MLP architecture for time series forecasting,” 2023. [Online]. Available: https://arxiv.org/abs/2303.06053 APPENDIX MODELS’ CONFIGURATIONS ANDHYPERPARAMETERS This appendix reports the implementation details and hyper- parameter settings used to reproduce the main forecasting expe...

arXiv 2023

[1] C. Reed, S. S. Chaves, P. Daily Kirley, R. Emerson, D. Aragon, E. B. Hancock, L. Butler, J. Baumbach, G. Hollick, N. M. Bennett et al., “Estimating influenza disease burden from population-based surveillance data in the United States,” PLOS ONE, vol. 10, no. 3, p. e0118369, 2015. [Online]. Available: https://doi.org/10.1371/journal.pone.0118369

[3] N. G. Reich, L. C. Brooks, S. J. Fox, S. Kandula, C. J. McGowan, E. Moore, D. Osthus, E. L. Ray, A. Tushar, T. K. Yamana et al., “A collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the United States,” Proceedings of the National Academy of Sciences, vol. 116, no. 8, pp. 3146–3154, 2019. [Online]. Available: https://doi.org... (doi:10.1073/pnas.1812594116)

[4] E. Y. Cramer, Y. Huang, Y. Wang, E. L. Ray, M. Cornell, J. Bracher, A. Brennen, A. J. Castro Rivadeneira, A. Gerding, K. House, D. Jayawardena, A. H. Kanji, A. Khandelwal, K. Le, J. Niemi, A. Stark, A. Shah, N. Wattanachit, M. W. Zorn, N. G. Reich, and US COVID-19 Forecast Hub Consortium, “The United States COVID-19 Forecast Hub dataset,” Scientific Dat... (2022, doi:10.1038/s41597-022-01517-w)

[5] J. He, Y.-J. Chen, A. Jafari, A. Idamekorala, and G. Fox, “Deep learning foundation and pattern models: Challenges in hydrological time series,” The International Journal of High Performance Computing Applications, vol. 40, no. 1, pp. 22–41, 2026. [Online]. Available: https://doi.org/10.1177/10943420251380008

[6] A. F. Ansari, L. Stella, C. Turkmen, X. Zhang, P. Mercado, H. Shen, O. Shchur, S. S. Rangapuram, S. P. Arango, S. Kapoor et al., “Chronos: Learning the language of time series,” 2024. [Online]. Available: https://arxiv.org/abs/2403.07815

[7] A. Jafari and S. Haratizadeh, “GCNET: Graph-based prediction of stock price movement using graph convolutional network,” Engineering Applications of Artificial Intelligence, vol. 116, p. 105452, 2022. [Online]. Available: https://doi.org/10.1016/j.engappai.2022.105452

[8] A. Jafari, G. Fox, J. B. Rundle, A. Donnellan, and L. G. Ludwig, “Time series foundation models and deep learning architectures for earthquake temporal and spatial nowcasting,” GeoHazards, vol. 5, no. 4, pp. 1247–1274, 2024. [Online]. Available: https://doi.org/10.3390/geohazards5040059

[9] A. Jafari and S. Haratizadeh, “NETpred: Network-based modeling and prediction of multiple connected market indices,” 2022. [Online]. Available: https://arxiv.org/abs/2212.05916

[10] D. Shome, T. Kar, S. N. Mohanty, P. Tiwari, K. Muhammad, A. AlTameem, Y. Zhang, and A. K. J. Saudagar, “COVID-Transformer: Interpretable COVID-19 detection using vision transformer for healthcare,” International Journal of Environmental Research and Public Health, vol. 18, no. 21, p. 11086, 2021. [Online]. Available: https://doi.org/10.3390/ijerph182111086

[11] M. K. Islam, Y. Liu, A. Erkelens, N. Daniello, A. Marathe, and J. Fox, “Interpreting county-level COVID-19 infections using transformer and deep learning time series models,” in 2023 IEEE International Conference on Digital Health (ICDH). IEEE, 2023, pp. 266–277. [Online]. Available: https://doi.org/10.1109/ICDH60066.2023.00046

[12] H. Wu, T. Hu, Y. Liu, H. Zhou, J. Wang, and M. Long, “TimesNet: Temporal 2D-variation modeling for general time series analysis,” arXiv preprint arXiv:2210.02186, 2022. [Online]. Available: https://arxiv.org/abs/2210.02186

[13] M. Jin, S. Wang, L. Ma, Z. Chu, J. Y. Zhang, X. Shi, P.-Y. Chen, Y. Liang, Y.-F. Li, S. Pan, and Q. Wen, “Time-LLM: Time series forecasting by reprogramming large language models,” in International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=Unb5CVPtae

[14] C. J. McGowan, M. Biggerstaff, M. Johansson, K. M. Apfeldorf, M. Ben-Nun, L. Brooks, M. Convertino, M. Erraguntla, D. C. Farrow, J. Freeze et al., “Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016,” Scientific Reports, vol. 9, no. 1, p. 683, 2019. [Online]. Available: https://doi.org/10.1038/s41598-018-36361-9

[16] W. Jiang and J. Luo, “Graph neural network for traffic forecasting: A survey,” Expert Systems with Applications, vol. 207, p. 117921, 2022. [Online]. Available: https://doi.org/10.1016/j.eswa.2022.117921

[17] K. Lee, J. Ray, and C. Safta, “The predictive skill of convolutional neural networks models for disease forecasting,” PLOS ONE, vol. 16, no. 7, p. e0254319, 2021. [Online]. Available: https://doi.org/10.1371/journal.pone.0254319

[18] L. Wang, A. Adiga, J. Chen, A. Sadilek, S. Venkatramanan, and M. Marathe, “CausalGNN: Causal-based graph neural networks for spatio-temporal epidemic forecasting,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 11, pp. 12 191–12 199, 2022. [Online]. Available: https://doi.org/10.1609/aaai.v36i11.21479

[20] V. Z. Zheng, S. Choi, and L. Sun, “Enhancing deep traffic forecasting models with dynamic regression,” 2023. [Online]. Available: https://arxiv.org/abs/2301.06650

[21] S. Dixon, R. Keshavamurthy, D. H. Farber, A. Stevens, K. T. Pazdernik, and L. E. Charles, “A comparison of infectious disease forecasting methods across locations, diseases, and time,” Pathogens, vol. 11, no. 2, p. 185, 2022. [Online]. Available: https://doi.org/10.3390/pathogens11020185

[22] C. S. Lutz, M. P. Huynh, M. Schroeder, S. Anyatonwu, F. S. Dahlgren, G. Danyluk, D. Fernandez, S. K. Greene, N. Kipshidze, L. Liu et al., “Applying infectious disease forecasting to public health: A path forward using influenza forecasting examples,” BMC Public Health, vol. 19, p. 1659, 2019. [Online]. Available: https://doi.org/10.1186/s12889-019-7966-8

[23] S. He, Y. Peng, and K. Sun, “SEIR modeling of the COVID-19 and its dynamics,” Nonlinear Dynamics, vol. 101, pp. 1667–1680, 2020. [Online]. Available: https://doi.org/10.1007/s11071-020-05743-y

[24] Y. Chen, Y. R. Gel, M. V. Marathe, and H. V. Poor, “A simplicial epidemic model for COVID-19 spread analysis,” Proceedings of the National Academy of Sciences, vol. 121, no. 1, p. e2313171120, 2024. [Online]. Available: https://doi.org/10.1073/pnas.2313171120

[25] B. Hurt, A. Adiga, M. Marathe, and C. L. Barrett, “Informing university COVID-19 decisions using simple compartmental models,” in 2021 Winter Simulation Conference (WSC), 2021, pp. 1–12. [Online]. Available: https://doi.org/10.1109/WSC52266.2021.9715467

[26] W. Yang, D. Zhang, L. Peng, C. Zhuge, and L. Hong, “Rational evaluation of various epidemic models based on the COVID-19 data of China,” Epidemics, vol. 37, p. 100501, 2021. [Online]. Available: https://doi.org/10.1016/j.epidem.2021.100501

[27] R. Ospina, J. A. M. Gondim, V. Leiva, and C. Castro, “An overview of forecast analysis with ARIMA models during the COVID-19 pandemic: Methodology and case study in Brazil,” Mathematics, vol. 11, no. 14, p. 3069, 2023. [Online]. Available: https://doi.org/10.3390/math11143069

[28] D. Zhao, R. Zhang, H. Zhang, and S. He, “Prediction of global Omicron pandemic using ARIMA, MLR, and Prophet models,” Scientific Reports, vol. 12, p. 18138, 2022. [Online]. Available: https://doi.org/10.1038/s41598-022-23154-4

[29] A. Adiga, G. Kaur, L. Wang, B. Hurt, P. Porebski, S. Venkatramanan, B. Lewis, and M. V. Marathe, “Phase-informed Bayesian ensemble models improve performance of COVID-19 forecasts,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 13, pp. 15 647–15 653, 2024. [Online]. Available: https://doi.org/10.1609/aaai.v37i13.26855

[30] S. Deng, S. Wang, H. Rangwala, L. Wang, and Y. Ning, “Cola-GNN: Cross-location attention based graph neural networks for long-term ILI prediction,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 245–254. [Online]. Available: https://doi.org/10.1145/3340531.3411975

[31] F. Xie, Z. Zhang, L. Li, B. Zhou, and Y. Tan, “EpiGNN: Exploring spatial transmission with graph neural network for regional epidemic forecasting,” in Machine Learning and Knowledge Discovery in Databases: ECML PKDD 2022, ser. Lecture Notes in Computer Science, vol. 13718. Springer, 2023, pp. 469–485. [Online]. Available: https://doi.org/10.1007/978-3-031...

[32] M. Liu, Y. Liu, and J. Liu, “Epidemiology-aware deep learning for infectious disease dynamics prediction,” in Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023, pp. 4084–4088. [Online]. Available: https://doi.org/10.1145/3583780.3615139

[33] J. Moon, S. Jung, S. Park, and E. Hwang, “RESEAT: Recurrent self-attention network for multi-regional influenza forecasting,” IEEE Journal of Biomedical and Health Informatics, vol. 27, no. 5, pp. 2585–2596, 2023. [Online]. Available: https://doi.org/10.1109/JBHI.2023.3247687

[34] S. Jung, J. Moon, S. Park, and E. Hwang, “Self-attention-based deep learning network for regional influenza forecasting,” IEEE Journal of Biomedical and Health Informatics, vol. 26, no. 2, pp. 922–933, 2022. [Online]. Available: https://doi.org/10.1109/JBHI.2021.3093897

[35] S. Bai, J. Z. Kolter, and V. Koltun, “An empirical evaluation of generic convolutional and recurrent networks for sequence modeling,” 2018. [Online]. Available: https://arxiv.org/abs/1803.01271

[36] A. Das, W. Kong, A. Leach, S. Mathur, R. Sen, and R. Yu, “Long-term forecasting with TiDE: Time-series dense encoder,” Transactions on Machine Learning Research, 2023. [Online]. Available: https://openreview.net/forum?id=pCbC3aQB5W

[37] Y. Nie, N. H. Nguyen, P. Sinthong, and J. Kalagnanam, “A time series is worth 64 words: Long-term forecasting with transformers,” in International Conference on Learning Representations, 2023. [Online]. Available: https://openreview.net/forum?id=Jbdc0vTOcol

[38] Y. Liu, T. Hu, H. Zhang, H. Wu, S. Wang, L. Ma, and M. Long, “iTransformer: Inverted transformers are effective for time series forecasting,” in International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=JePfAI8fah

[40] B. Lim, S. O. Arik, N. Loeff, and T. Pfister, “Temporal fusion transformers for interpretable multi-horizon time series forecasting,” International Journal of Forecasting, vol. 37, no. 4, pp. 1748–1764, 2021. [Online]. Available: https://doi.org/10.1016/j.ijforecast.2021.03.012

[42] Centers for Disease Control and Prevention, “FluSight: Forecasts of flu hospital admissions,” Online, 2023, accessed: 2026-06. [Online]. Available: https://www.cdc.gov/flu-forecasting/data-vis/current-week.html

[44] R. Godahewa, C. Bergmeir, G. I. Webb, R. J. Hyndman, and P. Montero-Manso, “Monash time series forecasting archive,” in Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2021. [Online]. Available: https://openreview.net/forum?id=I01l7rc0jcb

[45] S. Makridakis, E. Spiliotis, and V. Assimakopoulos, “The M4 competition: 100,000 time series and 61 forecasting methods,” International Journal of Forecasting, vol. 36, no. 1, pp. 54–74, 2020. [Online]. Available: https://doi.org/10.1016/j.ijforecast.2019.04.014

[46] J. Nossent and W. Bauwens, “Application of a normalized Nash–Sutcliffe efficiency to improve the accuracy of the Sobol’ sensitivity analysis of a hydrological model,” in EGU General Assembly Conference Abstracts, vol. 14, 2012, p. 237. [Online]. Available: https://meetingorganizer.copernicus.org/EGU2012/EGU2012-237.pdf

[47] A. Jafari, H. Shakeri, and H. Daneshmand, “Position: Temporal measurement interval determines computational and model complexity in single-cell perturbation analysis,” in Proceedings of the 43rd International Conference on Machine Learning, 2026, spotlight position paper. [Online]. Available: https://openreview.net/forum?id=lECKpTE1lW

[48] K. G. Olivares, C. Challu, F. Garza, M. Mergenthaler Canseco, and A. Dubrawski, “NeuralForecast: User-friendly state-of-the-art neural forecasting models,” PyCon Salt Lake City, Utah, US, 2022. [Online]. Available: https://github.com/Nixtla/neuralforecast

[49] S. Seabold and J. Perktold, “Statsmodels: Econometric and statistical modeling with Python,” in Proceedings of the 9th Python in Science Conference, Austin, TX, 2010, pp. 92–96. [Online]. Available: https://conference.scipy.org/proceedings/scipy2010/seabold.html

[50] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997. [Online]. Available: https://doi.org/10.1162/neco.1997.9.8.1735

[51] G. E. P. Box, G. M. Jenkins, G. C. Reinsel, and G. M. Ljung, Time Series Analysis: Forecasting and Control, 5th ed. Hoboken, NJ: John Wiley & Sons, 2015. [Online]. Available: https://www.wiley.com/en-us/Time+Series+Analysis%3A+Forecasting+and+Control%2C+5th+Edition-p-9781118675021

[52] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems, vol. 30, 2017. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html

[54] H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong, and W. Zhang, “Informer: Beyond efficient transformer for long sequence time-series forecasting,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 12, 2021, pp. 11 106–11 115. [Online]. Available: https://doi.org/10.1609/aaai.v35i12.17325

[55] S.-A. Chen, C.-L. Li, N. Yoder, S. O. Arik, and T. Pfister, “TSMixer: An all-MLP architecture for time series forecasting,” 2023. [Online]. Available: https://arxiv.org/abs/2303.06053

APPENDIX MODELS’ CONFIGURATIONS AND HYPERPARAMETERS

This appendix reports the implementation details and hyperparameter settings used to reproduce the main forecasting expe...