
arxiv: 2509.14933 · v3 · submitted 2025-09-18 · 💻 cs.LG

DAG: A Dual Correlation Network for Time Series Forecasting with Exogenous Variables

Pith reviewed 2026-05-18 15:26 UTC · model grok-4.3

keywords: time series forecasting, exogenous variables, correlation network, temporal correlation, channel correlation, forecasting with covariates

The pith

A dual correlation network discovers and injects temporal and channel links to use future exogenous variables in time series forecasts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DAG to fix two gaps in time series forecasting with exogenous variables: methods usually ignore known future covariates and miss how exogenous series relate to the target series. It builds separate modules for time and for variables, each discovering how past exogenous values shape both future exogenous values and past targets, then feeding those relationships forward when predicting future targets. The approach matters for applications like demand forecasting where promotions or weather are known ahead of time but current models treat them as simple add-ons. If the dual modules succeed, forecasts gain accuracy by explicitly routing correlation effects rather than hoping a generic network finds them. The design keeps discovery and injection as distinct steps so the benefit of correlation modeling can be measured separately from the base predictor.

Core claim

DAG uses a Temporal Correlation Module and a Channel Correlation Module. The Temporal Correlation Module finds how historical exogenous variables affect future exogenous variables and historical endogenous variables; the Channel Correlation Module does the same across variable channels. Each module contains a discovery submodule that identifies these effects and an injection submodule that inserts the discovered relationships into the forecasting step that produces future endogenous variables from historical endogenous variables plus future exogenous variables.
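The forecasting step described above takes three inputs: historical endogenous values, historical exogenous values, and future exogenous values. A minimal sketch of that input contract follows. The container name `TSFXSample`, its field names, and the single-target simplification are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

Series = List[float]        # one variable over time
Panel = List[List[float]]   # time-major layout: panel[t][c]


@dataclass
class TSFXSample:
    """Hypothetical container for one TSF-X example (not the paper's code)."""
    hist_endo: Series   # length L: past values of the target
    hist_exo: Panel     # L x C: past values of the covariates
    future_exo: Panel   # F x C: covariates already known over the horizon

    def __post_init__(self) -> None:
        # The historical target and covariates must cover the same L steps.
        if len(self.hist_endo) != len(self.hist_exo):
            raise ValueError("hist_endo and hist_exo must share length L")
        # Every time step, past or future, must carry the same C channels.
        widths = {len(row) for row in self.hist_exo + self.future_exo}
        if len(widths) > 1:
            raise ValueError("all rows must have the same channel count C")

    @property
    def lookback(self) -> int:
        return len(self.hist_endo)

    @property
    def horizon(self) -> int:
        return len(self.future_exo)

    @property
    def channels(self) -> int:
        return len(self.hist_exo[0]) if self.hist_exo else 0


sample = TSFXSample(
    hist_endo=[1.0, 2.0, 3.0],
    hist_exo=[[0.1, 5.0], [0.2, 6.0], [0.3, 7.0]],
    future_exo=[[0.4, 8.0], [0.5, 9.0]],
)
print(sample.lookback, sample.horizon, sample.channels)  # prints: 3 2 2
```

A model in this setting maps such a sample to F future target values. Methods that ignore `future_exo` use only the first two fields.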

What carries the argument

Dual Correlation Network built from Temporal Correlation Module and Channel Correlation Module, each containing a correlation discovery submodule and a correlation injection submodule.

If this is right

  • Forecasts improve when future exogenous variables are used through explicit correlation paths instead of being treated as independent inputs.
  • The model can capture how exogenous history influences both future exogenous values and past targets before predicting the target future.
  • Separating discovery from injection lets the benefit of correlation modeling be isolated and measured against standard forecasting backbones.
  • The same structure applies to any setting where some covariates are known ahead of the forecast horizon.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same discovery-plus-injection pattern could be tested on purely multivariate series to capture internal cross-variable dependencies without external covariates.
  • Longer forecast horizons would reveal whether the injected correlations remain stable or degrade as distance from the observed window increases.
  • Datasets with noisy or partially observed exogenous variables could test whether the discovery submodule still adds value or begins to inject harmful signals.
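The noisy or partially observed covariate test in the last bullet could be set up along the following lines. This is a sketch under stated assumptions: the function name `perturb_future_exo`, the Gaussian noise model, and the constant fill value for missing entries are all illustrative choices, not procedures from the paper.

```python
import random
from typing import List

Panel = List[List[float]]  # time-major layout: panel[t][c]


def perturb_future_exo(
    future_exo: Panel,
    noise_std: float,
    missing_rate: float,
    seed: int = 0,
    fill: float = 0.0,
) -> Panel:
    """Return a corrupted copy of the future covariates.

    Each entry is replaced by `fill` with probability `missing_rate`;
    otherwise zero-mean Gaussian noise with standard deviation
    `noise_std` is added. The input panel is left unchanged.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    corrupted: Panel = []
    for row in future_exo:
        new_row = []
        for value in row:
            if rng.random() < missing_rate:
                new_row.append(fill)
            else:
                new_row.append(value + rng.gauss(0.0, noise_std))
        corrupted.append(new_row)
    return corrupted


clean = [[0.4, 8.0], [0.5, 9.0]]
noisy = perturb_future_exo(clean, noise_std=0.1, missing_rate=0.25, seed=42)
```

Sweeping `noise_std` and `missing_rate` while holding the trained model fixed would show at what corruption level the injected correlations stop helping.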

Load-bearing premise

The relationships found by the discovery submodules are genuine predictive links that improve forecasts when future exogenous variables are supplied.

What would settle it

An ablation that disables the discovery and injection submodules on a dataset with known future exogenous variables and measures whether accuracy drops compared with a baseline that simply concatenates historical targets and future covariates.
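The comparison described above can be sketched as follows. The helper names `concat_baseline_features` and `ablation_report`, and the variant labels in the usage lines, are hypothetical; the MSE values are placeholders chosen only to exercise the code and are not results from the paper.

```python
from typing import Dict, List

Panel = List[List[float]]  # time-major layout: panel[t][c]


def concat_baseline_features(hist_endo: List[float], future_exo: Panel) -> List[float]:
    """Input of the naive baseline: past targets followed by the
    flattened future covariates, with no explicit correlation modeling."""
    flat_future = [value for row in future_exo for value in row]
    return list(hist_endo) + flat_future


def ablation_report(mse_by_variant: Dict[str, float], reference: str) -> Dict[str, float]:
    """Relative MSE change of every variant against `reference`.

    A positive value means the variant is worse than the reference.
    """
    ref = mse_by_variant[reference]
    if ref <= 0:
        raise ValueError("reference MSE must be positive")
    return {
        name: (value - ref) / ref
        for name, value in mse_by_variant.items()
        if name != reference
    }


features = concat_baseline_features([1.0, 2.0], [[3.0, 4.0], [5.0, 6.0]])

# Placeholder numbers, for illustration only.
report = ablation_report(
    {"full": 0.50, "no_discovery_injection": 0.55, "concat_baseline": 0.60},
    reference="full",
)
```

If the ablated variant and the concatenation baseline land close to the full model, the correlation submodules are not carrying the result.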

Figures

Figures reproduced from arXiv: 2509.14933 by Bin Yang, Jilin Hu, Xiangfei Qiu, Xingjian Wu, Yuhan Zhu, Zhengyu Li.

Figure 1: Time series forecasting algorithms can be classified ...
Figure 3: The architecture of DAG. (a) Overview of the DAG framework, which comprises Temporal Causal Modules ...
Figure 4: Parameter sensitivity studies of main hyper-parameters in DAG.
Figure 5: Forecasting performance (MSE) with varying look ...

Original abstract

Time series forecasting is essential in various domains. Compared to relying solely on endogenous variables (i.e., target variables), considering exogenous variables (i.e., covariates) provides additional predictive information and often leads to more accurate predictions. However, existing methods for time series forecasting with exogenous variables (TSF-X) have the following shortcomings: 1) they do not leverage future exogenous variables, 2) they fail to fully account for the correlation between endogenous and exogenous variables. In this study, to better leverage exogenous variables, especially future exogenous variables, we propose DAG, which utilizes Dual correlAtion network along both the temporal and channel dimensions for time series forecasting with exoGenous variables. Specifically, we propose two core components: the Temporal Correlation Module and the Channel Correlation Module. Both modules consist of a correlation discovery submodule and a correlation injection submodule. The former is designed to capture the correlation effects of historical exogenous variables on future exogenous variables and on historical endogenous variables, respectively. The latter injects the discovered correlation relationships into the processes of forecasting future endogenous variables based on historical endogenous variables and future exogenous variables.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The paper proposes DAG, a dual correlation network for time series forecasting with exogenous variables (TSF-X). It introduces a Temporal Correlation Module to capture effects of historical exogenous variables on future exogenous variables and a Channel Correlation Module to model correlations between historical exogenous and endogenous variables. Each module contains a correlation discovery submodule (to identify relationships) and a correlation injection submodule (to integrate them into forecasting future endogenous variables from historical endogenous and future exogenous inputs), addressing limitations in prior TSF-X methods that ignore future covariates or fail to fully exploit endo-exo correlations.

Significance. If the reported gains hold under the provided experimental conditions, the work offers a targeted improvement for TSF-X settings where future exogenous variables are observable, by explicitly separating and injecting temporal and channel-wise correlations rather than treating covariates as simple additional inputs. The inclusion of ablations that isolate the discovery and injection components provides direct evidence for their contribution, which is a strength for reproducibility and interpretability in this domain.

major comments (1)
  1. [§4.3] §4.3, ablation table: the reported improvement from adding the Channel Correlation Module is only 1.2% on average across datasets; this is modest relative to the added complexity and raises the question whether the channel-wise correlation discovery is load-bearing or largely redundant with the temporal module in practice.
minor comments (3)
  1. [§3.2] §3.2: the notation for the correlation matrices (e.g., C_t and C_c) is introduced without an explicit definition of their dimensions or how they are computed from the input tensors; adding a small diagram or equation would improve clarity.
  2. [Figure 2] Figure 2: the flowchart for the injection submodule uses the same arrow style for both discovery and injection paths, making it difficult to distinguish the flow of discovered correlations from the main forecasting path.
  3. [§5] §5: the discussion of limitations mentions only computational overhead but does not address sensitivity to noisy or missing future exogenous variables, which is a practical concern for deployment.
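As an illustration of the kind of dimension statement minor comment 1 asks for, the sketch below computes a plain Pearson correlation matrix across channels: an L x C input panel yields a C x C matrix. This is an assumption-laden stand-in. The paper's C_t and C_c may be learned or defined differently, and the function names here are hypothetical.

```python
import math
from typing import List, Sequence

Panel = List[List[float]]  # time-major layout: panel[t][c], shape L x C


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def channel_correlation(panel: Panel) -> List[List[float]]:
    """Map an L x C panel to a C x C matrix of pairwise channel correlations."""
    channels = list(zip(*panel))  # transpose to C sequences of length L
    return [[pearson(ci, cj) for cj in channels] for ci in channels]


# Two channels over three steps; the second falls as the first rises.
matrix = channel_correlation([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
```

Stating the input shape and output shape next to each matrix, as the docstrings do here, is the level of detail the comment requests.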

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and positive recommendation for minor revision. We address the single major comment below.

  1. Referee: [§4.3] §4.3, ablation table: the reported improvement from adding the Channel Correlation Module is only 1.2% on average across datasets; this is modest relative to the added complexity and raises the question whether the channel-wise correlation discovery is load-bearing or largely redundant with the temporal module in practice.

    Authors: We appreciate the referee raising this point about the ablation results. While the average gain of 1.2% is indeed modest, the Channel Correlation Module is designed to capture a distinct aspect of the problem—channel-wise correlations between historical endogenous variables and exogenous variables—that is orthogonal to the temporal correlations modeled by the Temporal Correlation Module. The two modules therefore address complementary relationships rather than overlapping ones, as evidenced by the fact that each ablation (removing either module) produces a measurable drop. The added parameters are limited because both modules reuse the same lightweight correlation discovery and injection submodules. In the revised manuscript we will expand the discussion in §4.3 to explicitly articulate this complementarity and report per-dataset improvements to show where the channel module contributes most. revision: partial
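The per-dataset breakdown promised in the rebuttal matters because an average can hide where a module helps. The sketch below shows the arithmetic. The dataset names and MSE values are hypothetical placeholders, not figures from the paper, and `relative_gain` is an illustrative helper.

```python
from statistics import mean


def relative_gain(mse_without: float, mse_with: float) -> float:
    """Fractional MSE reduction obtained by adding a module."""
    return (mse_without - mse_with) / mse_without


# Hypothetical (MSE without module, MSE with module) pairs, for illustration only.
hypothetical_mse = {
    "dataset_a": (0.400, 0.380),
    "dataset_b": (0.250, 0.250),
    "dataset_c": (0.300, 0.297),
}

gains = {
    name: relative_gain(without, with_module)
    for name, (without, with_module) in hypothetical_mse.items()
}
average_gain = mean(gains.values())

# Largest gain first, then the average that a summary table would report.
for name, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gain:.1%}")
print(f"average: {average_gain:.1%}")
```

In this made-up case one dataset accounts for most of the average, which is the pattern a per-dataset table would expose or rule out.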

Circularity Check

0 steps flagged

No significant circularity; architecture evaluated on external benchmarks

The paper introduces a new neural architecture (DAG) with Temporal and Channel Correlation Modules, each containing discovery and injection submodules, to address shortcomings in TSF-X methods. The central claims concern improved forecasting performance when future exogenous variables are available, supported by descriptions of the modules, motivation for separating temporal and channel axes, and empirical results including ablations on standard benchmarks. No load-bearing step reduces a prediction or result to its own inputs by construction, self-definition, or self-citation chain; the model is a proposed design whose value is assessed externally rather than tautologically.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the effectiveness of the proposed correlation discovery and injection submodules, which are introduced as novel components rather than derived from first principles or external benchmarks.

axioms (1)
  • Domain assumption: neural network modules can discover and inject useful correlations between endogenous and exogenous variables from data.
    The method depends on the submodules successfully capturing the intended relationships without additional justification in the abstract.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. What If We Let Forecasting Forget? A Sparse Bottleneck for Cross-Variable Dependencies

    cs.LG 2026-05 unverdicted novelty 6.0

    MS-FLOW uses a capacity-limited sparse routing mechanism to model only critical inter-variable dependencies in time series data, achieving state-of-the-art accuracy on 12 benchmarks with fewer but more reliable connections.

  2. Hermes: A Multi-Scale Spatial-Temporal Hypergraph Network for Stock Time Series Forecasting

    cs.LG 2025-09 unverdicted novelty 4.0

    Hermes is a multi-scale spatial-temporal hypergraph network that improves stock forecasting accuracy by capturing inter-industry lead-lag dependencies and fusing information across scales.

Evaluation settings (from the paper)

To keep consistent with previous works, we adopt Mean Squared Error (MSE) and Mean Absolute Error (MAE) as evaluation metrics.

We consider four forecasting horizons 𝐹: 96, 192, 336, and 720 for the eight common multivariate forecasting datasets. The look-back window is fixed at 96 for all the baselines. 3) For the 12 real-world datasets that satisfy the TSF-X conditions, we conduct both long-term and short-term prediction experiments. For the Colbun and Raperl datasets, the shor...
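The two metrics and the windowing implied by a fixed look-back and horizon can be sketched as below. The constants mirror the values quoted above. The function names and the stride-1 windowing are assumptions for illustration, not the benchmark's actual data loader.

```python
from typing import Iterator, List, Sequence, Tuple

LOOKBACK = 96                  # look-back window quoted above
HORIZONS = (96, 192, 336, 720)  # forecasting horizons quoted above


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Squared Error: average of squared differences."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Error: average of absolute differences."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def sliding_windows(
    series: Sequence[float], lookback: int, horizon: int
) -> Iterator[Tuple[List[float], List[float]]]:
    """Yield (past, future) pairs with stride 1; yields nothing if the series is too short."""
    for start in range(len(series) - lookback - horizon + 1):
        past = list(series[start : start + lookback])
        future = list(series[start + lookback : start + lookback + horizon])
        yield past, future


pairs = list(sliding_windows([0, 1, 2, 3, 4], lookback=2, horizon=2))
```

MSE penalizes large errors more heavily than MAE, which is why reporting both gives a fuller picture of forecast quality.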