pith. machine review for the scientific record.

arxiv: 2604.23239 · v1 · submitted 2026-04-25 · 💻 cs.AI

Recognition: unknown

AdaMamba: Adaptive Frequency-Gated Mamba for Long-Term Time Series Forecasting

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 08:13 UTC · model grok-4.3

classification 💻 cs.AI
keywords long-term time series forecasting · Mamba · frequency domain · adaptive gating · state-space models · multivariate time series · time-frequency integration

The pith

AdaMamba adds input-dependent frequency bases to Mamba state updates so the model can adapt to frequency differences across variables in long time series.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to show that real-world time series often hide important periodic differences across variables that only become visible in the frequency domain. By generating those frequency bases inside the Mamba update itself and folding them into a single time-frequency forgetting gate, the method lets the state-space model recalibrate its memory on the fly for each input. If the approach works, forecasters gain accuracy on heterogeneous data while keeping the linear-time scaling that makes Mamba attractive for long horizons.
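
To make that premise concrete, here is a minimal editorial sketch (not the paper's code) of reading per-variable dominant frequencies off an FFT; two variables with the same scale and noise level can carry distinct dominant periods that only the spectrum exposes:

    # Editorial sketch, not from the paper: per-variable dominant
    # frequencies via an FFT, the heterogeneity AdaMamba targets.
    import numpy as np

    rng = np.random.default_rng(0)
    t = np.arange(512)
    x = np.stack([
        np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(t.size),  # fast cycle
        np.sin(2 * np.pi * t / 96) + 0.1 * rng.standard_normal(t.size),  # slow cycle
    ])

    spectrum = np.abs(np.fft.rfft(x, axis=-1))             # (variables, bins)
    freqs = np.fft.rfftfreq(t.size)                        # cycles per step
    dominant = freqs[spectrum[:, 1:].argmax(axis=-1) + 1]  # skip the DC bin
    print(dominant)                                        # ~ [1/24, 1/96]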

Core claim

AdaMamba endogenizes adaptive frequency analysis inside the Mamba state-space process. An interactive patch encoder first models inter-variable dynamics; then an adaptive frequency-gated module produces input-dependent frequency bases and replaces the standard temporal gate with a unified time-frequency gate. This lets the model scale state transitions according to learned frequency importance while retaining Mamba's long-range dependency modeling. On seven public LTSF benchmarks and two domain-specific sets the method records higher accuracy than prior state-of-the-art approaches at comparable computational cost.
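
This page reproduces no equations, so the following is only a hedged sketch of how an input-dependent frequency term could be folded into a Mamba-style forgetting gate. The class name, the softmax mixture over learned bases, and the choice to let frequency importance rescale the temporal decay are all editorial assumptions, not the authors' formulation:

    # Editorial sketch of a unified time-frequency forgetting gate in a
    # Mamba-style recurrence. Every design choice here is an assumption;
    # the paper's exact equations are not reproduced in this review.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TimeFreqGatedSSM(nn.Module):
        def __init__(self, d_model: int, d_state: int, n_bases: int):
            super().__init__()
            self.A = nn.Parameter(-torch.rand(d_state))        # negative: decaying modes
            self.B = nn.Linear(d_model, d_state, bias=False)
            self.C = nn.Linear(d_state, d_model, bias=False)
            self.dt_proj = nn.Linear(d_model, 1)               # temporal step size
            self.freq_proj = nn.Linear(d_model, n_bases)       # input-dependent weights
            self.bases = nn.Parameter(torch.randn(n_bases, d_state))  # learned bases

        def forward(self, x):                                  # x: (batch, length, d_model)
            h = x.new_zeros(x.shape[0], self.A.shape[0])
            outs = []
            for step in range(x.shape[1]):
                xt = x[:, step]
                dt = F.softplus(self.dt_proj(xt))              # (batch, 1), temporal term
                w = torch.softmax(self.freq_proj(xt), dim=-1)  # (batch, n_bases)
                f = torch.sigmoid(w @ self.bases)              # (batch, d_state), in (0, 1)
                # Unified time-frequency gate: frequency importance
                # rescales the decay set by the temporal step size.
                gate = torch.exp(dt * f * self.A)              # values in (0, 1)
                h = gate * h + dt * self.B(xt)
                outs.append(self.C(h))
            return torch.stack(outs, dim=1)

The per-step loop is written for readability; a real implementation would need Mamba's parallel selective scan to keep the linear-time throughput the core claim depends on.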

What carries the argument

The adaptive frequency-gated state-space module that creates input-dependent frequency bases on the fly and merges them into a single time-frequency forgetting gate inside each Mamba block.

If this is right

  • Forecast accuracy rises on multivariate series whose variables differ in dominant frequencies even when they appear aligned in the time domain.
  • Computational cost stays close to standard Mamba because the frequency adaptation is folded into the existing state update rather than added as a separate module.
  • The same architecture can be applied to domain-specific series without requiring hand-crafted frequency features for each new domain.
  • Long-range dependency modeling remains intact because the frequency gate operates alongside rather than replacing the original Mamba recurrence.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same input-dependent frequency mechanism could be dropped into other linear-time state-space architectures to test whether the gain is specific to Mamba or general to the state-space family.
  • If the adaptive bases prove stable, the method offers a route to remove separate frequency preprocessing steps that currently sit outside most deep forecasters.
  • On very high-dimensional series the interactive patch encoder may become a bottleneck, suggesting a natural next test of whether the frequency gating alone is sufficient when variable count grows.

Load-bearing premise

Real-world time series contain enough measurable frequency heterogeneity across variables that an input-dependent basis generated inside the Mamba update can exploit it without causing extra overfitting or training instability.

What would settle it

If an ablation that replaces the learned input-dependent frequency bases with fixed, non-adaptive bases produces equal or better accuracy on the same seven benchmarks, the value of the adaptive component is falsified.
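
Taking the hypothetical TimeFreqGatedSSM sketch above as the interface, that ablation amounts to freezing the basis mixture so the gate stops depending on the input (again an editorial sketch, not the paper's ablation code):

    # Editorial sketch: freeze the basis mixture so the frequency gate is
    # fixed rather than input-dependent. Uses the hypothetical
    # TimeFreqGatedSSM interface sketched above, not the paper's code.
    import torch

    def freeze_frequency_mixture(model):
        torch.nn.init.zeros_(model.freq_proj.weight)   # softmax of zeros: uniform mixture
        torch.nn.init.zeros_(model.freq_proj.bias)
        model.freq_proj.weight.requires_grad_(False)   # keep the mixture fixed in training
        model.freq_proj.bias.requires_grad_(False)
        return model

If this frozen variant matches the full model, the gains belong to the bases themselves rather than to their adaptivity.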

Figures

Figures reproduced from arXiv: 2604.23239 by Hanchen Yang, Jihong Guan, Mingrui Zhang, Mingshan Loo, Shuigeng Zhou, Wengen Li, Xudong Jiang, Yichao Zhang.

Figure 1. A motivating example from the ETTm1 dataset.
Figure 2. The architecture of AdaMamba, which mainly consists of two modules: the interactive patch encoding module and the adaptive frequency-gated state-space module.
Figure 3. Visualization of forecasting results.
Figure 4. Visualization of frequency analysis.
Figure 5. Comparison of AdaMamba and six baseline models.
Figure 6. Performance gains and computational overhead of …
Figure 7. Analysis of hyperparameter sensitivity.
Figure 8. Sensitivity analysis of the multi-scale patch.
original abstract

Accurate long-term time series forecasting (LTSF) requires the capture of complex long-range dependencies and dynamic periodic patterns. Recent advances in frequency-domain analysis offer a global perspective for uncovering temporal characteristics. However, real-world time series often exhibit pronounced cross-domain heterogeneity where variables that appear synchronized in the time domain can differ substantially in the frequency domain. Existing frequency-based LTSF methods often rely on implicit assumptions of cross-domain homogeneity, which limits their ability to adapt to such intricate variability. To effectively integrate frequency-domain analysis with temporal dependency learning, we propose AdaMamba, a novel framework that endogenizes adaptive and context-aware frequency analysis within the Mamba state-space update process. Specifically, AdaMamba introduces an interactive patch encoding module to capture inter-variable interaction dynamics. Then, we develop an adaptive frequency-gated state-space module that generates input-dependent frequency bases, and generalizes the conventional temporal forgetting gate into a unified time-frequency forgetting gate. This allows dynamic calibration of state transitions based on learned frequency-domain importance, while preserving Mamba's capability in modeling long-range dependencies. Extensive experiments on seven public LTSF benchmarks and two domain-specific datasets demonstrate that AdaMamba consistently outperforms state-of-the-art methods in forecasting accu racy while maintaining competitive computational efficiency. The code of AdaMamba is available at https://github.com/XDjiang25/AdaMamba.

Editorial analysis

A structured set of objections, weighed in public.

Referee report, simulated authors' rebuttal, circularity audit, and an axiom & free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance; this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes AdaMamba, a Mamba-based model for long-term time series forecasting that adds an interactive patch encoding module for inter-variable dynamics and an adaptive frequency-gated state-space module. The latter generates input-dependent frequency bases and generalizes the temporal forgetting gate into a unified time-frequency gate to dynamically adjust state transitions according to learned frequency importance while aiming to retain long-range dependency modeling. Experiments on seven public LTSF benchmarks plus two domain-specific datasets are reported to show consistent accuracy gains over state-of-the-art methods at competitive computational cost.

Significance. If the adaptive frequency integration can be shown to preserve the selective SSM stability and discretization properties while capturing cross-domain frequency heterogeneity, the approach would offer a concrete mechanism for hybrid time-frequency modeling in forecasting tasks where periodic patterns vary across variables or domains. The public code release supports reproducibility.

major comments (2)
  1. [Abstract / adaptive frequency-gated state-space module description] The description of the adaptive frequency-gated state-space module (abstract and method section) states that the conventional temporal forgetting gate is generalized to a unified time-frequency forgetting gate via input-dependent frequency bases. No explicit equation is supplied showing how the frequency component enters the state matrix A, input matrix B, or the discretization step of the underlying SSM. Without this, it is impossible to verify that the selective long-range dynamics and stability guarantees of the original Mamba are retained; any unanalyzed time-frequency coupling would directly undermine the central performance claim. (One hypothetical form is sketched after these comments.)
  2. [Experiments section] The experimental claims of consistent outperformance rest on benchmark results, yet the manuscript provides no details on the number of independent runs, statistical significance tests, error bars, or ablation studies that isolate the contribution of the input-dependent frequency bases versus the base Mamba architecture. This absence makes it difficult to rule out that observed gains arise from hyper-parameter tuning or other unanalyzed factors rather than the proposed adaptive mechanism.
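
For concreteness, one hypothetical form of what major comment 1 asks for; the notation is editorial, not the paper's. Standard Mamba discretizes a diagonal state matrix A with a zero-order hold, \(\bar{A}_t = \exp(\Delta_t A)\); an input-dependent frequency term could, for instance, enter as a positive rescaling of the step size:

    \[
    \bar{A}_t = \exp\!\big(\Delta_t \, g(F_t) \, A\big),
    \qquad
    F_t = \sum_{k=1}^{K} w_k(x_t)\, \phi_k,
    \]

where \(\phi_k\) are the generated frequency bases, \(w_k(x_t)\) their input-dependent weights, and \(g > 0\) maps frequency importance to a rescaling of the decay. With the real parts of A negative, any positive \(g\) keeps \(|\bar{A}_t| < 1\) and the recurrence stable; whether the paper's actual coupling has this property is what the missing equations would need to show.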
minor comments (2)
  1. [Abstract] The abstract contains a typographical spacing error: 'accu racy' should read 'accuracy'.
  2. [Method] Notation for the unified time-frequency gate and frequency bases should be introduced with a clear symbol table or inline definitions to avoid ambiguity when the equations are eventually supplied.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough and constructive review. We address each major comment below and will revise the manuscript accordingly to improve clarity and rigor.

point-by-point responses
  1. Referee: [Abstract / adaptive frequency-gated state-space module description] The description of the adaptive frequency-gated state-space module (abstract and method section) states that the conventional temporal forgetting gate is generalized to a unified time-frequency forgetting gate via input-dependent frequency bases. No explicit equation is supplied showing how the frequency component enters the state matrix A, input matrix B, or the discretization step of the underlying SSM. Without this, it is impossible to verify that the selective long-range dynamics and stability guarantees of the original Mamba are retained; any unanalyzed time-frequency coupling would directly undermine the central performance claim.

    Authors: We agree that the current description is insufficient for verifying the integration details. The manuscript describes the adaptive frequency-gated state-space module at a conceptual level but does not supply the explicit equations for how input-dependent frequency bases modify the state matrix A, input matrix B, or the discretization step. In the revised version, we will add these precise formulations (including the modified SSM update rules and the unified time-frequency gate) to the method section, along with a brief analysis confirming that selective long-range dynamics and stability properties are preserved. revision: yes

  2. Referee: [Experiments section] The experimental claims of consistent outperformance rest on benchmark results, yet the manuscript provides no details on the number of independent runs, statistical significance tests, error bars, or ablation studies that isolate the contribution of the input-dependent frequency bases versus the base Mamba architecture. This absence makes it difficult to rule out that observed gains arise from hyper-parameter tuning or other unanalyzed factors rather than the proposed adaptive mechanism.

    Authors: We acknowledge this gap in experimental reporting. The manuscript presents results across the benchmarks but omits the number of runs, error bars, significance tests, and ablations isolating the frequency bases. We will revise the Experiments section to include: five independent runs per model with reported means and standard deviations; paired t-tests for statistical significance against baselines; and a dedicated ablation comparing the full model to a non-adaptive Mamba variant. These changes will substantiate the contribution of the proposed adaptive mechanism. revision: yes
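
A minimal sketch of the promised protocol; the five-run count and the paired t-test are the rebuttal's stated plan, while the code and its placeholder numbers are editorial assumptions:

    # Editorial sketch of the rebuttal's protocol: five runs per model,
    # paired by seed, with mean ± std and a paired t-test. The numbers
    # are placeholders, not results from the paper.
    import numpy as np
    from scipy import stats

    adamamba_mse = np.array([0.371, 0.368, 0.374, 0.370, 0.372])  # placeholder
    baseline_mse = np.array([0.382, 0.379, 0.385, 0.381, 0.380])  # placeholder

    print(f"AdaMamba {adamamba_mse.mean():.3f} ± {adamamba_mse.std(ddof=1):.3f}")
    print(f"baseline {baseline_mse.mean():.3f} ± {baseline_mse.std(ddof=1):.3f}")
    t_stat, p = stats.ttest_rel(adamamba_mse, baseline_mse)  # pairs runs by seed
    print(f"paired t-test: t = {t_stat:.2f}, p = {p:.4f}")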

Circularity Check

0 steps flagged

No circularity: novel adaptive module with independent empirical validation

full rationale

The paper proposes AdaMamba as an extension of Mamba SSMs that introduces an interactive patch encoding module and an adaptive frequency-gated state-space module generating input-dependent frequency bases to create a unified time-frequency forgetting gate. These are presented as architectural innovations with explicit design goals (capturing cross-domain frequency heterogeneity while preserving long-range dependency modeling). The central claims of outperformance rest on experimental results across seven LTSF benchmarks and two domain-specific datasets, not on any definitional equivalence, fitted-parameter renaming, or self-citation chain that reduces the result to its inputs. No load-bearing step equates a derived quantity to a fitted input or prior self-cited result by construction; the derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

The central performance claim rests on the effectiveness of two newly introduced modules whose behavior is learned from data; no explicit free parameters, axioms, or invented physical entities are stated in the abstract.

invented entities (1)
  • adaptive frequency-gated state-space module · no independent evidence
    purpose: Generates input-dependent frequency bases and unifies time-frequency forgetting to calibrate state transitions
    Core novel component introduced to address cross-domain frequency heterogeneity

pith-pipeline@v0.9.0 · 5572 in / 1100 out tokens · 78650 ms · 2026-05-08T08:13:48.099457+00:00 · methodology

