
arxiv: 2605.00466 · v1 · submitted 2026-05-01 · 💻 cs.LG · cs.AI

PAMod: Modeling Cyclical Shifts via Phase-Amplitude Modulation for Non-stationary Time Series Forecasting

Pith reviewed 2026-05-09 19:21 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords time series forecasting · non-stationary time series · phase-amplitude modulation · distribution shifts · periodic embeddings · reversible instance normalization · cyclical patterns · dynamic denormalization

The pith

Modulating phase and amplitude in normalized space unifies dynamic denormalization with representation learning for cyclical time series shifts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that distribution shifts in time series often follow cyclical patterns tied to periodic positions like seasons or holidays, rather than remaining constant. It introduces PAMod to learn periodic embeddings that apply phase modulation for mean changes and amplitude modulation for variance changes, all inside normalized feature space. A sympathetic reader would care because this relaxes the strong assumption in prior normalization methods that historical and future distributions stay identical. The mathematical demonstration that modulation equals dynamic denormalization unifies adaptation and learning into one lightweight step. Experiments across twelve benchmarks show the resulting forecasts improve accuracy while using fewer resources, and the mechanism works as a simple addition to other models.

Core claim

PAMod learns periodic embeddings to modulate normalized representations: phase modulation captures mean shifts while amplitude modulation adapts to variance changes. The authors prove mathematically that performing this modulation in normalized space is equivalent to applying dynamic denormalization, thereby unifying distribution adaptation and representation learning. This handles non-stationary time series where shifts correlate with periodic positions, yielding state-of-the-art results on twelve real-world benchmarks with reduced computation and serving as a plug-and-play improvement for existing forecasting methods.
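The core claim can be checked numerically on toy data. The following is a minimal sketch, not the paper's implementation: it assumes the modulation is a per-position affine map (amplitude scales, phase shifts) indexed by the cycle position `(start + i) % tau`, and it omits the forecasting network that sits between modulation and denormalization. All function names and the toy values are hypothetical.

```python
import math
from statistics import fmean, pstdev

def normalize(window, eps=1e-5):
    """Instance-normalize one window; returns (z, mu, sigma)."""
    mu = fmean(window)
    sigma = pstdev(window) + eps
    return [(x - mu) / sigma for x in window], mu, sigma

def modulate(z, start, tau, phase, amplitude):
    """Assumed affine modulation: embeddings are looked up by cycle position."""
    out = []
    for i, v in enumerate(z):
        p = (start + i) % tau  # position of step i within the cycle
        out.append(amplitude[p] * v + phase[p])
    return out

def denormalize(z, mu, sigma):
    """Static denormalization with the window's own statistics."""
    return [sigma * v + mu for v in z]

def dynamic_denormalize(z, start, tau, phase, amplitude, mu, sigma):
    """Denormalize with position-dependent scale and shift instead of modulating first."""
    out = []
    for i, v in enumerate(z):
        p = (start + i) % tau
        out.append((sigma * amplitude[p]) * v + (mu + sigma * phase[p]))
    return out

# Toy values chosen only for illustration; they are not from the paper.
window = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
tau, start = 3, 4
phase = [0.1, -0.2, 0.0]
amplitude = [1.0, 1.5, 0.5]

z, mu, sigma = normalize(window)
path_a = denormalize(modulate(z, start, tau, phase, amplitude), mu, sigma)
path_b = dynamic_denormalize(z, start, tau, phase, amplitude, mu, sigma)
agree = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9) for a, b in zip(path_a, path_b))
print(agree)
```

Under this assumed affine form the two paths agree for any embedding values. Whether the agreement survives a nonlinear network placed between the two steps depends on the paper's own derivation.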

What carries the argument

Phase-amplitude modulation of learned periodic embeddings performed inside normalized feature space, which simultaneously adapts mean and variance shifts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The equivalence result implies that periodic modulation could be inserted into other normalization layers without redesigning entire architectures.
  • Because the method is lightweight and plug-and-play, it may enable accurate forecasting in settings with strict compute limits such as mobile or embedded devices.
  • The same phase-amplitude idea could be tested on sequential data outside forecasting, such as periodic patterns in sensor streams or demand signals.

Load-bearing premise

Distribution shifts follow cyclical patterns that correlate with periodic positions such as seasonal and holiday volatility.
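One way to probe this premise on a given series is to group observations by their position in the cycle and compare the per-position mean and standard deviation, in the spirit of the periodic mean-std variation shown in Figure 1. The sketch below is an illustration under that assumption; the helper name `stats_by_cycle_position` and the toy series are hypothetical.

```python
from statistics import fmean, pstdev

def stats_by_cycle_position(series, tau):
    """Return a list of (mean, std) pairs, one per cycle position t % tau."""
    if tau <= 0 or len(series) < tau:
        raise ValueError("need tau > 0 and at least one full cycle")
    buckets = [[] for _ in range(tau)]
    for t, x in enumerate(series):
        buckets[t % tau].append(x)
    return [(fmean(b), pstdev(b)) for b in buckets]

# Toy series with cycle length 2; values are illustrative only.
toy_series = [1.0, 5.0, 1.2, 5.4, 0.8, 4.6]
profile = stats_by_cycle_position(toy_series, tau=2)
for position, (mean, std) in enumerate(profile):
    print(position, round(mean, 3), round(std, 3))
```

If the per-position statistics are nearly flat across positions, the cyclical-shift premise has little to work with on that series.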

What would settle it

A counterexample disproving the claimed mathematical equivalence between normalized-space modulation and dynamic denormalization, or experiments on time series with non-cyclical shifts where PAMod shows no accuracy gain over standard normalization.

Figures

Figures reproduced from arXiv: 2605.00466 by Dejing Dou, Lemao Liu, Li Sun, Qiang Huang, Rui Qian, Shuhao Li, Yingbo Zhou, Yutong Ye.

Figure 1. Visualization of distribution shift and periodic mean-std variation in the OT channel of ETTh1. (a) histograms and density curves illustrate distinct probability distributions, with dashed lines marking the respective means in training and test sets. (b) 72-hour cyclic variations exhibit stable 24-hour repetition across three consecutive cycles (Cycle 1–3), highlighting inherent diurnal periodicity. Notabl…
Figure 2. Overview of PAMod, which consists of normalization, phase-amplitude modulation with the embedding pool, MLP and denormalization for non-stationary time series forecasting.
Figure 3. Computational efficiency under the input-96-predict-96 setting. Comparison of forecasting performance (MSE), memory usage, and training time on the ETTm1 and Traffic datasets. We set the batch sizes to 32 and 16 for the ETTm1 and Traffic datasets, respectively.
Figure 4. Performance of PAMod on the ECL and PEMS07 with varying cycle length τ ∈ {12, 23, 24, 96, 144, 168, 288}.
Cycle Lengths. The learnable phase and amplitude embeddings ϕ(τ) and α(τ) in our PAMod mechanism are explicitly conditioned on the cycle length τ. To investigate its critical role, we conduct a systematic study of τ across the ECL and PEMS07 datasets.
Figure 5. Performance of different methods under varying look-back windows T ∈ {48, 96, 192, 336, 528, 720} on the ECL dataset. The prediction horizon is fixed as 96.
Figure 7. Comparison of prediction and true distributions using t-SNE on the ETTh1 dataset.
Figure 6. t-SNE (Maaten & Hinton, 2008) visualization of learned phase and amplitude weights on the ETTh1 dataset.
Figure 9. Extra visual cases for PAMod.
Original abstract

Real-world time series forecasting faces the fundamental challenge of non-stationary statistical properties, including shifts in mean and variance over time. While reversible instance normalization (RevIN) has shown promise by stationarizing inputs and denormalizing outputs, it relies on the strong assumption that historical and future distributions remain identical. We observe that in many practical applications, distribution shifts follow cyclical patterns that correlate with periodic positions (e.g., seasonal and holiday volatility). To this end, we propose PAMod, a lightweight yet powerful framework that models cyclical distribution shifts via Phase-Amplitude Modulation in the normalized feature space. PAMod learns periodic embeddings to modulate representations: phase modulation captures mean shifts, while amplitude modulation adapts to variance changes. Crucially, we prove mathematically that modulating in normalized space is equivalent to applying dynamic denormalization, offering an elegant unification of distribution adaptation and representation learning. Extensive experiments on twelve real-world benchmarks demonstrate that PAMod achieves state-of-the-art performance with fewer computational resources. Furthermore, our modulation mechanism, as a novel plug-and-play technique, can improve existing time-series forecasting methods with simple integration.
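The static assumption that the abstract attributes to RevIN can be seen in a stripped-down sketch: the forecast is denormalized with exactly the mean and standard deviation of the history window, so its level and scale cannot move with the future period. This is a simplified illustration that leaves out RevIN's learnable affine parameters; `revin_forecast` and `repeat_last` are hypothetical names, and the stand-in forecaster is deliberately trivial.

```python
from statistics import fmean, pstdev

def revin_forecast(history, forecaster, eps=1e-5):
    """Normalize the history, forecast in normalized space, then denormalize."""
    mu = fmean(history)
    sigma = pstdev(history) + eps
    z = [(x - mu) / sigma for x in history]
    z_future = forecaster(z)
    # The same mu and sigma are reused here: history and future are
    # implicitly treated as sharing one distribution.
    return [sigma * v + mu for v in z_future]

def repeat_last(z, horizon=3):
    """Trivial stand-in forecaster: repeat the last normalized value."""
    return [z[-1]] * horizon

history = [10.0, 12.0, 11.0, 13.0]
forecast = revin_forecast(history, repeat_last)
print([round(v, 6) for v in forecast])
```

A position-dependent scale and shift in the last step is what the paper calls dynamic denormalization.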

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper proposes PAMod, a lightweight framework for non-stationary time series forecasting. It models cyclical distribution shifts via phase-amplitude modulation applied to normalized features, where learned periodic embeddings capture mean shifts (phase) and variance changes (amplitude). The central claim is a mathematical proof that this modulation in normalized space is equivalent to dynamic denormalization, unifying distribution adaptation with representation learning. Experiments on twelve real-world benchmarks report state-of-the-art performance with reduced computational cost, and the modulation is presented as a plug-and-play module for existing forecasters.

Significance. If the equivalence holds and the cyclical-shift premise applies to the target domains, PAMod provides a clean unification of normalization-based adaptation and learned representations in a computationally efficient package. The plug-and-play design and multi-benchmark evaluation could make it a practical addition to the time-series forecasting toolkit, particularly for data exhibiting seasonal or periodic volatility patterns.

major comments (3)
  1. [Abstract] Abstract: the claim of a mathematical proof that 'modulating in normalized space is equivalent to applying dynamic denormalization' is asserted without any derivation steps, key equations, or section reference, leaving the unification of distribution adaptation and representation learning unverified and load-bearing for the central contribution.
  2. [Method] Method (periodic embeddings): the embeddings are learned parameters fitted to the training data; therefore the claimed equivalence reduces to a data-dependent identity rather than a parameter-free derivation, introducing circularity that must be resolved before the unification argument can be accepted.
  3. [Experiments] Experiments: the SOTA claims on twelve benchmarks rest on the untested premise that distribution shifts are cyclical and align with periodic positions; no ablation isolating aperiodic shifts, no error bars, and no statistical significance tests are described, so the performance gains cannot yet be attributed to the modulation mechanism rather than standard RevIN.
minor comments (1)
  1. [Abstract] Abstract: the phrase 'with fewer computational resources' is stated without specifying the exact metrics (FLOPs, wall-clock time, parameter count) or the precise baselines used for comparison.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below, providing clarifications on the mathematical foundations and committing to experimental enhancements where appropriate.

point-by-point responses
  1. Referee: [Abstract] Abstract: the claim of a mathematical proof that 'modulating in normalized space is equivalent to applying dynamic denormalization' is asserted without any derivation steps, key equations, or section reference, leaving the unification of distribution adaptation and representation learning unverified and load-bearing for the central contribution.

    Authors: The complete algebraic derivation, including the key equations establishing the equivalence between phase-amplitude modulation in normalized space and dynamic denormalization, appears in Section 3.2. This section also details how the approach unifies distribution adaptation with representation learning. We will revise the abstract to explicitly reference Section 3.2. revision: yes

  2. Referee: [Method] Method (periodic embeddings): the embeddings are learned parameters fitted to the training data; therefore the claimed equivalence reduces to a data-dependent identity rather than a parameter-free derivation, introducing circularity that must be resolved before the unification argument can be accepted.

    Authors: The equivalence is shown through a general algebraic identity that holds for arbitrary phase and amplitude values, independent of the data or the learning process used to obtain those values. The periodic embeddings are a practical mechanism to capture cyclical patterns, but the structural equivalence itself does not depend on specific parameter values or training data. We will insert a clarifying statement in the method section to emphasize the parameter-independent nature of the derivation. revision: partial

  3. Referee: [Experiments] Experiments: the SOTA claims on twelve benchmarks rest on the untested premise that distribution shifts are cyclical and align with periodic positions; no ablation isolating aperiodic shifts, no error bars, and no statistical significance tests are described, so the performance gains cannot yet be attributed to the modulation mechanism rather than standard RevIN.

    Authors: The cyclical-shift premise is motivated by domain observations and the consistent gains over RevIN baselines. To strengthen attribution, the revised version will include an ablation isolating aperiodic-shift cases, error bars computed over multiple random seeds, and statistical significance tests (e.g., paired t-tests) comparing PAMod against RevIN. We will also add a limitations paragraph discussing performance when shifts lack cyclical structure. revision: yes
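The paired t-test promised in the third response can be sketched with the standard library alone. The function below computes the paired t statistic and its degrees of freedom from per-seed errors of two methods. The numbers in the usage lines are placeholders made up for illustration, not results from the paper, and `paired_t_statistic` is a hypothetical helper name.

```python
from math import sqrt
from statistics import fmean, stdev

def paired_t_statistic(errors_a, errors_b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    n = len(errors_a)
    if n != len(errors_b) or n < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return fmean(diffs) / (spread / sqrt(n)), n - 1

# Placeholder per-seed MSE values (NOT from the paper).
baseline_mse = [0.40, 0.42, 0.41]
candidate_mse = [0.38, 0.39, 0.40]
t_stat, dof = paired_t_statistic(baseline_mse, candidate_mse)
print(round(t_stat, 4), dof)
```

A positive statistic means the first method has the higher error on average. Turning it into a p-value requires the t distribution with `dof` degrees of freedom, which is left to a statistics library.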

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper's central mathematical claim is a proof that modulating normalized representations via learned phase-amplitude parameters is equivalent to dynamic denormalization. This is a direct consequence of the reversible normalization definitions (normalization followed by modulated denormalization) rather than a reduction of the result to its own fitted inputs by construction. Periodic embeddings are indeed learned from data, but the equivalence itself is parameter-free once the modulation operators are defined and does not presuppose the target performance or cyclical-shift assumption. No self-citation chains, fitted-input predictions, or ansatz smuggling appear in the load-bearing steps; empirical SOTA results on benchmarks remain external to the derivation. The cyclical-shift premise is an explicit modeling assumption, not a hidden tautology.
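The parameter-free character of the identity can be made concrete under one assumed reading of the modulation, namely a per-position affine map; the paper's own definition may differ. Here μ and σ are the instance statistics from normalization, z_t is a normalized value, τ is the cycle length, and ϕ and α are the phase and amplitude embeddings. The symbols p(t), z̃_t and ŷ_t are introduced here for illustration only.

```latex
\begin{aligned}
\tilde{z}_t &= \alpha_{p(t)}\, z_t + \phi_{p(t)}, \qquad p(t) = t \bmod \tau \\
\hat{y}_t   &= \sigma\, \tilde{z}_t + \mu \\
            &= \underbrace{\left(\sigma\, \alpha_{p(t)}\right)}_{\text{dynamic scale}} z_t
             + \underbrace{\left(\mu + \sigma\, \phi_{p(t)}\right)}_{\text{dynamic shift}}
\end{aligned}
```

The last line holds for any values of α and ϕ, which is the sense in which the identity does not depend on how the embeddings were fitted.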

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The approach depends on learned periodic embeddings as free parameters and the domain assumption that shifts are cyclical; the mathematical equivalence is asserted without visible derivation in the provided text.

free parameters (1)
  • periodic embeddings
    Learned embeddings that control phase and amplitude modulation, fitted from training data to capture cyclical patterns.
axioms (1)
  • domain assumption: Distribution shifts follow cyclical patterns correlated with periodic positions
    Stated as the key observation enabling the modulation to outperform static normalization.
invented entities (1)
  • Phase-Amplitude Modulation (no independent evidence)
    purpose: To adapt normalized representations for mean and variance shifts in a unified way
    Newly introduced mechanism whose only support is the claimed equivalence and empirical results.



Reference graph

Works this paper leans on

51 extracted references

  1. The challenges of modeling and forecasting the spread of COVID-19. Proceedings of the National Academy of Sciences, 2020.
  2. Yihe Wang, Yu Han, Haishuai Wang, Xiang Zhang. Advances in Neural Information Processing Systems (NeurIPS).
  3. Omer Berat Sezer, Mehmet Ugur Gudelek, Ahmet Murat. Financial time series forecasting with deep learning :. Appl. Soft Comput.
  4. Sida Lin, Yankai Chen, Yiyan Qi, Chenhao Ma, Bokai Cao, Yifei Zhang, Xue Liu, Jian Guo. Companion Proceedings of the
  5. Grey prediction with rolling mechanism for electricity demand forecasting of Turkey. Energy, 2007.
  6. Wei Fan, Yanjie Fu, Shun Zheng, Jiang Bian, Yuanchun Zhou, Hui Xiong.
  7. Wanneng Shu, Ken Cai, Neal Naixue Xiong.
  8. Hao Miao, Yan Zhao, Chenjuan Guo, Bin Yang, Kai Zheng, Feiteng Huang, Jiandong Xie, Christian S. Jensen. 40th
  9. Taesung Kim, Jinhee Kim, Yunwon Tae, Cheonbok Park, Jang. Reversible Instance Normalization for Accurate Time-Series Forecasting against Distribution Shift.
  10. Peiyuan Liu, Beiliang Wu, Yifan Hu, Naiqi Li, Tao Dai, Jigang Bao, Shu. TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting.
  11. Yong Liu, Haixu Wu, Jianmin Wang, Mingsheng Long. Advances in Neural Information Processing Systems (NeurIPS).
  12. Rui Wang, Yihe Dong, Sercan. Koopman Neural Operator Forecaster for Time-series with Temporal Distributional Shifts.
  13. Zhiding Liu, Mingyue Cheng, Zhi Li, Zhenya Huang, Qi Liu, Yanhu Xie, Enhong Chen. Advances in Neural Information Processing Systems (NeurIPS).
  14. Tao Dai, Beiliang Wu, Peiyuan Liu, Naiqi Li, Xue Yuerong, Shu. Advances in Neural Information Processing Systems (NeurIPS).
  15. Amplitude, phase, and frequency modulation. Proceedings of the Institute of Radio Engineers, 2006.
  16. Guokun Lai, Wei. Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks.
  17. Shengsheng Lin, Weiwei Lin, Wentai Wu, Feiyu Zhao, Ruichao Mo, Haotong Zhang. arXiv.
  18. Donghao Luo, Xue Wang. The Twelfth International Conference on Learning Representations (ICLR).
  19. Minhao Liu, Ailing Zeng, Muxi Chen, Zhijian Xu, Qiuxia Lai, Lingna Ma, Qiang Xu. Advances in Neural Information Processing Systems (NeurIPS).
  20. Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wancai Zhang. Thirty-Fifth
  21. Haixu Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long. Advances in Neural Information Processing Systems (NeurIPS).
  22. Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, Rong Jin. International Conference on Machine Learning (ICML).
  23. Xihao Piao, Zheng Chen, Taichi Murayama, Yasuko Matsubara, Yasushi Sakurai. Proceedings of the 30th
  24. Ailing Zeng, Muxi Chen, Lei Zhang, Qiang Xu. Thirty-Seventh
  25. Kun Yi, Qi Zhang, Wei Fan, Shoujin Wang, Pengyang Wang, Hui He, Ning An, Defu Lian, Longbing Cao, Zhendong Niu. Advances in Neural Information Processing Systems (NeurIPS).
  26. Kun Yi, Qi Zhang, Wei Fan, Hui He, Liang Hu, Pengyang Wang, Ning An, Longbing Cao, Zhendong Niu. Advances in Neural Information Processing Systems (NeurIPS).
  27. Wanlin Cai, Yuxuan Liang, Xianggen Liu, Jianshuai Feng, Yuankai Wu. Thirty-Eighth
  28. Albert Gu, Tri Dao. arXiv.
  29. Time series analysis: forecasting and control. 2015.
  30. Haixu Wu, Tengge Hu, Yong Liu, Hang Zhou, Jianmin Wang, Mingsheng Long. The Eleventh International Conference on Learning Representations (ICLR).
  31. Shiyu Wang, Haixu Wu, Xiaoming Shi, Tengge Hu, Huakun Luo, Lintao Ma, James Y. Zhang, Jun Zhou. The Twelfth International Conference on Learning Representations (ICLR).
  32. STL: A seasonal-trend decomposition. J. Off. Stat.
  33. Tao Dai, Beiliang Wu, Peiyuan Liu, Naiqi Li, Jigang Bao, Yong Jiang, Shu. Periodicity Decoupling Framework for Long-term Series Forecasting.
  34. Shengsheng Lin, Weiwei Lin, Wentai Wu, Haojun Chen, Junjie Yang. Forty-first International Conference on Machine Learning (ICML).
  35. Shengsheng Lin, Weiwei Lin, Xinyi Hu, Wentai Wu, Ruichao Mo, Haocheng Zhong. Advances in Neural Information Processing Systems (NeurIPS).
  36. Shengsheng Lin, Haojun Chen, Haijie Wu, Chunyun Qiu, Weiwei Lin. Forty-second International Conference on Machine Learning (ICML).
  37. Time series analysis. 2007.
  38. Modulation theory. Journal of the Institution of Electrical Engineers-Part III: Communication Engineering, 1944.
  39. Mingyuan Xia, Chunxu Zhang, Zijian Zhang, Hao Miao, Qidong Liu, Yuanshao Zhu, Bo Yang. arXiv.
  40. Yulong Wang, Yushuo Liu, Xiaoyi Duan, Kai Wang. The Association for the Advancement of Artificial Intelligence (AAAI).
  41. Jingru Fei, Kun Yi, Wei Fan, Qi Zhang, Zhendong Niu. The Association for the Advancement of Artificial Intelligence (AAAI).
  42. Yuxuan Wang, Haixu Wu, Jiaxiang Dong, Guo Qin, Haoran Zhang, Yong Liu, Yunzhong Qiu, Jianmin Wang, Mingsheng Long. Advances in Neural Information Processing Systems (NeurIPS).
  43. Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, Mingsheng Long. The Twelfth International Conference on Learning Representations (ICLR).
  44. Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam. The Eleventh International Conference on Learning Representations (ICLR).
  45. Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, et al. Advances in Neural Information Processing Systems (NeurIPS).
  46. Diederik P. Kingma, Jimmy Ba. 3rd International Conference on Learning Representations (ICLR).
  47. Hao Wang, Lichen Pan, Yuan Shen, Zhichao Chen, Degui Yang, Yifei Yang, Sen Zhang, Xinggao Liu, Haoxuan Li, Dacheng Tao. The Thirteenth International Conference on Learning Representations (ICLR).
  48. Visualizing data using t-SNE. Journal of Machine Learning Research.
  49. Yishay Mansour, Mehryar Mohri, Afshin Rostamizadeh. The 22nd Conference on Learning Theory (COLT).
  50. Shai Ben. A theory of learning from different domains.
  51. PAMNet: Cycle-aware Phase-Amplitude Modulation Network for Multivariate Time Series Forecasting. 2026.