pith. machine review for the scientific record.

arxiv: 2605.07222 · v1 · submitted 2026-05-08 · 💻 cs.LG

Recognition: no theorem link

Don't Learn the Shape: Forecasting Periodic Time Series by Rank-1 Decomposition


Pith reviewed 2026-05-11 02:18 UTC · model grok-4.3

classification 💻 cs.LG
keywords periodic time series · forecasting · rank-1 decomposition · frozen baseline · FLAIR · GIFT-Eval · simple models · shape estimation

The pith

Periodic time series are approximately rank-1, so averaging recent cycles forecasts as well as any learned shape.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

When an hourly series is reshaped into a matrix with one row per time-of-day and one column per day, the data matrix is close to rank-1: a fixed daily pattern scaled by a changing daily level. The paper shows that this structure makes it unnecessary to estimate or smooth the shape vector. On 97 standard test configurations the simple average of the last two full cycles performs at least as well as Fourier smoothing, exponential weighting, James-Stein shrinkage, or low-rank SVD fits; none of the eight alternatives improves accuracy significantly after multiple-testing correction, and two are worse. The resulting procedure, FLAIR, therefore uses only a handful of scalars, runs in closed form on a CPU, and reaches the same aggregate accuracy as a much larger neural forecaster.
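The diagnostic behind that claim can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the paper's code; in particular, the paper's exact centering convention is not reproduced here, so global mean-centering is assumed:

```python
import numpy as np

def centered_rank1_energy(y, period):
    """Reshape a series into a (period x cycles) matrix, center it,
    and return the fraction of squared energy in the top singular value.
    Global mean-centering is an assumed convention, not the paper's."""
    y = np.asarray(y, dtype=float)
    n_cycles = len(y) // period
    M = y[len(y) - n_cycles * period:].reshape(n_cycles, period).T
    Mc = M - M.mean()
    s = np.linalg.svd(Mc, compute_uv=False)
    return float(s[0] ** 2 / np.sum(s ** 2))

# A fixed daily shape scaled by a slowly drifting daily level is close to rank-1.
hours = np.arange(24 * 30)
shape = 1.0 + 0.5 * np.sin(2 * np.pi * hours / 24)
level = 1.0 + 0.01 * (hours // 24)
r1 = centered_rank1_energy(shape * level, period=24)
assert r1 > 0.9
```

Series whose reshaped matrices score well below the reported 0.82 median would be exactly the cases the paper flags as outside the frozen-shape regime.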

Core claim

Periodic time series reshaped as a period-by-cycle matrix are approximately rank-1 (median centered rank-1 energy 0.82). In this regime the shape can be frozen from the average of the most recent cycles while only the scalar level per cycle is adjusted. Across all 97 GIFT-Eval configurations, eight more flexible estimators (Fourier, EWMA, James-Stein, rank-r SVD, etc.) produce no statistically significant improvement over this frozen baseline under Holm correction, and two are significantly worse. Extra flexibility therefore functions as estimation noise rather than signal.

What carries the argument

Rank-1 decomposition of the reshaped period matrix with the shape vector held fixed at the average of the last K cycles.
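A minimal sketch of that frozen-shape step (not the authors' implementation; the level forecast here is a naive carry-forward of the last cycle's mean, standing in for the paper's GCV-averaged ridge):

```python
import numpy as np

def frozen_shape_forecast(y, period, horizon, K=2):
    """Rank-1 forecast with a frozen shape: the shape is the average of the
    last K full cycles, normalized to within-cycle proportions; the level is
    a scalar per forecast cycle (here: last cycle's mean, a stand-in)."""
    y = np.asarray(y, dtype=float)
    n_cycles = len(y) // period
    M = y[len(y) - n_cycles * period:].reshape(n_cycles, period)
    shape = M[-K:].mean(axis=0)
    shape = shape / shape.mean()          # within-cycle proportions
    level = M[-1].mean()                  # naive stand-in level forecast
    reps = -(-horizon // period)          # ceiling division
    return (level * np.tile(shape, reps))[:horizon]

y = np.tile([1.0, 2.0, 3.0, 2.0], 50)     # perfectly periodic, period 4
fc = frozen_shape_forecast(y, period=4, horizon=8)
assert np.allclose(fc, np.tile([1.0, 2.0, 3.0, 2.0], 2))
```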

If this is right

  • The method matches the aggregate relMASE of PatchTST (0.838 versus 0.849) on GIFT-Eval.
  • Hourly models require only 28 stored scalars and weekly models 57 scalars.
  • A single CPU core finishes the entire benchmark in 22 minutes with no GPU or pre-training.
  • The procedure is fully closed-form: one SVD per candidate period length plus GCV-tuned ridge regression, with no per-task hyper-parameters.
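The GCV-tuned ridge component above can be illustrated with the closed-form Golub-Heath-Wahba criterion. This sketch picks the single best penalty on a hypothetical lag-1 level regression, whereas the paper averages over GCV weights; the design matrix and data are invented for illustration:

```python
import numpy as np

def gcv_ridge(X, y, lambdas=np.logspace(-4, 4, 33)):
    """Closed-form ridge with the penalty chosen by generalized
    cross-validation: GCV(l) = n * ||y - X b||^2 / (n - tr(H))^2.
    For simplicity the intercept column is penalized like any other."""
    n, p = X.shape
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    best, best_beta = np.inf, None
    for lam in lambdas:
        d = s / (s ** 2 + lam)                  # SVD shrinkage factors
        beta = Vt.T @ (d * Uty)
        df = np.sum(s ** 2 / (s ** 2 + lam))    # tr(H), effective dof
        resid = y - X @ beta
        gcv = n * (resid @ resid) / (n - df) ** 2
        if gcv < best:
            best, best_beta = gcv, beta
    return best_beta

# Levels following a near-deterministic unit-slope recursion are
# recovered by a lagged ridge fit with a small GCV-selected penalty.
rng = np.random.default_rng(0)
levels = np.cumsum(rng.normal(1.0, 0.1, 60))
X = np.column_stack([levels[:-1], np.ones(59)])  # hypothetical lag-1 design
beta = gcv_ridge(X, levels[1:])
assert abs(beta[0] - 1.0) < 0.2
```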

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Strongly periodic series with many observed cycles may systematically favor parsimonious models because added flexibility increases variance more than it reduces bias.
  • The same rank-1 view could be applied to other regular periodic signals such as traffic counts or retail sales to decide whether shape estimation is worth the cost.
  • Datasets whose reshaped matrices show substantially lower rank-1 energy would be natural places to test when learning the shape becomes beneficial.

Load-bearing premise

The 97 GIFT-Eval configurations and the observed median rank-1 energy of 0.82 represent the periodic forecasting tasks where the method would be applied.

What would settle it

A periodic dataset or additional GIFT-Eval configuration in which any of the eight tested alternatives (Fourier, EWMA, James-Stein, rank-r SVD, etc.) produces a statistically significant accuracy gain over the frozen two-cycle average after Holm correction.

Figures

Figures reproduced from arXiv: 2605.07222 by Takato Honda.

Figure 1. Don't learn the shape; learn the level. FLAIR factorizes a periodic series y as ŷ_h = L̂_⌈h/P⌉ · S_{h mod P}, with the shape S frozen as the last-K=2 within-period proportion average and the level L̂ forecast by GCV-averaged Ridge. Hourly P=24 fits in 28 scalars per series; aggregate relMASE on 97 GIFT-Eval configurations is 0.838, tying PatchTST. Each row: (a) raw series with 48-step forecast (red); (b) res…
Figure 2. Don't learn the shape: 8 shape-learning variants sorted by per-config…
Figure 3. Left: FLAIR is Pareto-optimal at the small-model frontier on GIFT-Eval (97 configurations, geometric-mean relMASE vs. parameter count; foundation models lead the aggregate but at 10^6-10^8× FLAIR's parameter count). The parameter axis mixes units: trained weights for deep-learning and foundation models, per-series estimate counts (P+p+1 for FLAIR, ~5 for AutoARIMA/AutoTheta/ETS) for the classical baselines; m…
Figure 4. FLAIR Pareto-dominates the reported CPU statistical baselines (AutoTheta, Prophet, AutoETS, AutoARIMA) on accuracy and wall-clock, at 0.9 s/config and P+p+1 scalars per series. X-axis: median per-config wall-clock; FLAIR/Prophet measured on one core of a 2024 MacBook Pro (M4 Pro); AutoTheta/AutoETS/AutoARIMA estimated from statsforecast per-series fit times at length buckets {200, 700, 2000}, scaled by nseri…
Figure 5. FLAIR operating regime over 93 GIFT-Eval configurations with centered r1 and Chronos-Bolt-Base scores. Color: log10 of the per-config relMASE ratio, FLAIR over Chronos-Bolt-Base (blue = FLAIR wins, red = Chronos-Bolt wins). Green zone: frozen-shape regime, drawn here at the lenient phase-diagram threshold r1 ≥ 0.6, nc ≥ 10 for visualization (the canonical routing rule used in Section 7 is the tighter r1 ≥ …
Figure 6. Distribution of the shifted-positive r1 (the quantity FLAIR's SVD operates on) across all 289,433 GIFT-Eval series (46 configurations): per-config median 0.99, with 78% of configs exceeding 0.9. The centered r1, which does not double-count the rank-1 contribution of the positivity shift, has a lower per-config median of 0.82 (Section 2). The joint (r1, nc) picture (the operating regime that actually matte…
Original abstract

How few parameters do we really need to forecast a periodic time series? An hourly electricity series, reshaped as a 24-row matrix with one column per day, is approximately rank-1: a daily shape modulated by a daily level (median centered rank-1 energy 0.82 on GIFT-Eval). Should we learn the shape? Smoothing, shrinkage, and low-rank fits all seem like obvious upgrades over the simple average of the last K=2 cycles. On all 97 GIFT-Eval configurations, we tested 8 such alternatives (e.g., Fourier, EWMA, James-Stein, rank-r SVD): none significantly beats the frozen baseline under Holm correction; two are significantly worse. The resulting method, FLAIR, is (a) Effective: matches PatchTST on aggregate GIFT-Eval (relMASE 0.838 vs 0.849); (b) Compact: 28 scalars for hourly, 57 for weekly; (c) Fast: 22 minutes on one CPU core of a MacBook Pro; (d) Closed-form & Hands-Off: one SVD per period candidate, GCV-averaged Ridge, no GPU, no pre-training, no per-task tuning. In the high-rank-1, many-cycle regime, extra flexibility is estimation noise.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript claims that periodic time series are often approximately rank-1 (median centered rank-1 energy 0.82 on GIFT-Eval), so that a simple frozen baseline averaging the last K=2 cycles suffices for forecasting. Across all 97 GIFT-Eval configurations, eight more flexible alternatives (Fourier, EWMA, James-Stein, rank-r SVD, etc.) yield no statistically significant improvement over this baseline under Holm correction, with two alternatives significantly worse. The resulting FLAIR method matches PatchTST aggregate performance (relMASE 0.838 vs. 0.849), uses only 28–57 scalars, runs in closed form on CPU, and requires no pre-training or per-task tuning. The authors conclude that, in the high-rank-1 many-cycle regime, extra flexibility is estimation noise.

Significance. If the empirical comparisons hold, the work provides concrete evidence that model complexity can be detrimental for periodic series with strong rank-1 structure, favoring compact closed-form methods over learned shapes. Strengths include the public benchmark, multiple-comparison correction, fully reproducible closed-form procedure, and explicit parameter count. The result could shift practice toward lightweight baselines in domains such as energy and traffic forecasting where many cycles are available.

major comments (1)
  1. [Experimental results on GIFT-Eval] The central regime-specific claim—that extra flexibility is estimation noise precisely in the high-rank-1, many-cycle regime—rests on aggregate results over the 97 GIFT-Eval configurations (median energy 0.82). No breakdown of performance gaps by per-task rank-1 energy or cycle count is reported, so it is unclear whether the “none significantly beats” finding is driven by the high-rank-1 subset or diluted by lower-energy tasks. This analysis is load-bearing for the generalization stated in the abstract and conclusion.
minor comments (2)
  1. [§3] The exact definition and centering procedure for “centered rank-1 energy” should be stated explicitly (currently referenced only by the median value 0.82).
  2. [Results section] Table or figure reporting the per-alternative p-values and Holm-adjusted thresholds would make the “none significantly beats” statement immediately verifiable.
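For readers checking the "none significantly beats" statement, Holm's step-down procedure over the eight alternatives works like this (a minimal sketch with illustrative p-values, not the paper's):

```python
def holm_reject(pvals, alpha=0.05):
    """Holm step-down: sort p-values ascending, compare the rank-i p-value
    to alpha / (m - i); stop at the first failure, rejecting all before it."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject

# With m=8 comparisons, even a raw p = 0.02 fails the first threshold 0.05/8.
assert holm_reject([0.02, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]) == [False] * 8
# A very small p-value survives: 0.001 <= 0.05/8.
assert holm_reject([0.001, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95])[0]
```

This is why the requested table of raw p-values and Holm-adjusted thresholds would make the claim immediately verifiable.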

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the strengths of our work, including the public benchmark, multiple-comparison correction, and closed-form procedure. We address the major comment below.

Point-by-point responses
  1. Referee: [Experimental results on GIFT-Eval] The central regime-specific claim—that extra flexibility is estimation noise precisely in the high-rank-1, many-cycle regime—rests on aggregate results over the 97 GIFT-Eval configurations (median energy 0.82). No breakdown of performance gaps by per-task rank-1 energy or cycle count is reported, so it is unclear whether the “none significantly beats” finding is driven by the high-rank-1 subset or diluted by lower-energy tasks. This analysis is load-bearing for the generalization stated in the abstract and conclusion.

    Authors: We agree that the lack of a stratified breakdown by per-task rank-1 energy and cycle count weakens the support for the regime-specific claim in the abstract and conclusion. Although the median rank-1 energy of 0.82 indicates that most configurations lie in the high-rank-1 regime, aggregate statistics alone cannot confirm that the absence of significant gains from flexible methods is concentrated in that subset rather than diluted by lower-energy tasks. In the revised manuscript we will add the requested analysis: subgroup results and statistical tests restricted to tasks with rank-1 energy above the median (and above 0.75), together with breakdowns by number of available cycles. We will also include scatter plots of relMASE difference versus rank-1 energy and versus cycle count. These additions will make the generalization explicit and address the load-bearing concern. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical claims rest on external benchmark

full rationale

The paper advances no mathematical derivation chain that reduces to its own inputs by construction. Its core claim—that none of 8 alternatives (Fourier, EWMA, James-Stein, rank-r SVD, etc.) significantly outperforms the frozen K=2 baseline on 97 GIFT-Eval configurations under Holm correction—is a direct empirical result on an external benchmark, not a fitted parameter renamed as prediction or a self-definitional loop. The reported median centered rank-1 energy of 0.82 is a descriptive statistic computed from the data, not used to define the method or force the conclusion. FLAIR itself is presented as a closed-form procedure (one SVD per period candidate + GCV-averaged Ridge) whose performance is validated by comparison to PatchTST rather than by internal tautology. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling appear in the provided text. The skeptic concern about GIFT-Eval representativeness is a question of external validity, not circularity.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The recommendation against learning the shape rests on the domain assumption that the reshaped matrix is approximately rank-1 and that additional estimation parameters introduce more noise than signal in the many-cycle regime.

free parameters (1)
  • K (number of cycles for baseline)
    Chosen as 2 for the frozen baseline; not data-fitted per task.
axioms (1)
  • domain assumption A periodic time series, when reshaped into a period-by-cycle matrix, is approximately rank-1
    Empirically supported by median centered rank-1 energy of 0.82 on GIFT-Eval

pith-pipeline@v0.9.0 · 5530 in / 1335 out tokens · 53590 ms · 2026-05-11T02:18:15.366357+00:00 · methodology

