Training-Free Probabilistic Time-Series Forecasting with Conformal Seasonal Pools
Pith reviewed 2026-05-09 15:35 UTC · model grok-4.3
The pith
A training-free method using conformal seasonal pools outperforms the deep learning forecaster DeepNPTS on accuracy, calibration, and speed, with no learned parameters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Conformal Seasonal Pools (CSP) is a training-free probabilistic time-series forecaster that mixes same-season empirical draws with signed residual draws around a seasonal naive forecast. In an audited rolling-origin benchmark on the six time-series datasets where DeepNPTS was originally evaluated, CSP-Adaptive significantly outperforms DeepNPTS on every metric reported, including CRPS, normalized mean quantile loss, and empirical 95% coverage (mean 0.89 versus 0.66), while running over 500 times faster on CPU. The paper notes that DeepNPTS coverage failures are most severe in the worst 10% of windows, where the prediction interval covers none of the horizons in the multi-step forecast, which poses risks in safety-critical uses.
What carries the argument
Conformal Seasonal Pools, a sampling procedure that draws from same-season historical observations combined with signed residuals around a seasonal naive point forecast to generate calibrated probabilistic predictions and intervals.
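As a rough illustration of the sampling idea described above, the following Python sketch draws forecast paths by mixing the two pools. It is not the authors' implementation: the function names, the mixing probability `mix`, the choice to pool signed residuals across all seasonal phases, and the independent per-step draws are assumptions made for illustration, and the conformal quantile-selection step is left out.

```python
import random


def seasonal_naive(history, season, horizon):
    """Seasonal naive point forecast: repeat the last observed season forward."""
    n = len(history)
    return [history[n - season + (h % season)] for h in range(horizon)]


def csp_sample_paths(history, season, horizon, n_samples=200, mix=0.5, seed=0):
    """Draw sample paths by mixing two pools at every forecast step.

    `mix` (hypothetical parameter) is the probability of taking a
    same-season empirical draw instead of a signed residual draw.
    """
    n = len(history)
    if n <= season:
        raise ValueError("need more than one season of history")
    rng = random.Random(seed)
    point = seasonal_naive(history, season, horizon)
    # Signed residuals of the one-season-lag naive forecast over the history.
    # Assumption: residuals from all seasonal phases share one pool.
    residuals = [history[t] - history[t - season] for t in range(season, n)]
    paths = []
    for _ in range(n_samples):
        path = []
        for h in range(horizon):
            phase = (n + h) % season
            if rng.random() < mix:
                # Same-season empirical draw: a past value at the same phase.
                path.append(rng.choice(history[phase::season]))
            else:
                # Signed residual draw around the seasonal naive forecast.
                path.append(point[h] + rng.choice(residuals))
        paths.append(path)
    return paths
```

Per-horizon prediction intervals would then be read off as empirical quantiles of the sampled values at each step; how CSP-Adaptive selects those quantiles is not specified in the material summarized here.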
If this is right
- CSP produces prediction intervals whose empirical coverage (mean 0.89) is much closer to the nominal 95 percent level than that of the trained deep model DeepNPTS (mean 0.66).
- No training step is needed, so forecasts can be generated immediately with far lower computational cost.
- Deep learning forecasters can fail to cover the truth across entire multi-step trajectories in many windows.
- Training-free conformal methods should serve as mandatory baselines when new non-parametric forecasters are evaluated.
Where Pith is reading between the lines
- The strong performance on seasonal data suggests the pooling idea may extend naturally to other domains with repeating cycles such as retail demand or environmental monitoring.
- If exchangeability holds more broadly, the results indicate that added model complexity does not automatically improve calibration in time-series uncertainty quantification.
- The speed advantage could allow CSP to be used inside larger ensembles or updated more frequently in operational settings.
Load-bearing premise
The rolling-origin evaluation on the six datasets is free of data leakage and the time-series observations satisfy the exchangeability conditions required for conformal prediction to deliver the stated coverage.
What would settle it
Re-running the rolling-origin experiments on the same six datasets and finding that CSP-Adaptive no longer produces statistically significant coverage improvements over DeepNPTS or that its empirical coverage drops well below the nominal level.
Original abstract
We propose Conformal Seasonal Pools (CSP), a training-free probabilistic time-series forecaster that mixes same-season empirical draws with signed residual draws around a seasonal naive forecast. In an audited rolling-origin benchmark on the six time-series datasets where DeepNPTS was originally evaluated (electricity, exchange_rate, solar_energy, taxi, traffic, wikipedia), CSP-Adaptive significantly outperforms DeepNPTS on every metric we report -- CRPS (per-window paired Wilcoxon $p \approx 4 \times 10^{-10}$), normalized mean quantile loss ($p \approx 7 \times 10^{-10}$), and empirical 95% coverage ($p \approx 8 \times 10^{-45}$, mean 0.89 vs 0.66) -- while running over 500x faster on CPU. Coverage is the most decision-critical of these: a 0.95 nominal interval that contains the truth in only ~66% of cases fails the basic calibration desideratum and would not survive deployment in safety- or decision-critical settings. The failure mode is also more severe than aggregate coverage suggests: in the worst 10% of windows, DeepNPTS's prediction interval covers none of the H forecast horizons -- the entire multi-step trajectory misses the truth at every step simultaneously. This poses serious risk in safety- and decision-critical applications such as healthcare, finance, energy operations, and autonomous systems, where prediction intervals that systematically miss the truth across the entire planning horizon translate directly into misclassified patients, regulatory capital failures, grid imbalances, and safety-case violations. CSP achieves all of this with no learned parameters and no training. We argue training-free conformal samplers should be mandatory baselines when evaluating learned non-parametric forecasters.
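To make the two coverage quantities in the abstract concrete, here is a minimal Python sketch of how mean empirical coverage and the share of windows with no covered horizon could be computed. The function names and the input layout (one `(lower, upper, truth)` triple per forecast window) are hypothetical, and the paper's exact aggregation may differ.

```python
def window_coverage(lower, upper, truth):
    """Fraction of the H horizons whose true value lies inside [lower, upper]."""
    hits = [lo <= y <= hi for lo, hi, y in zip(lower, upper, truth)]
    return sum(hits) / len(hits)


def summarize_coverage(windows):
    """Return (mean per-window coverage, fraction of windows with zero coverage).

    `windows` is a list of (lower, upper, truth) triples, one per window,
    each element being a list of length H.
    """
    per_window = [window_coverage(lo, hi, y) for lo, hi, y in windows]
    mean_cov = sum(per_window) / len(per_window)
    zero_frac = sum(c == 0.0 for c in per_window) / len(per_window)
    return mean_cov, zero_frac
```

A window counted in `zero_frac` is one where the interval misses the truth at every step of the horizon, which is the failure mode the abstract singles out.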
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Conformal Seasonal Pools (CSP), a training-free probabilistic time-series forecaster that mixes same-season empirical draws with signed residual draws around a seasonal naive forecast and applies conformal quantile selection. In a rolling-origin benchmark on the six datasets previously used for DeepNPTS (electricity, exchange_rate, solar_energy, taxi, traffic, wikipedia), the adaptive variant CSP-Adaptive is reported to outperform DeepNPTS on CRPS (paired Wilcoxon p ≈ 4 × 10^{-10}), normalized mean quantile loss (p ≈ 7 × 10^{-10}), and empirical 95% coverage (p ≈ 8 × 10^{-45}, mean 0.89 vs. 0.66) while running over 500× faster on CPU. The paper positions training-free conformal methods as mandatory baselines for evaluating learned forecasters and stresses the practical risks of poor coverage in safety-critical domains.
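For readers unfamiliar with CRPS on sample-based forecasts, a standard estimator is the mean absolute error of the samples minus half their mean pairwise absolute difference. The Python sketch below implements that estimator for a single forecast step; the function name is hypothetical, and the manuscript may compute or normalize CRPS differently.

```python
def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    n = len(samples)
    # Average distance between the samples and the observed value.
    term_obs = sum(abs(x - y) for x in samples) / n
    # Half the average distance between all pairs of samples (spread term).
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return term_obs - term_spread
```

With a single sample the estimate reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.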
Significance. If the empirical superiority and coverage results are robust to the exchangeability concerns, the work is significant as a simple, parameter-free baseline that exposes calibration failures in learned methods and runs orders of magnitude faster. The emphasis on coverage as a decision-critical metric and the use of paired statistical tests across multiple datasets and horizons are strengths; the absence of any learned parameters or training is a clear methodological advantage that could shift evaluation standards in the field.
major comments (2)
- [Abstract and method description] The coverage superiority claim rests on the assumption that conformal quantile selection delivers valid finite-sample coverage. However, the signed seasonal residuals are unlikely to be exchangeable due to residual serial correlation in real time-series data (electricity, traffic, etc.), which seasonal adjustment does not eliminate. This directly bears on the reported empirical coverage of 0.89 (vs. nominal 0.95) and the p-value comparison; the manuscript must explicitly address whether the guarantee holds or is only heuristic.
- [Experiments section] The rolling-origin benchmark description provides no details on data preprocessing steps, the precise definition and implementation of the 'adaptive' variant, or verification that no post-hoc window exclusions or leakage occurred. These omissions are load-bearing for the reproducibility of the CRPS, quantile loss, and coverage results and the Wilcoxon tests.
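One simple way to probe the exchangeability concern in the first major comment is to measure serial correlation in the signed seasonal residuals. The Python sketch below is an assumed diagnostic, not something taken from the manuscript; the function names are hypothetical. Autocorrelation that is clearly nonzero and varies with lag indicates that ordering matters, which exchangeability rules out.

```python
def seasonal_residuals(series, season):
    """Signed residuals of the seasonal naive forecast: y[t] - y[t - season]."""
    return [series[t] - series[t - season] for t in range(season, len(series))]


def autocorr(values, lag):
    """Sample autocorrelation at a given lag (0.0 for a constant series)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0
    cov = sum((values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n))
    return cov / var
```

Reporting `autocorr(seasonal_residuals(y, m), lag)` for a few lags on each dataset, alongside empirical coverage, would show how far each series departs from the idealized setting.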
minor comments (2)
- [Abstract] The abstract states 'audited rolling-origin benchmark' without defining the audit criteria or providing the exact hyperparameter choices for CSP-Adaptive; this should be clarified for readers.
- [Method] Notation for the mixing of empirical draws and signed residuals could be made more precise with an equation showing the construction of the calibration scores.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments highlight important points on theoretical assumptions and experimental reproducibility that we address below. We have prepared revisions to clarify the heuristic nature of the coverage guarantees and to expand the experimental details for full reproducibility.
- Referee: [Abstract and method description] The coverage superiority claim rests on the assumption that conformal quantile selection delivers valid finite-sample coverage. However, the signed seasonal residuals are unlikely to be exchangeable due to residual serial correlation in real time-series data (electricity, traffic, etc.), which seasonal adjustment does not eliminate. This directly bears on the reported empirical coverage of 0.89 (vs. nominal 0.95) and the p-value comparison; the manuscript must explicitly address whether the guarantee holds or is only heuristic.
Authors: We agree that the finite-sample coverage guarantee of conformal prediction requires exchangeability, which is unlikely to hold exactly for signed seasonal residuals in the presence of residual serial correlation, even after seasonal adjustment. In the revised manuscript we will add an explicit subsection stating that the coverage guarantee is heuristic under temporal dependence rather than strictly valid. We will include supporting analysis of empirical coverage as a function of autocorrelation strength across the datasets and will frame the reported superiority (including the Wilcoxon tests) as an empirical result. This revision will accurately qualify the theoretical claim while retaining the practical demonstration that CSP-Adaptive achieves substantially better calibration than the learned baseline. revision: yes
- Referee: [Experiments section] The rolling-origin benchmark description provides no details on data preprocessing steps, the precise definition and implementation of the 'adaptive' variant, or verification that no post-hoc window exclusions or leakage occurred. These omissions are load-bearing for the reproducibility of the CRPS, quantile loss, and coverage results and the Wilcoxon tests.
Authors: We acknowledge that the current experimental description lacks sufficient detail. In the revision we will expand the Experiments section with: (i) complete preprocessing pipelines for each of the six datasets, including normalization, missing-value handling, and seasonal decomposition; (ii) a precise algorithmic definition of CSP-Adaptive, specifying the adaptation rule (dynamic pool-size selection based on recent calibration performance without any learned parameters or training); and (iii) explicit verification that the rolling-origin procedure uses only past data, with no post-hoc window exclusions or leakage. We will also release the full implementation code upon acceptance to allow independent verification of the reported metrics and paired statistical tests. revision: yes
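The leakage requirement in point (iii) can be illustrated with a minimal Python sketch of rolling-origin splits in which every forecast window is evaluated using only indices that precede it. The function name, the `step` parameter, and the choice of non-overlapping windows by default are assumptions for illustration, not the authors' protocol.

```python
def rolling_origin_splits(n, horizon, n_windows, step=None):
    """Return a list of (train_end, test_indices) pairs, oldest window first.

    Training data for a window is series[:train_end]; the test indices start
    at train_end, so no future observation is visible to the forecaster.
    """
    step = horizon if step is None else step
    splits = []
    for k in range(n_windows):
        train_end = n - horizon - (n_windows - 1 - k) * step
        if train_end <= 0:
            raise ValueError("series too short for the requested windows")
        splits.append((train_end, list(range(train_end, train_end + horizon))))
    return splits
```

For example, `rolling_origin_splits(20, 4, 3)` gives forecast origins at indices 8, 12, and 16. Publishing the full list of origins alongside the results would also make any post-hoc window exclusion visible.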
Circularity Check
No circularity in derivation or claims
full rationale
The paper proposes CSP as a training-free algorithm that mixes seasonal empirical draws with signed residuals from a naive forecast and applies standard conformal quantile selection. All reported performance claims (CRPS, quantile loss, coverage) are obtained from direct rolling-origin evaluation on six external public datasets against the independent baseline DeepNPTS, with statistical tests on those results. No equation, parameter, or 'prediction' in the abstract or described method is defined in terms of itself or fitted to the target metric; the construction uses off-the-shelf conformal machinery without self-referential reduction. The central empirical superiority therefore rests on observable benchmark outcomes rather than tautological re-expression of inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Time series exhibit sufficiently stable seasonal patterns to allow reuse of same-season historical observations for sampling.
- domain assumption: Residuals around the seasonal naive forecast satisfy the approximate exchangeability required for conformal coverage guarantees.