Leave a Window Out: Modifying the Jackknife for Predictive Inference in Time Series
Pith reviewed 2026-06-29 05:22 UTC · model grok-4.3
The pith
The vanilla jackknife can lose coverage guarantees in time series data even with mild dependence, but a leave-a-window-out modification restores valid coverage when the predictor is stable.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The vanilla leave-one-out jackknife can suffer arbitrary loss of coverage even in canonical time series models with mild temporal dependence. As a remedy, the leave-a-window-out (LWO) method achieves valid coverage provided that the model-fitting procedure satisfies mild stability properties. The proofs quantify the degree to which the data departs from cyclic exchangeability using newly introduced coefficients.
What carries the argument
The leave-a-window-out (LWO) jackknife, which modifies the standard jackknife by excluding a contiguous window of observations to account for temporal dependence while preserving coverage under stability.
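As a rough illustration of the idea, the sketch below refits a simple predictor with a contiguous window of observations around each index removed, collects the held-out absolute residuals, and uses their quantile as the interval half-width. It is a minimal sketch under stated assumptions, not the paper's algorithm: the helper names `fit_ar1` and `lwo_interval`, the symmetric window of half-width `window`, the least-squares AR(1) predictor, and the quantile level are all hypothetical choices made for this example.

```python
import numpy as np

def fit_ar1(x, y):
    """Least-squares slope for y ~ phi * x (hypothetical stand-in predictor)."""
    denom = float(np.dot(x, x))
    return float(np.dot(x, y)) / denom if denom > 0 else 0.0

def lwo_interval(x, y, x_new, window=5, alpha=0.1):
    """Leave-a-window-out residuals turned into a symmetric interval at x_new.

    Assumption: for each i the model is refit without the indices
    i-window..i+window; the paper's exact windowing rule may differ.
    """
    n = len(y)
    idx = np.arange(n)
    residuals = np.empty(n)
    for i in range(n):
        keep = np.abs(idx - i) > window          # drop a contiguous window around i
        phi_i = fit_ar1(x[keep], y[keep])        # refit without that window
        residuals[i] = abs(y[i] - phi_i * x[i])  # held-out absolute residual
    # Jackknife-style quantile level, capped at 1 (an assumed convention).
    level = min(1.0, float(np.ceil((1 - alpha) * (n + 1))) / n)
    q = float(np.quantile(residuals, level))
    center = fit_ar1(x, y) * x_new               # prediction from the full fit
    return center - q, center + q

# Simulated AR(1) series, used only to exercise the sketch.
rng = np.random.default_rng(0)
series = np.zeros(201)
for t in range(1, 201):
    series[t] = 0.8 * series[t - 1] + rng.normal()
x, y = series[:-1], series[1:]
lo, hi = lwo_interval(x, y, x_new=series[-1], window=5, alpha=0.1)
```

In this sketch, setting `window=0` removes only index i itself, which reduces the procedure to the leave-one-out jackknife that the paper argues can fail under temporal dependence.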
Load-bearing premise
The model-fitting procedure satisfies mild stability properties, meaning the predictor changes only modestly when a small window of training data is altered.
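One informal way to probe this premise for a given predictor is to measure how far its prediction moves when each contiguous window of training points is dropped. The sketch below does that for a least-squares slope model. It is an illustrative diagnostic only: `fit_slope` and `window_sensitivity` are hypothetical names, and the quantity computed is a proxy, not the formal stability condition used in the paper's proofs.

```python
import numpy as np

def fit_slope(x, y):
    """Least-squares slope for y ~ phi * x (hypothetical stand-in predictor)."""
    denom = float(np.dot(x, x))
    return float(np.dot(x, y)) / denom if denom > 0 else 0.0

def window_sensitivity(x, y, x_eval, window=5):
    """Largest change in the prediction at x_eval when one contiguous
    window of training points is dropped (an informal stability proxy)."""
    n = len(y)
    idx = np.arange(n)
    full_pred = fit_slope(x, y) * x_eval
    worst = 0.0
    for i in range(n):
        keep = np.abs(idx - i) > window          # remove the window around i
        pred_i = fit_slope(x[keep], y[keep]) * x_eval
        worst = max(worst, abs(pred_i - full_pred))
    return worst

# Noise-free linear data: every refit recovers the same slope,
# so the sensitivity is zero up to floating-point rounding.
x_demo = np.arange(1.0, 51.0)
y_demo = 2.0 * x_demo
sens = window_sensitivity(x_demo, y_demo, x_eval=1.0, window=3)
```

A small value relative to the typical residual size is consistent with the stability premise for that predictor and dataset, but it does not establish the premise in the formal sense.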
What would settle it
A time series dataset paired with a demonstrably stable predictor where the empirical coverage of the LWO intervals falls below the nominal level would falsify the coverage claim.
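Such a check comes down to comparing empirical coverage on held-out points against the nominal level. The following is a minimal sketch of that comparison. The function name `empirical_coverage`, the nominal level `alpha = 0.1`, and the toy intervals are assumptions made for illustration and are not taken from the paper's experiments.

```python
import numpy as np

def empirical_coverage(lower, upper, y_true):
    """Fraction of held-out targets that fall inside their intervals."""
    lower, upper, y_true = map(np.asarray, (lower, upper, y_true))
    return float(np.mean((lower <= y_true) & (y_true <= upper)))

alpha = 0.1                                   # nominal miscoverage (assumed)
cov = empirical_coverage([0.0, 0.0, 0.0],     # toy interval lower ends
                         [1.0, 1.0, 1.0],     # toy interval upper ends
                         [0.5, 2.0, 1.0])     # toy held-out targets
shortfall = (1 - alpha) - cov                 # positive means under-coverage
```

A shortfall on a handful of points is not informative by itself. A meaningful test would use many test points and, because the points are temporally dependent, should not rely on an i.i.d. binomial standard error to judge the gap.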
Original abstract
Conformal prediction methods enjoy strong theoretical and empirical predictive inference performance, provided the data is exchangeable and is treated symmetrically during training. However, these assumptions are impractical in many settings, such as time series, where temporal dependence violates exchangeability and it is preferable to use predictors that leverage dependence by treating data asymmetrically. Recent work shows that split conformal prediction is robust to these issues, but sample splitting can reduce accuracy, motivating the study of methods that do not rely on data splitting in the time series setting. In this work, we show that the vanilla leave-one-out jackknife can suffer arbitrary loss of coverage even in canonical time series models with mild temporal dependence. As a remedy, we propose a modification tailored to such settings, which we term the leave-a-window-out (LWO) method, and show that it can achieve valid coverage provided that the model-fitting procedure satisfies mild stability properties. Our proofs are based on quantifying the degree to which the data departs from cyclic exchangeability, which we introduce new coefficients to measure. Experiments on time series demonstrate that our method often enjoys valid coverage when the vanilla jackknife fails to cover, while producing much narrower intervals than split conformal prediction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that the vanilla leave-one-out jackknife can suffer arbitrary loss of coverage even in canonical time series models with mild temporal dependence. As a remedy, it proposes the leave-a-window-out (LWO) method and shows that it achieves valid coverage provided the model-fitting procedure satisfies mild stability properties. The proofs rely on new coefficients quantifying departure from cyclic exchangeability. Experiments on time series data indicate that LWO often attains valid coverage where the jackknife fails while yielding narrower intervals than split conformal prediction.
Significance. If the stability conditions are verified to hold for the predictors employed and the coverage bounds are rigorously derived, the work would meaningfully extend conformal prediction techniques to dependent data without requiring sample splitting, addressing a practical limitation in time-series settings.
major comments (1)
- [theoretical results and experiments section] The central coverage guarantee for LWO is obtained by converting the new cyclic-exchangeability coefficients into a bound only when the model-fitting procedure satisfies the mild stability properties. The manuscript provides no verification that this stability holds for the concrete predictors (e.g., AR models or neural networks) used in the reported time-series experiments; without such verification the conversion step fails even if the coefficients themselves are small.
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and for highlighting the connection between the theoretical conditions and the experimental results. We address the major comment below.
- Referee: [theoretical results and experiments section] The central coverage guarantee for LWO is obtained by converting the new cyclic-exchangeability coefficients into a bound only when the model-fitting procedure satisfies the mild stability properties. The manuscript provides no verification that this stability holds for the concrete predictors (e.g., AR models or neural networks) used in the reported time-series experiments; without such verification the conversion step fails even if the coefficients themselves are small.
Authors: We agree that the coverage guarantee is obtained only when both the cyclic-exchangeability coefficients are controlled and the model-fitting procedure satisfies the stated stability properties. The manuscript introduces stability as a sufficient condition for the bound but does not provide empirical verification that this condition holds for the specific AR models or neural networks used in the experiments. The experiments instead report empirical coverage and interval widths to illustrate practical behavior. We will revise the manuscript to explicitly distinguish the theoretical guarantee from the empirical results and to note that verifying stability for concrete predictors remains an important direction for future work.
revision: yes
Circularity Check
No circularity; coverage proof relies on external stability premise and new coefficients
The derivation introduces new coefficients to quantify departure from cyclic exchangeability and proves LWO coverage by converting those coefficients under the stated premise that the fitting procedure satisfies mild stability. This premise is an assumption external to the result rather than a quantity fitted or defined inside the paper's equations. No load-bearing step reduces by construction to a self-citation, a renamed known result, or an input that is statistically forced; the central claim therefore remains independent of the paper's own fitted quantities or prior self-referential theorems.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The model-fitting procedure satisfies mild stability properties.
invented entities (1)
- Coefficients measuring departure from cyclic exchangeability (no independent evidence)
Forward citations
Cited by 1 Pith paper
- Sequential statistical inference for Large Language Models: Representation, validity, and monitoring
Argues for modeling LLM interactions as dependent stochastic processes to enable valid sequential uncertainty quantification and change-point monitoring for trustworthiness properties.