Improving Power by Conditioning on Less in Post-selection Inference for Changepoints
Pith reviewed 2026-05-24 10:22 UTC · model grok-4.3
The pith
Conditioning on less information yields more powerful valid p-values for changepoint detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By deliberately conditioning on less of the information that determined which changepoints were selected, one obtains an ideal selective p-value that cannot be evaluated in closed form but admits an unbiased Monte Carlo estimator. This estimator is valid for any finite number of perturbations, empirically yields higher power, and is easy to implement: re-apply any existing post-selection method to each perturbed data set.
What carries the argument
The Monte Carlo approximation to the ideal selective p-value, obtained by generating perturbations of the data set and re-applying the post-selection inference procedure to each one.
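A minimal sketch of that loop, assuming the replicates are combined through a rank-based Monte Carlo p-value (the form the referee summary attributes to the manuscript; the paper's exact construction may differ). The names `rank_p_value`, `monte_carlo_p_value`, `half_mean_gap`, and `shuffle_copy` are hypothetical. The `statistic` argument stands in for re-running the full post-selection procedure, and the shuffle is only a placeholder perturbation, not the paper's kernel.

```python
import random
from typing import Callable, List, Sequence


def rank_p_value(t_obs: float, t_reps: Sequence[float]) -> float:
    """Rank-based Monte Carlo p-value: (1 + #{T_b >= T_obs}) / (B + 1)."""
    exceed = sum(1 for t in t_reps if t >= t_obs)
    return (1 + exceed) / (len(t_reps) + 1)


def monte_carlo_p_value(
    data: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    perturb: Callable[[Sequence[float], random.Random], List[float]],
    n_perturbations: int,
    rng: random.Random,
) -> float:
    """Re-apply `statistic` to each perturbed copy of `data` and rank the result."""
    t_obs = statistic(data)
    t_reps = [statistic(perturb(data, rng)) for _ in range(n_perturbations)]
    return rank_p_value(t_obs, t_reps)


# Hypothetical stand-ins, NOT the paper's test statistic or perturbation kernel.
def half_mean_gap(x: Sequence[float]) -> float:
    """Absolute difference between the means of the two halves of the series."""
    mid = len(x) // 2
    return abs(sum(x[:mid]) / mid - sum(x[mid:]) / (len(x) - mid))


def shuffle_copy(x: Sequence[float], rng: random.Random) -> List[float]:
    """Return a shuffled copy; exchangeable with the data only if it is i.i.d."""
    y = list(x)
    rng.shuffle(y)
    return y


rng = random.Random(0)
series = [0.0] * 20 + [1.0] * 20  # toy series with a mean shift at t = 20
p = monte_carlo_p_value(series, half_mean_gap, shuffle_copy, 19, rng)
```

With B = 19 perturbations the smallest attainable p-value is 1/20 = 0.05, which illustrates why the number of perturbations bounds the resolution of the p-value even when validity holds for every B.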
If this is right
- The Monte Carlo p-values are valid for any finite Monte Carlo sample size (number of perturbations).
- Substantial power gains occur even with very small numbers of Monte Carlo perturbations.
- Implementation requires only generating perturbations and re-running the existing post-selection routine on each.
- Application to genomic GC-content data increases the number of significant changepoints from 17 to 27.
Where Pith is reading between the lines
- The same reduction in conditioning information could be explored in other post-selection settings that currently condition on the full selection event.
- Because validity holds for any Monte Carlo sample size, users can trade computational budget directly against power without recalibrating thresholds.
- If the underlying changepoint detector is computationally cheap, the Monte Carlo overhead remains modest even for genome-scale sequences.
Load-bearing premise
Perturbations of the original data set generate samples whose distribution matches the conditional law required by the ideal selective p-value.
What would settle it
In repeated simulations under the global null of no changepoints, the fraction of Monte Carlo p-values falling below a nominal level alpha exceeds alpha by more than binomial sampling variability.
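One way to make "more than binomial sampling variability" concrete is an exact one-sided binomial check on the rejection count. The sketch below assumes the null p-values come from independent simulation runs. The function names and the `tolerance` cut-off are illustrative choices, not something specified by the paper.

```python
from math import exp, lgamma, log, log1p
from typing import Sequence, Tuple


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    total = 0.0
    for i in range(max(k, 0), n + 1):
        log_pmf = (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log1p(-p)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


def check_type_one_error(
    p_values: Sequence[float], alpha: float, tolerance: float = 0.01
) -> Tuple[float, float, bool]:
    """Return (rejection rate, binomial tail probability, flagged).

    `flagged` is True when observing this many rejections would be very
    unlikely (tail probability below `tolerance`) if the true rejection
    probability were exactly alpha.
    """
    n = len(p_values)
    rejections = sum(1 for p in p_values if p <= alpha)
    tail = binomial_upper_tail(n, rejections, alpha)
    return rejections / n, tail, tail < tolerance
```

For example, 50 rejections in 1000 null simulations at alpha = 0.05 would not be flagged, whereas 100 rejections would be.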
Original abstract
Post-selection inference has recently been proposed as a way of quantifying uncertainty about detected changepoints. The idea is to run a changepoint detection algorithm, and then re-use the same data to perform a test for a change near each of the detected changes. By defining the p-value for the test appropriately, so that it is conditional on the information used to choose the test, this approach will produce valid p-values. We show how to improve the power of these procedures by conditioning on less information. This gives rise to an ideal selective p-value that is intractable but can be approximated by Monte Carlo. We show that for any Monte Carlo sample size, this procedure produces valid p-values, and empirically that noticeable increase in power is possible with only very modest Monte Carlo sample sizes. Our procedure is easy to implement given existing post-selection inference methods, as we just need to generate perturbations of the data set and re-apply the post-selection method to each of these. On genomic data consisting of human GC content, our procedure increases the number of significant changepoints that are detected from e.g. 17 to 27, when compared to existing methods.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an improvement to post-selection inference for changepoint detection by conditioning on less information than standard selective p-values. This yields an ideal (intractable) selective p-value that is approximated via Monte Carlo by generating B perturbations of the observed data, re-running the full changepoint procedure on each, and constructing a rank-based p-value from the resulting test statistics. The central claims are that the Monte Carlo procedure produces exactly valid selective p-values for any finite B and that modest B suffices for noticeable power gains, as illustrated on human GC-content genomic data where the number of detected significant changepoints rises from 17 to 27.
Significance. If the validity argument holds, the approach would be a practical and low-cost extension of existing post-selection methods for changepoints, requiring only data perturbations and re-application of the base procedure. The empirical demonstration of power improvement with small B is a concrete strength, and the method's modularity (building directly on prior selective-inference code) aids reproducibility. However, the significance is tempered by the need to confirm that the perturbation mechanism exactly reproduces the conditional null distribution given selection.
major comments (2)
- [Monte Carlo approximation procedure (likely §3–4)] The validity claim for any Monte Carlo size B rests on the perturbed replicates being exchangeable with the observed statistic under the null conditional on the selection event. The manuscript implements this via data perturbations followed by re-running the changepoint algorithm, but does not explicitly verify or prove that the chosen perturbation distribution coincides with the law of the data under the null given selection (particularly when the selection event is discrete). If this match fails, the rank statistic is no longer uniform and type-I error control is lost even though the algebraic form of the p-value is preserved.
- [Validity theorem / proof of Monte Carlo validity] The abstract and introduction assert that the procedure 'produces valid p-values' for any B. The supporting argument should be located in the section deriving the selective p-value; if it relies on an implicit assumption that perturbations are drawn from the exact conditional distribution, this needs to be stated as a theorem with the precise conditions on the perturbation kernel.
minor comments (2)
- [Implementation details] Clarify the exact form of the perturbation distribution (e.g., additive Gaussian noise with what variance?) and whether it is the same for all candidate changepoint locations.
- [Empirical results / genomic data example] In the genomic application, report the specific changepoint detection algorithm, the definition of the selection event, and the value of B used to obtain the increase from 17 to 27 significant changepoints.
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for identifying the need to make the validity argument for the Monte Carlo procedure more explicit. We address each major comment below and will revise the manuscript to strengthen the presentation of the supporting theorem.
- Referee: The validity claim for any Monte Carlo size B rests on the perturbed replicates being exchangeable with the observed statistic under the null conditional on the selection event. The manuscript implements this via data perturbations followed by re-running the changepoint algorithm, but does not explicitly verify or prove that the chosen perturbation distribution coincides with the law of the data under the null given selection (particularly when the selection event is discrete). If this match fails, the rank statistic is no longer uniform and type-I error control is lost even though the algebraic form of the p-value is preserved.
  Authors: We agree that an explicit statement of the conditions is warranted. The perturbation mechanism is constructed so that, conditional on the selection event, the observed statistic and the B perturbed statistics are exchangeable under the null (by drawing perturbations from the same conditional law that defines the selective p-value). This ensures the rank-based p-value is exactly uniform for any finite B, analogous to a Monte Carlo test. However, the current text leaves the precise matching of the perturbation kernel to the conditional distribution implicit, especially for discrete selection events. We will add a formal theorem in the section deriving the Monte Carlo approximation that states the required conditions on the perturbation kernel for exchangeability to hold.
  Revision: yes
- Referee: The abstract and introduction assert that the procedure 'produces valid p-values' for any B. The supporting argument should be located in the section deriving the selective p-value; if it relies on an implicit assumption that perturbations are drawn from the exact conditional distribution, this needs to be stated as a theorem with the precise conditions on the perturbation kernel.
  Authors: The validity claim for any B follows directly from the exchangeability of the observed and perturbed statistics under the conditional null, which is ensured by the choice of perturbation kernel matching the law of the data given selection. We will revise the manuscript to include an explicit theorem (with the precise conditions on the kernel) in the section on the selective p-value derivation, rather than leaving the argument implicit. This will make the finite-B validity self-contained and address the concern about discrete selection events.
  Revision: yes
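The exchangeability argument invoked in both responses can be written out as the standard Monte Carlo test bound. This is a sketch of the generic argument under the assumption that the p-value is rank-based, as the referee describes; the symbols $T_0$, $T_b$, $S$, and $\hat{p}_B$ are notation introduced here, and the theorem the authors promise may be stated differently.

```latex
% T_0: observed statistic; T_1,\dots,T_B: statistics from the B perturbed data sets;
% S: the selection event being conditioned on (notation assumed for this sketch).
\[
  \hat{p}_B \;=\; \frac{1 + \sum_{b=1}^{B} \mathbf{1}\{T_b \ge T_0\}}{B + 1}.
\]
% If (T_0, T_1, \dots, T_B) are exchangeable under H_0 conditional on S, then
% \hat{p}_B \le \alpha forces T_0 to be among the \lfloor \alpha (B+1) \rfloor largest
% of the B + 1 values, and each position is equally likely, so for any \alpha \in [0, 1]:
\[
  \Pr\nolimits_{H_0}\!\left( \hat{p}_B \le \alpha \,\middle|\, S \right)
  \;\le\; \frac{\lfloor \alpha (B + 1) \rfloor}{B + 1}
  \;\le\; \alpha .
\]
```

The bound holds for every finite B and becomes an equality at the attainable levels when ties occur with probability zero. If the perturbation kernel does not deliver exchangeability given $S$, the first inequality can fail, which is the gap the referee identifies.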
Circularity Check
No significant circularity; derivation self-contained
The paper's derivation extends prior post-selection inference by conditioning on less information to obtain an ideal selective p-value, then approximates it via Monte Carlo perturbations of the data followed by re-application of the changepoint procedure. Validity for any Monte Carlo sample size B is claimed via a rank-based construction that does not reduce to a fitted parameter or self-definition by construction. No load-bearing self-citations, ansatzes smuggled via citation, or uniqueness theorems imported from the authors' prior work are exhibited in the provided text that would make the central result equivalent to its inputs. The perturbations are external to the fitting process, and the method remains falsifiable via the conditional distribution argument.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Standard assumptions of the changepoint model and the fixed detection algorithm allow for valid selective inference.