Bootstrap consistency for general double/debiased machine learning estimators
Pith reviewed 2026-05-10 06:09 UTC · model grok-4.3
The pith
Bootstrap methods are valid for double/debiased machine learning estimators under exactly the conditions already required for their asymptotic normality.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under exactly the same conditions required for the validity of DML itself, the bootstrap law converges conditionally weakly to the sampling law of the original estimator, and this holds for general exchangeably weighted resampling schemes with Efron's bootstrap as a special case.
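To make "exchangeably weighted resampling" concrete, the following is a minimal sketch, not the paper's own construction. It draws two standard weight vectors: multinomial counts, which reproduce Efron's bootstrap, and rescaled flat Dirichlet weights, which give Rubin's Bayesian bootstrap. The function names and the weighted-mean example are illustrative assumptions.

```python
import numpy as np

def efron_weights(n, rng):
    # Efron's bootstrap as weights: how many times each observation is drawn
    # when sampling n times with replacement, i.e. Multinomial(n; 1/n, ..., 1/n).
    return rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)

def bayesian_bootstrap_weights(n, rng):
    # Bayesian bootstrap: n times a flat Dirichlet draw, so weights sum to n.
    return n * rng.dirichlet(np.ones(n))

def weighted_mean(x, w):
    # Toy statistic (an assumption for illustration): the weighted sample mean.
    return np.sum(w * x) / np.sum(w)

rng = np.random.default_rng(0)
x = rng.normal(size=200)

w_efron = efron_weights(len(x), rng)
w_bayes = bayesian_bootstrap_weights(len(x), rng)

# Bootstrap replicates of the mean under Efron weights.
boot_means = np.array(
    [weighted_mean(x, efron_weights(len(x), rng)) for _ in range(500)]
)
```

Both weight vectors are exchangeable, nonnegative, and sum to n, which is the common structure that lets a single argument cover several resampling schemes.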
What carries the argument
Neyman-orthogonal scores with cross-fitting, which remove the need for Donsker-type conditions and allow the bootstrap to track the estimator's limiting distribution.
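For orientation, the two ingredients can be written out as follows. The notation is an assumption borrowed from the standard DML literature rather than from this paper: $\psi$ is the score, $W$ the data, $\theta_0$ the target parameter, $\eta_0$ the nuisance parameter, $T$ the set of admissible nuisance values, and $I_1,\dots,I_K$ the cross-fitting folds, with $\hat\eta_{-k}$ fitted on the observations outside $I_k$.

```latex
% Moment condition and Neyman orthogonality (first-order insensitivity to the nuisance)
\mathbb{E}\,\psi(W;\theta_0,\eta_0) = 0,
\qquad
\partial_r\, \mathbb{E}\,\psi\bigl(W;\theta_0,\eta_0 + r(\eta-\eta_0)\bigr)\Big|_{r=0} = 0
\quad \text{for all } \eta \in T.

% Cross-fitted estimating equation defining \hat\theta
\frac{1}{n}\sum_{k=1}^{K}\sum_{i\in I_k} \psi\bigl(W_i;\hat\theta,\hat\eta_{-k}\bigr) = 0.
```

Orthogonality makes the estimating equation insensitive to first-order errors in $\hat\eta_{-k}$, and cross-fitting makes each $\hat\eta_{-k}$ independent of the observations it is evaluated on, which is what replaces the Donsker-type conditions.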
Load-bearing premise
The DML estimator must satisfy the Neyman-orthogonality and cross-fitting rate conditions that already make it asymptotically normal.
What would settle it
A data-generating process in which the DML estimator is asymptotically normal yet the conditional distribution of the bootstrap version fails to converge to the same limit.
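One way to probe a candidate data-generating process empirically is to compare the spread of bootstrap replicates with the plug-in standard error from the normal approximation. The sketch below does this for a partially linear model with a partialling-out score. Everything in it is an illustrative assumption: the simulated design, the cubic least-squares nuisance fits, and in particular the choice to hold the cross-fitted nuisance estimates fixed across bootstrap draws, which may differ from the procedure analyzed in the paper.

```python
import numpy as np

def features(x):
    # Hypothetical nuisance learner: cubic polynomial basis in a scalar covariate.
    return np.column_stack([np.ones_like(x), x, x**2, x**3])

def cross_fit_residuals(y, d, x, n_folds, rng):
    # For each fold, fit E[Y|X] and E[D|X] on the other folds by least squares,
    # then residualize the held-out observations.
    n = len(y)
    folds = rng.permutation(n) % n_folds
    y_res, d_res = np.empty(n), np.empty(n)
    for k in range(n_folds):
        test, train = folds == k, folds != k
        f_train, f_test = features(x[train]), features(x[test])
        beta_y, *_ = np.linalg.lstsq(f_train, y[train], rcond=None)
        beta_d, *_ = np.linalg.lstsq(f_train, d[train], rcond=None)
        y_res[test] = y[test] - f_test @ beta_y
        d_res[test] = d[test] - f_test @ beta_d
    return y_res, d_res

def solve_score(y_res, d_res, w):
    # Root in theta of sum_i w_i * (y_res_i - theta * d_res_i) * d_res_i = 0.
    return np.sum(w * d_res * y_res) / np.sum(w * d_res**2)

rng = np.random.default_rng(1)
n, theta0 = 2000, 0.5
x = rng.uniform(-1, 1, size=n)
d = np.sin(2 * x) + rng.normal(size=n)
y = theta0 * d + np.cos(x) + rng.normal(size=n)

y_res, d_res = cross_fit_residuals(y, d, x, n_folds=5, rng=rng)
theta_hat = solve_score(y_res, d_res, np.ones(n))

# Plug-in standard error from the asymptotic normal approximation.
eps = y_res - theta_hat * d_res
se_plugin = np.sqrt(np.mean(d_res**2 * eps**2) / np.mean(d_res**2) ** 2 / n)

# Efron weights; nuisance fits are held fixed across draws (an assumption).
boot = np.array([
    solve_score(y_res, d_res, rng.multinomial(n, np.full(n, 1.0 / n)))
    for _ in range(1000)
])
se_boot = boot.std(ddof=1)
```

Agreement between `se_boot` and `se_plugin` on one well-behaved design is only a sanity check. A counterexample of the kind described above would need the two to diverge as n grows while the estimator itself stays asymptotically normal.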
Original abstract
Double/debiased machine learning (DML) provides a general framework for inference with high-dimensional or otherwise complex nuisance parameters by combining Neyman-orthogonal scores with cross-fitting, thereby circumventing classical Donsker-type conditions in many modern machine-learning settings. Despite its strong empirical performance, bootstrap inference for DML estimators has received little theoretical justification. This is particularly noteworthy since bootstrap methods are suggested ad used for inference on DML estimators, even though bootstrap procedures can fail for estimators that are root-$n$ consistent and asymptotically normal. This paper fills this gap by establishing bootstrap validity for DML estimators under general exchangeably weighted resampling schemes, with Efron's bootstrap as a special case. Under exactly the same conditions required for the validity of DML itself, we prove that the bootstrap law converges conditionally weakly to the sampling law of the original estimator.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper establishes bootstrap consistency for general double/debiased machine learning (DML) estimators. It proves that, for exchangeably weighted resampling schemes (with Efron's bootstrap as a special case), the bootstrap law converges conditionally weakly to the sampling distribution of the DML estimator, under precisely the same Neyman-orthogonality, cross-fitting, and rate conditions already required for the asymptotic normality of the DML estimator itself.
Significance. If the result holds, this supplies the missing theoretical justification for bootstrap inference with DML estimators, which are widely used in practice for high-dimensional and complex nuisance settings. The paper merits credit for deriving the result as a direct extension without introducing extra assumptions beyond those for DML validity, and for covering general exchangeably weighted schemes rather than a single bootstrap variant.
minor comments (1)
- [Abstract] The phrase 'suggested ad used' is a typographical error and should read 'suggested and used'.
Simulated Author's Rebuttal
We thank the referee for the positive and accurate summary of our manuscript on bootstrap consistency for general DML estimators. We appreciate the recommendation for minor revision and the recognition that the result holds under precisely the conditions already required for DML validity itself.
Circularity Check
No significant circularity
The paper presents a direct mathematical proof that the bootstrap law converges conditionally weakly to the sampling law of the DML estimator, under precisely the same Neyman-orthogonality, cross-fitting, and rate conditions already required for DML asymptotic normality. No load-bearing step reduces the target bootstrap consistency result to a fitted parameter, a self-citation chain, or an ansatz smuggled from prior work by the same authors. The central claim is an independent theorem establishing validity for general exchangeably weighted resampling schemes, with no evidence that any equation or prediction is equivalent to its inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Neyman orthogonality of the score function
- domain assumption: Cross-fitting to break dependence between nuisance estimation and score evaluation