On the Asymptotic Inadmissibility of Double Machine Learning Estimators Under Structure-Agnostic Models
Pith reviewed 2026-06-26 09:50 UTC · model grok-4.3
The pith
DML estimators are asymptotically inadmissible for the quadratic functional and quadratic density integral under structure-agnostic models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under structure-agnostic (SA) models, the quadratic functional in the Gaussian sequence model and the quadratic density integral functional belong to the monotone bias class, and their double machine learning (DML) estimators are asymptotically dominated by second-order empirical higher-order influence function (HOIF) estimators, which are U-statistics. This establishes the asymptotic inadmissibility of DML for these two functionals. For the expected conditional covariance functional, the empirical HOIF estimator is also minimax, but neither estimator asymptotically dominates the other.
What carries the argument
The monotone bias class of functionals, for which second-order empirical HOIF U-statistic estimators asymptotically dominate DML under SA models.
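A worked calculation for the quadratic density integral may help show why a sign-definite bias leaves room for a second-order correction. This is a sketch under assumed notation, not the paper's derivation: the fixed pilot density $\hat f$, the basis $\phi_1,\dots,\phi_k$ (assumed orthonormal with respect to Lebesgue measure), the projection $\Pi_k$ onto its span, and the estimator names $\hat\psi_1$, $\hat\psi_2$ are all illustrative. The sample $X_1,\dots,X_n$ is taken to be i.i.d. from $f$ and independent of $\hat f$. The paper's formal definition of the monotone bias class may differ from the sign property shown here.

```latex
\begin{align*}
\psi(f) &= \int f(x)^2\,dx, \\
\hat\psi_1 &= \frac{2}{n}\sum_{i=1}^n \hat f(X_i) - \int \hat f(x)^2\,dx, \\
\mathbb{E}_f[\hat\psi_1] - \psi(f)
  &= 2\int f\hat f - \int \hat f^2 - \int f^2
   = -\int (f - \hat f)^2 \;\le\; 0, \\
\hat\psi_2 &= \hat\psi_1 + \frac{1}{n(n-1)}\sum_{i \ne j}\sum_{l=1}^k
   \bigl(\phi_l(X_i) - \hat a_l\bigr)\bigl(\phi_l(X_j) - \hat a_l\bigr),
   \qquad \hat a_l = \int \phi_l \hat f, \\
\mathbb{E}_f[\hat\psi_2] - \psi(f)
  &= -\int (f - \hat f)^2 + \int \bigl(\Pi_k (f - \hat f)\bigr)^2
   = -\int \bigl((I - \Pi_k)(f - \hat f)\bigr)^2 .
\end{align*}
```

In this sketch the first-order bias never changes sign, and the U-statistic term removes the part of it that lies in the span of the basis, so the second-order bias is no larger in absolute value for any choice of basis. The calculation concerns bias only; the risk comparison is the paper's contribution and is not reproduced here.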
If this is right
- Second-order U-statistic estimators improve upon DML for every functional placed in the monotone bias class.
- Minimaxity of DML does not imply asymptotic admissibility under SA models for the quadratic and density-integral functionals.
- For the expected conditional covariance, both the DML and empirical HOIF estimators are minimax, and neither asymptotically dominates the other.
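The gap between minimaxity and admissibility has a classical finite-sample analogue in the James-Stein estimator, which appears in the paper's reference list. The sketch below is not the paper's argument and says nothing about SA models; it only illustrates, by simulation, that the minimax estimator X of a Gaussian mean in dimension d >= 3 has larger squared-error risk than a shrinkage estimator. The function names, the dimension (10), and the values of theta are choices made for this illustration.

```python
import numpy as np

def james_stein(x):
    """James-Stein shrinkage of each row of x toward the origin (needs d >= 3)."""
    d = x.shape[1]
    shrink = 1.0 - (d - 2) / np.sum(x ** 2, axis=1, keepdims=True)
    return shrink * x

def monte_carlo_risks(theta, reps=20000, seed=0):
    """Estimate E||estimate - theta||^2 for X itself and for James-Stein."""
    rng = np.random.default_rng(seed)
    # One N(theta, I_d) observation per row.
    x = theta + rng.standard_normal((reps, theta.size))
    risk_mle = np.mean(np.sum((x - theta) ** 2, axis=1))
    risk_js = np.mean(np.sum((james_stein(x) - theta) ** 2, axis=1))
    return risk_mle, risk_js

for scale in (0.0, 1.0, 5.0):
    theta = np.full(10, scale)
    risk_mle, risk_js = monte_carlo_risks(theta)
    print(f"theta_j = {scale}: risk(X) ~ {risk_mle:.2f}, risk(JS) ~ {risk_js:.2f}")
```

The risk of X is constant in theta (equal to d), while the shrinkage estimator's risk is smaller at every theta, with the largest gain near the origin.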
Where Pith is reading between the lines
- Practitioners using black-box nuisance estimates for quadratic functionals may obtain lower risk by adding the higher-order terms of the HOIF construction.
- The SA model framework separates the question of minimax rate from the question of admissibility, suggesting similar gaps may exist for other semiparametric problems.
- It remains open which additional functionals fall into the monotone bias class and therefore admit the same dominance result.
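The first point above can be made concrete with a small simulation for the quadratic density integral. This is a minimal sketch under stated assumptions, not the paper's estimator: the true density, the fixed pilot, the cosine basis, and all function names are hypothetical choices made here, and the pilot is assumed to lie in the span of the basis so that its integrals are available in closed form. The first-order estimate is 2 * mean(f_hat(X_i)) minus the integral of f_hat squared; the second-order estimate adds a basis-projection U-statistic in the spirit of second-order influence function estimators. Whether this coincides with the empirical HOIF estimator of Liu et al. (2017) is not established here.

```python
import numpy as np

def cosine_basis(x, k):
    """First k cosine functions, orthonormal in L2[0, 1]; returns shape (n, k)."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x)]
    cols += [np.sqrt(2.0) * np.cos(l * np.pi * x) for l in range(1, k)]
    return np.stack(cols, axis=1)

def first_order_estimate(x, b_hat):
    """One-step estimate 2 * mean(f_hat(X_i)) - int f_hat^2 with a fixed pilot."""
    f_hat_at_x = cosine_basis(x, len(b_hat)) @ b_hat
    return 2.0 * f_hat_at_x.mean() - np.sum(b_hat ** 2)

def second_order_correction(x, b_hat, k):
    """U-statistic that is unbiased for the squared L2 norm of Pi_k(f - f_hat)."""
    n = len(x)
    b = np.zeros(k)
    m = min(k, len(b_hat))
    b[:m] = b_hat[:m]
    r = cosine_basis(x, k) - b            # r[i, l] = phi_l(X_i) - b_l
    col_sums = r.sum(axis=0)
    # sum over i != j of <r_i, r_j> = |sum_i r_i|^2 - sum_i |r_i|^2
    return (col_sums @ col_sums - np.sum(r * r)) / (n * (n - 1))

def sample_truth(n, rng):
    """Rejection sampler for the toy density f(x) = 1 + 0.5 cos(2 pi x) on [0, 1]."""
    out = np.empty(0)
    while out.size < n:
        x = rng.uniform(size=2 * n)
        u = rng.uniform(size=2 * n)
        out = np.concatenate([out, x[1.5 * u <= 1.0 + 0.5 * np.cos(2 * np.pi * x)]])
    return out[:n]

def mean_estimates(n=500, reps=2000, k=3, seed=0):
    """Monte Carlo means of the first- and second-order estimates."""
    rng = np.random.default_rng(seed)
    # Hypothetical fixed pilot f_hat(x) = 1 + 0.3 cos(2 pi x), as basis coefficients.
    b_hat = np.array([1.0, 0.0, 0.3 / np.sqrt(2.0)])
    first, second = np.empty(reps), np.empty(reps)
    for r in range(reps):
        x = sample_truth(n, rng)
        first[r] = first_order_estimate(x, b_hat)
        second[r] = first[r] + second_order_correction(x, b_hat, k)
    return first.mean(), second.mean()

TRUE_VALUE = 1.0 + 0.5 ** 2 / 2.0   # int f^2 for the toy density
first_mean, second_mean = mean_estimates()
print(f"first-order bias  ~ {first_mean - TRUE_VALUE:+.4f}")
print(f"second-order bias ~ {second_mean - TRUE_VALUE:+.4f}")
```

The pilot is held fixed across replications, mirroring the SA convention that the machine learning estimates are treated as given. The sketch reports bias only; in a fixed-sample toy like this the second-order estimate can have larger variance, so it should not be read as evidence for or against the paper's asymptotic risk comparison.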
Load-bearing premise
The structure-agnostic model neighborhood around the fixed machine learning estimates correctly represents the possible data-generating laws for the asymptotic comparison.
What would settle it
A concrete sequence of distributions inside the SA model neighborhood along which, for the quadratic functional, the asymptotic risk of the DML estimator is strictly smaller than that of the corresponding empirical HOIF estimator.
Original abstract
Structure-agnostic (SA) models introduced by Balakrishnan et al. (2026) aim to reflect the general lack of knowledge of structural assumptions on data-generating laws such as smoothness or sparsity in practice. Roughly speaking, SA models restrict the observed-data generating law to be in some rn-neighborhood of (black-box machine learning) estimates, treated as given and fixed, where rn encodes the convergence rates of the estimates to the truth. Under SA models, Balakrishnan et al. (2026) show that the popular Double Machine Learning (DML) estimators for three functionals, the quadratic functional in the Gaussian sequence model, the quadratic density integral functional and the expected conditional covariance, are minimax. However, minimax estimators may be inadmissible. In this paper, we show that, for the first two of the three functionals, the DML estimator is asymptotically inadmissible under the SA model. In particular, we show that these two functionals fall into a class of functionals, which we refer to as the monotone bias class. For this class, we exhibit second-order (U-statistic) estimators, which asymptotically dominate DML estimators, under the SA model. These second-order estimators are empirical higher-order influence function (HOIF) estimators introduced in Liu et al. (2017). Furthermore, the empirical HOIF estimator, like the DML estimator, is minimax for the third functional (the expected conditional covariance), although neither asymptotically dominates the other.
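For reference, the three functionals named in the abstract can be written in one display. The symbols are conventional choices assumed here ($\theta$ for the mean sequence, $f$ for a density, and $(A, Y, X)$ for two outcome-type variables and covariates); the paper's own notation may differ.

```latex
\begin{align*}
\text{quadratic functional (Gaussian sequence model):} \quad
  & \psi(\theta) = \sum_{j} \theta_j^2, \\
\text{quadratic density integral:} \quad
  & \psi(f) = \int f(x)^2\,dx, \\
\text{expected conditional covariance:} \quad
  & \psi(P) = \mathbb{E}_P\bigl[\operatorname{Cov}_P(A, Y \mid X)\bigr].
\end{align*}
```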
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that under structure-agnostic (SA) models, the double machine learning (DML) estimators for the quadratic functional in the Gaussian sequence model and the quadratic density integral functional are asymptotically inadmissible. These two functionals belong to a newly defined 'monotone bias class,' for which empirical higher-order influence function (HOIF) U-statistics asymptotically dominate DML. For the expected conditional covariance functional, DML is minimax but neither it nor the HOIF estimator asymptotically dominates the other. The SA model treats black-box ML estimates as fixed, with the data law restricted to an rn-neighborhood around them.
Significance. If the domination result holds, the paper shows that minimaxity under SA models does not imply admissibility and identifies a concrete class of functionals where second-order U-statistics improve upon DML. It builds directly on the SA framework of Balakrishnan et al. (2026) and the HOIF estimators of Liu et al. (2017), with the monotone bias class providing a reusable conceptual tool. This is a meaningful contribution to understanding estimator choice when structural assumptions are absent.
major comments (1)
- [Proof of domination for the monotone bias class (risk calculation)] The load-bearing step is the asymptotic risk comparison establishing domination by the empirical HOIF U-statistic. This comparison must be carried out with the ML estimates held fixed (as required by the SA model definition) and the data-generating law restricted to the rn-neighborhood; any implicit treatment of the estimates as random or any extra regularity imposed on the neighborhood would prevent the inadmissibility conclusion from following from the SA model alone.
minor comments (2)
- [Introduction] The introduction would benefit from a short explicit reminder of how the rn-neighborhood is formally defined in Balakrishnan et al. (2026), to make the fixed-estimate restriction immediately visible to readers.
- [Section 2] Notation for the three functionals could be standardized in a single display early in the paper for easier cross-reference.
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for highlighting the importance of strict adherence to the structure-agnostic (SA) model in the risk calculations. We address the single major comment below.
Point-by-point responses
Referee: [Proof of domination for the monotone bias class (risk calculation)] The load-bearing step is the asymptotic risk comparison establishing domination by the empirical HOIF U-statistic. This comparison must be carried out with the ML estimates held fixed (as required by the SA model definition) and the data-generating law restricted to the rn-neighborhood; any implicit treatment of the estimates as random or any extra regularity imposed on the neighborhood would prevent the inadmissibility conclusion from following from the SA model alone.
Authors: We agree that the risk comparison must be performed strictly within the SA model. In the proofs for the monotone bias class (Sections 3 and 4), the machine learning estimates are treated as fixed quantities, and all expectations and risk calculations are taken with respect to data-generating laws lying in the rn-neighborhood of these fixed estimates. No randomness is assigned to the estimates themselves, and no regularity conditions beyond membership in the rn-neighborhood are imposed. The asymptotic expansions for the risks of the DML and empirical HOIF estimators are derived directly from this restricted class of laws, yielding the claimed domination result under the SA model alone.
Revision: no
Circularity Check
Minor self-citation to HOIF definition; central inadmissibility result independent of fitted inputs or self-referential constructions
Full rationale
The derivation relies on Balakrishnan et al. (2026) for the SA model definition and DML minimaxity, and on Liu et al. (2017) solely for the definition of the empirical HOIF estimators. The paper's new content—placing the quadratic functional and quadratic density integral into the monotone bias class and proving asymptotic domination by the U-statistic under the fixed-estimate rn-neighborhood—is presented as an independent comparison. No equation reduces a claimed prediction to a fitted parameter by construction, and the self-citation is not load-bearing for the inadmissibility statement itself. The result is therefore self-contained against the external benchmarks cited.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Structure-agnostic models restrict the observed-data generating law to rn-neighborhoods around fixed black-box ML estimates.
invented entities (1)
- monotone bias class: no independent evidence
Reference graph
Works this paper leans on
- [1] Ben Adcock and Nick Dexter. The gap between theory and practice in function approximation with deep neural networks. SIAM Journal on Mathematics of Data Science, 3(2):624-655, 2021.
- [2] Karun Adusumilli. How to sample and when to stop sampling: The generalized Wald problem and minimax policies. Review of Economic Studies, 93(1):1-34, 2026.
- [3] Chunrong Ai and Xiaohong Chen. Efficient estimation of models with conditional moment restrictions containing unknown functions. Econometrica, 71(6):1795-1843, 2003.
- [4] Isaiah Andrews and Jesse M Shapiro. A model of scientific communication. Econometrica, 89(5):2117-2142, 2021.
- [5] Sivaraman Balakrishnan, Edward H Kennedy, and Larry Wasserman. The fundamental limits of structure-agnostic functional estimation. Statistical Science (To Appear), 2026.
- [6] Matteo Bonvini, Edward H Kennedy, Oliver Dukes, and Sivaraman Balakrishnan. Doubly-robust inference and optimality in structure-agnostic models with smoothness. arXiv preprint arXiv:2405.08525, 2024.
- [7] Christoph Breunig and Xiaohong Chen. Adaptive, rate-optimal hypothesis testing in nonparametric IV models. Econometrica, 92(6):2027-2067, 2024.
- [8] Lawrence D Brown. Admissible estimators, recurrent diffusions, and insoluble boundary value problems. The Annals of Mathematical Statistics, 42(3):855-903, 1971.
- [9] Lawrence D Brown. Minimaxity, more or less. In Statistical Decision Theory and Related Topics V, pages 1-18. Springer, 1994.
- [10] Xiaohong Chen. Large sample sieve estimation of semi-nonparametric models. Handbook of Econometrics, 6(Part B):5549-5632, 2007.
- [11] Xiaohong Chen, Ying Liu, Shujie Ma, and Zheng Zhang. Causal inference of general treatment effects using neural networks with a diverging number of confounders. Journal of Econometrics, 238(1):105555, 2024.
- [12] Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1):C1-C68, 2018.
- [13] Yihong Gu. Open problem: Structure-agnostic minimax risk for partial linear model. In The Thirty Eighth Annual Conference on Learning Theory, pages 6220-6224. PMLR, 2025.
- [14] Yihong Gu, Qishuo Yin, Tianxi Cai, and Jianqing Fan. Optimally taming biases in black-box models for efficient semiparametric estimation. arXiv preprint arXiv:2606.06368, 2026.
- [15] William James and Charles Stein. Estimation with quadratic loss. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. University of California Press Berkeley, 1961.
- [16] Jikai Jin and Vasilis Syrgkanis. Sharp structure-agnostic lower bounds for general functional estimation. arXiv preprint arXiv:2512.17341, 2025a.
- [17] Jikai Jin and Vasilis Syrgkanis. Structure-agnostic optimality of doubly robust learning for treatment effect estimation. In Nika Haghtalab and Ankur Moitra, editors, Proceedings of Thirty Eighth Conference on Learning Theory, volume 291, pages 3159-3160. PMLR, 2025b.
- [18] Jikai Jin, Lester Mackey, and Vasilis Syrgkanis. It's hard to be normal: The impact of noise on structure-agnostic estimation. In Proceedings of The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.
- [19] Edward H Kennedy, Sivaraman Balakrishnan, and Larry Wasserman. Discussion of "On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning". Statistical Science, 35(3):540-544, 2020.
- [20] Lin Liu, Rajarshi Mukherjee, Whitney K Newey, and James M Robins. Semiparametric efficient empirical higher order influence function estimators. arXiv preprint arXiv:1705.07577, 2017.
- [21] Lin Liu, Rajarshi Mukherjee, and James M Robins. On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning. Statistical Science, 35(3):518-539, 2020a.
- [22] Lin Liu, Rajarshi Mukherjee, and James M Robins. Rejoinder: On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning. Statistical Science, 35(3):545-554, 2020b.
- [23] Lin Liu, Rajarshi Mukherjee, and James M Robins. Assumption-lean falsification tests of rate double-robustness of double-machine-learning estimators. Journal of Econometrics, 240(2):105500, 2024.
- [24] Alec McClean, Sivaraman Balakrishnan, Edward H Kennedy, and Larry Wasserman. Double cross-fit doubly robust estimators: Beyond series regression. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2026.
- [25] Sean McGrath and Rajarshi Mukherjee. Nuisance function tuning and sample splitting for optimal doubly robust estimation. arXiv preprint arXiv:2212.14857, 2026.
- [26] Whitney K Newey. Semiparametric efficiency bounds. Journal of Applied Econometrics, 5(2):99-135, 1990.
- [27] Whitney K Newey and James M Robins. Cross-fitting and fast remainder rates for semiparametric estimation. arXiv preprint arXiv:1801.09138, 2018.
- [28] Ya'acov Ritov and Peter J Bickel. Achieving information bounds in non and semiparametric models. The Annals of Statistics, 18(2):925-938, 1990.
- [29] Ya'acov Ritov, Peter J Bickel, Anthony C Gamst, and Bastiaan Jan Korneel Kleijn. The Bayesian analysis of complex, high-dimensional models: Can it be CODA? Statistical Science, 29(4):619-639, 2014.
- [30] James Robins and Aad van der Vaart. Adaptive nonparametric confidence sets. The Annals of Statistics, 34(1):229-253, 2006.
- [31] James Robins, Lingling Li, Eric Tchetgen Tchetgen, and Aad van der Vaart. Higher order influence functions and minimax estimation of nonlinear functionals. In Probability and Statistics: Essays in Honor of David A. Freedman, pages 335-421. Institute of Mathematical Statistics, 2008.
- [32] James Robins, Lingling Li, Eric Tchetgen Tchetgen, and Aad van der Vaart. Technical report: Higher order influence functions and minimax estimation of nonlinear functionals. arXiv preprint arXiv:1601.05820, 2016.
- [33] James M Robins and Ya'acov Ritov. Toward a curse of dimensionality appropriate (CODA) asymptotic theory for semi-parametric models. Statistics in Medicine, 16(3):285-319, 1997.
- [34] Andrea Rotnitzky, Ezequiel Smucler, and James M Robins. Characterization of parameters with a mixed bias property. Biometrika, 108(1):231-238, 2021.
- [35] Daniel O Scharfstein, Andrea Rotnitzky, and James M Robins. Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association, 94(448):1096-1120, 1999.
- [36] Bowen Shi, Xiaojie Mao, Mochen Yang, and Bo Li. What, why, and how: An empiricist's guide to double/debiased machine learning. Information Systems Research, 2026.
- [37] Charles Stein. Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. University of California Press Berkeley, 1956.
- [38] Charles J Stone. Optimal rates of convergence for nonparametric estimators. The Annals of Statistics, 8(6):1348-1360, 1980.
- [39] Charles J Stone. Optimal global rates of convergence for nonparametric regression. The Annals of Statistics, 10(4):1040-1053, 1982.
- [40] Joel A Tropp. An introduction to matrix concentration inequalities. Foundations and Trends in Machine Learning, 8(1-2):1-230, 2015.
- [41] Abraham Wald. On the principles of statistical inference. University of Notre Dame, 1941.
- [42] Abraham Wald. Statistical decision functions which minimize the maximum risk. Annals of Mathematics, 46(2):265-280, 1945.
- [43] Abraham Wald. An essentially complete class of admissible decision functions. The Annals of Mathematical Statistics, 18(4):549-555, 1947.
- [44] Siqi Xu, Lin Liu, and Zhonghua Liu. DeepMed: Semiparametric causal mediation analysis with debiased deep learning. In Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 28238-28251, 2022.
- [45] Mengchu Zheng, Matteo Bonvini, and Zijian Guo. Perturbed double machine learning: Nonstandard inference beyond the parametric length. arXiv preprint arXiv:2511.01222, 2025.