
arxiv: 2606.22391 · v1 · pith:XBG5KPX3new · submitted 2026-06-21 · 🧮 math.ST · econ.EM · stat.ML · stat.TH

On the Asymptotic Inadmissibility of Double Machine Learning Estimators Under Structure-Agnostic Models

Pith reviewed 2026-06-26 09:50 UTC · model grok-4.3

classification 🧮 math.ST · econ.EM · stat.ML · stat.TH
keywords double machine learning · structure-agnostic models · asymptotic inadmissibility · higher-order influence functions · U-statistics · monotone bias class · minimax estimation

The pith

DML estimators are asymptotically inadmissible for the quadratic functional and quadratic density integral under structure-agnostic models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies Double Machine Learning (DML) estimators under structure-agnostic (SA) models, which confine the data-generating law to an r_n-neighborhood of machine-learning estimates that are treated as given and fixed. Balakrishnan et al. (2026) established that DML is minimax for three functionals under these models, yet the authors prove that DML is asymptotically inadmissible for two of them: the quadratic functional in the Gaussian sequence model and the quadratic density integral functional. They place these two functionals inside a monotone bias class and exhibit second-order U-statistic estimators that asymptotically dominate DML. For the third functional, the expected conditional covariance, the higher-order estimator is also minimax, but neither estimator asymptotically dominates the other.

Core claim

Under structure-agnostic models, the quadratic functional in the Gaussian sequence model and the quadratic density integral functional belong to the monotone bias class, and for this class the DML estimator is asymptotically dominated by second-order empirical higher-order influence function (HOIF) estimators, which are U-statistics. This establishes the asymptotic inadmissibility of DML for these two functionals. For the expected conditional covariance functional the empirical HOIF estimator is also minimax, but neither estimator asymptotically dominates the other.
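
To see where a sign-definite bias can come from, consider the quadratic density integral ψ(f) = ∫ f². The following is a sketch under textbook definitions, not the paper's own notation: the pilot density \hat f is held fixed, b_1, ..., b_k is a hypothetical orthonormal basis chosen for illustration, and the two estimators are the standard first-order and second-order forms, which may differ in detail from those analysed in the paper.

```latex
% First-order (DML-type) estimator with a fixed pilot density \hat f
\hat\psi_1 = \frac{2}{n}\sum_{i=1}^n \hat f(X_i) - \int \hat f^2 ,
\qquad
\mathbb{E}_f[\hat\psi_1] - \psi(f)
  = 2\int f\hat f - \int \hat f^2 - \int f^2
  = -\int (f-\hat f)^2 \;\le\; 0 .

% Second-order U-statistic correction, with \hat\beta_l = \int b_l \hat f
\hat\psi_2 = \hat\psi_1 + \frac{1}{n(n-1)}\sum_{i\ne j}\sum_{l=1}^k
  \bigl(b_l(X_i)-\hat\beta_l\bigr)\bigl(b_l(X_j)-\hat\beta_l\bigr),

% \Pi_k is the L_2 projection onto span(b_1, ..., b_k)
\mathbb{E}_f[\hat\psi_2] - \psi(f)
  = -\bigl\|(I-\Pi_k)(f-\hat f)\bigr\|_2^2
  \;\in\; \Bigl[-\int (f-\hat f)^2,\; 0\Bigr].
```

In this sketch the correction can shrink the magnitude of the bias but never flips its sign, which is one plausible reading of the "monotone bias" label; the paper's formal definition of the class is not reproduced in this summary.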

What carries the argument

The monotone bias class of functionals, for which second-order empirical HOIF U-statistic estimators asymptotically dominate DML under SA models.

If this is right

  • Second-order U-statistic estimators improve upon DML for every functional placed in the monotone bias class.
  • Minimaxity of DML does not imply asymptotic admissibility under SA models for the quadratic and density-integral functionals.
  • For the expected conditional covariance both the DML and HOIF estimators remain minimax without dominance between them.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners using black-box nuisance estimates for quadratic functionals may obtain lower risk by adding the higher-order terms of the HOIF construction.
  • The SA model framework separates the question of minimax rate from the question of admissibility, suggesting similar gaps may exist for other semiparametric problems.
  • It remains open which additional functionals fall into the monotone bias class and therefore admit the same dominance result.
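
The practitioner-facing point above can be illustrated with a small simulation for the quadratic density integral ∫ f². This is a minimal sketch under assumptions that are not taken from the paper: the true density `f_true`, the fixed pilot `f_pilot`, the cosine basis, and the estimator forms in `first_order` and `second_order` are all hypothetical choices made for illustration. It compares Monte Carlo bias at a single law, so it does not demonstrate dominance in risk over a whole neighborhood.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)
GRID = (np.arange(4000) + 0.5) / 4000  # midpoint grid on [0, 1]


def cosine_basis(x, k):
    """Orthonormal functions sqrt(2)*cos(pi*l*x), l = 1..k, on [0, 1]."""
    freqs = np.arange(1, k + 1)
    return SQRT2 * np.cos(np.pi * np.outer(x, freqs))  # shape (len(x), k)


def f_true(x):
    """Hypothetical true density on [0, 1] (assumption for this demo)."""
    return 1.0 + 0.5 * SQRT2 * np.cos(np.pi * x)


def f_pilot(x):
    """Hypothetical fixed pilot density, standing in for a black-box fit."""
    return 1.0 + 0.3 * SQRT2 * np.cos(np.pi * x)


def integrate(values):
    """Midpoint rule on [0, 1] for values evaluated on GRID."""
    return values.mean(axis=0)


def sample(rng, n, upper=2.0):
    """Rejection sampling from f_true with a uniform proposal (upper >= max f)."""
    out = np.empty(0)
    while out.size < n:
        x, u = rng.uniform(size=2 * n), rng.uniform(size=2 * n)
        out = np.concatenate([out, x[u * upper <= f_true(x)]])
    return out[:n]


def first_order(x):
    """First-order estimator: 2 * mean(pilot(X_i)) - integral of pilot^2."""
    return 2.0 * f_pilot(x).mean() - integrate(f_pilot(GRID) ** 2)


def second_order(x, k):
    """First-order estimator plus a second-order U-statistic correction."""
    beta_hat = integrate(cosine_basis(GRID, k) * f_pilot(GRID)[:, None])
    z = cosine_basis(x, k) - beta_hat
    n = x.size
    s, q = z.sum(axis=0), (z ** 2).sum(axis=0)
    # sum over i != j of z_i * z_j equals (sum z)^2 - sum z^2, per basis function
    correction = (s ** 2 - q).sum() / (n * (n - 1))
    return first_order(x) + correction


def run_simulation(seed=0, n=500, reps=400, k=4):
    """Return Monte Carlo means of both estimators and the true functional."""
    rng = np.random.default_rng(seed)
    truth = integrate(f_true(GRID) ** 2)
    draws = [sample(rng, n) for _ in range(reps)]
    mean1 = np.mean([first_order(x) for x in draws])
    mean2 = np.mean([second_order(x, k) for x in draws])
    return mean1, mean2, truth


if __name__ == "__main__":
    m1, m2, psi = run_simulation()
    print("first-order bias:", m1 - psi, "second-order bias:", m2 - psi)
```

In this toy setup the gap between `f_true` and `f_pilot` lies entirely in the span of the first basis function, so the correction removes essentially all of the first-order bias. In general only the projected part of the gap is removed, and the correction adds variance that grows with `k`.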

Load-bearing premise

The structure-agnostic model neighborhood around the fixed machine learning estimates correctly represents the possible data-generating laws for the asymptotic comparison.
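
Schematically, the premise can be written as follows. The distance d, the nuisance η_P, and the squared-error loss are placeholders assumed for illustration; the formal definition is given in Balakrishnan et al. (2026) and is not reproduced in this summary.

```latex
% Schematic SA model: laws whose nuisance lies within r_n of the fixed estimate
\mathcal{P}_n(\hat\eta, r_n)
  = \bigl\{ P : d\bigl(\eta_P, \hat\eta\bigr) \le r_n \bigr\},
  \qquad \hat\eta \ \text{fixed (non-random)}.

% Risk of an estimator \hat\psi at one law P in the neighborhood
R_n(\hat\psi; P) = \mathbb{E}_P\bigl[(\hat\psi - \psi(P))^2\bigr],
  \qquad P \in \mathcal{P}_n(\hat\eta, r_n).
```

Minimaxity concerns the supremum of R_n over the neighborhood, whereas admissibility compares R_n law by law, so both conclusions depend on the neighborhood being the right description of the possible laws.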

What would settle it

A concrete sequence of distributions inside the SA model neighborhood for which the asymptotic risk of the DML estimator is strictly smaller than the risk of the corresponding HOIF estimator on the quadratic functional.

Original abstract

Structure-agnostic (SA) models introduced by Balakrishnan et al. (2026) aim to reflect the general lack of knowledge of structural assumptions on data-generating laws such as smoothness or sparsity in practice. Roughly speaking, SA models restrict the observed-data generating law to be in some r_n-neighborhood of (black-box machine learning) estimates, treated as given and fixed, where r_n encodes the convergence rates of the estimates to the truth. Under SA models, Balakrishnan et al. (2026) show that the popular Double Machine Learning (DML) estimators for three functionals, the quadratic functional in the Gaussian sequence model, the quadratic density integral functional and the expected conditional covariance, are minimax. However, minimax estimators may be inadmissible. In this paper, we show that, for the first two of the three functionals, the DML estimator is asymptotically inadmissible under the SA model. In particular, we show that these two functionals fall into a class of functionals, which we refer to as the monotone bias class. For this class, we exhibit second-order (U-statistic) estimators, which asymptotically dominate DML estimators, under the SA model. These second-order estimators are empirical higher-order influence function (HOIF) estimators introduced in Liu et al. (2017). Furthermore, the empirical HOIF estimator, like the DML estimator, is minimax for the third functional (the expected conditional covariance), although neither asymptotically dominates the other.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper claims that under structure-agnostic (SA) models, the double machine learning (DML) estimators for the quadratic functional in the Gaussian sequence model and the quadratic density integral functional are asymptotically inadmissible. These two functionals belong to a newly defined 'monotone bias class,' for which empirical higher-order influence function (HOIF) U-statistics asymptotically dominate DML. For the expected conditional covariance functional, both DML and the empirical HOIF estimator are minimax, but neither asymptotically dominates the other. The SA model treats black-box ML estimates as fixed, with the data law restricted to an r_n-neighborhood around them.

Significance. If the domination result holds, the paper shows that minimaxity under SA models does not imply admissibility and identifies a concrete class of functionals where second-order U-statistics improve upon DML. It builds directly on the SA framework of Balakrishnan et al. (2026) and the HOIF estimators of Liu et al. (2017), with the monotone bias class providing a reusable conceptual tool. This is a meaningful contribution to understanding estimator choice when structural assumptions are absent.

major comments (1)
  1. [Proof of domination for the monotone bias class (risk calculation)] The load-bearing step is the asymptotic risk comparison establishing domination by the empirical HOIF U-statistic. This comparison must be carried out with the ML estimates held fixed (as required by the SA model definition) and the data-generating law restricted to the r_n-neighborhood; any implicit treatment of the estimates as random or any extra regularity imposed on the neighborhood would prevent the inadmissibility conclusion from following from the SA model alone.
minor comments (2)
  1. [Introduction] The introduction would benefit from a short explicit reminder of how the r_n-neighborhood is formally defined in Balakrishnan et al. (2026), to make the fixed-estimate restriction immediately visible to readers.
  2. [Section 2] Notation for the three functionals could be standardized in a single display early in the paper for easier cross-reference.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for highlighting the importance of strict adherence to the structure-agnostic (SA) model in the risk calculations. We address the single major comment below.

  1. Referee: [Proof of domination for the monotone bias class (risk calculation)] The load-bearing step is the asymptotic risk comparison establishing domination by the empirical HOIF U-statistic. This comparison must be carried out with the ML estimates held fixed (as required by the SA model definition) and the data-generating law restricted to the r_n-neighborhood; any implicit treatment of the estimates as random or any extra regularity imposed on the neighborhood would prevent the inadmissibility conclusion from following from the SA model alone.

    Authors: We agree that the risk comparison must be performed strictly within the SA model. In the proofs for the monotone bias class (Sections 3 and 4), the machine learning estimates are treated as fixed quantities, and all expectations and risk calculations are taken with respect to data-generating laws lying in the r_n-neighborhood of these fixed estimates. No randomness is assigned to the estimates themselves, and no regularity conditions beyond membership in the r_n-neighborhood are imposed. The asymptotic expansions for the risks of the DML and empirical HOIF estimators are derived directly from this restricted class of laws, yielding the claimed domination result under the SA model alone.

    revision: no

Circularity Check

0 steps flagged

Minor self-citation to HOIF definition; central inadmissibility result independent of fitted inputs or self-referential constructions


The derivation relies on Balakrishnan et al. (2026) for the SA model definition and DML minimaxity, and on Liu et al. (2017) solely for the definition of the empirical HOIF estimators. The paper's new content (placing the quadratic functional and quadratic density integral into the monotone bias class and proving asymptotic domination by the U-statistic under the fixed-estimate r_n-neighborhood) is presented as an independent comparison. No equation reduces a claimed prediction to a fitted parameter by construction, and the self-citation is not load-bearing for the inadmissibility statement itself. The result is therefore self-contained against the external benchmarks cited.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claims rest on the structure-agnostic model framework from prior work and the properties of the newly referenced monotone bias class of functionals.

axioms (1)
  • domain assumption: Structure-agnostic models restrict the observed-data generating law to r_n-neighborhoods around fixed black-box ML estimates.
    This is the foundational modeling choice that enables the inadmissibility analysis.
invented entities (1)
  • monotone bias class (no independent evidence)
    purpose: Class of functionals for which second-order U-statistic estimators asymptotically dominate DML under SA models
    Introduced to organize the domination result for the two quadratic functionals.

pith-pipeline@v0.9.1-grok · 5811 in / 1380 out tokens · 32657 ms · 2026-06-26T09:50:07.476997+00:00 · methodology


Reference graph

Works this paper leans on

45 extracted references · 3 linked inside Pith

  1. Ben Adcock and Nick Dexter. The gap between theory and practice in function approximation with deep neural networks. SIAM Journal on Mathematics of Data Science, 3(2):624–655, 2021.
  2. Karun Adusumilli. How to sample and when to stop sampling: The generalized Wald problem and minimax policies. Review of Economic Studies, 93(1):1–34, 2026.
  3. Chunrong Ai and Xiaohong Chen. Efficient estimation of models with conditional moment restrictions containing unknown functions. Econometrica, 71(6):1795–1843, 2003.
  4. Isaiah Andrews and Jesse M Shapiro. A model of scientific communication. Econometrica, 89(5):2117–2142, 2021.
  5. Sivaraman Balakrishnan, Edward H Kennedy, and Larry Wasserman. The fundamental limits of structure-agnostic functional estimation. Statistical Science (to appear), 2026.
  6. Matteo Bonvini, Edward H Kennedy, Oliver Dukes, and Sivaraman Balakrishnan. Doubly-robust inference and optimality in structure-agnostic models with smoothness. arXiv preprint arXiv:2405.08525, 2024.
  7. Christoph Breunig and Xiaohong Chen. Adaptive, rate-optimal hypothesis testing in nonparametric IV models. Econometrica, 92(6):2027–2067, 2024.
  8. Lawrence D Brown. Admissible estimators, recurrent diffusions, and insoluble boundary value problems. The Annals of Mathematical Statistics, 42(3):855–903, 1971.
  9. Lawrence D Brown. Minimaxity, more or less. In Statistical Decision Theory and Related Topics V, pages 1–18. Springer, 1994.
  10. Xiaohong Chen. Large sample sieve estimation of semi-nonparametric models. Handbook of Econometrics, 6(Part B):5549–5632, 2007.
  11. Xiaohong Chen, Ying Liu, Shujie Ma, and Zheng Zhang. Causal inference of general treatment effects using neural networks with a diverging number of confounders. Journal of Econometrics, 238(1):105555, 2024.
  12. Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1):C1–C68, 2018.
  13. Yihong Gu. Open problem: Structure-agnostic minimax risk for partial linear model. In The Thirty Eighth Annual Conference on Learning Theory, pages 6220–6224. PMLR, 2025.
  14. Yihong Gu, Qishuo Yin, Tianxi Cai, and Jianqing Fan. Optimally taming biases in black-box models for efficient semiparametric estimation. arXiv preprint arXiv:2606.06368, 2026.
  15. William James and Charles Stein. Estimation with quadratic loss. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. University of California Press Berkeley, 1961.
  16. Jikai Jin and Vasilis Syrgkanis. Sharp structure-agnostic lower bounds for general functional estimation. arXiv preprint arXiv:2512.17341, 2025a.
  17. Jikai Jin and Vasilis Syrgkanis. Structure-agnostic optimality of doubly robust learning for treatment effect estimation. In Nika Haghtalab and Ankur Moitra, editors, Proceedings of Thirty Eighth Conference on Learning Theory, volume 291, pages 3159–3160. PMLR, 2025b.
  18. Jikai Jin, Lester Mackey, and Vasilis Syrgkanis. It's hard to be normal: The impact of noise on structure-agnostic estimation. In Proceedings of The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.
  19. Edward H Kennedy, Sivaraman Balakrishnan, and Larry Wasserman. Discussion of "On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning". Statistical Science, 35(3):540–544, 2020.
  20. Lin Liu, Rajarshi Mukherjee, Whitney K Newey, and James M Robins. Semiparametric efficient empirical higher order influence function estimators. arXiv preprint arXiv:1705.07577, 2017.
  21. Lin Liu, Rajarshi Mukherjee, and James M Robins. On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning. Statistical Science, 35(3):518–539, 2020a.
  22. Lin Liu, Rajarshi Mukherjee, and James M Robins. Rejoinder: On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning. Statistical Science, 35(3):545–554, 2020b.
  23. Lin Liu, Rajarshi Mukherjee, and James M Robins. Assumption-lean falsification tests of rate double-robustness of double-machine-learning estimators. Journal of Econometrics, 240(2):105500, 2024.
  24. Alec McClean, Sivaraman Balakrishnan, Edward H Kennedy, and Larry Wasserman. Double cross-fit doubly robust estimators: Beyond series regression. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2026.
  25. Sean McGrath and Rajarshi Mukherjee. Nuisance function tuning and sample splitting for optimal doubly robust estimation. arXiv preprint arXiv:2212.14857, 2026.
  26. Whitney K Newey. Semiparametric efficiency bounds. Journal of Applied Econometrics, 5(2):99–135, 1990.
  27. Whitney K Newey and James M Robins. Cross-fitting and fast remainder rates for semiparametric estimation. arXiv preprint arXiv:1801.09138, 2018.
  28. Ya'acov Ritov and Peter J Bickel. Achieving information bounds in non and semiparametric models. The Annals of Statistics, 18(2):925–938, 1990.
  29. Ya'acov Ritov, Peter J Bickel, Anthony C Gamst, and Bastiaan Jan Korneel Kleijn. The Bayesian analysis of complex, high-dimensional models: Can it be CODA? Statistical Science, 29(4):619–639, 2014.
  30. James Robins and Aad van der Vaart. Adaptive nonparametric confidence sets. The Annals of Statistics, 34(1):229–253, 2006.
  31. James Robins, Lingling Li, Eric Tchetgen Tchetgen, and Aad van der Vaart. Higher order influence functions and minimax estimation of nonlinear functionals. In Probability and Statistics: Essays in Honor of David A. Freedman, pages 335–421. Institute of Mathematical Statistics, 2008.
  32. James Robins, Lingling Li, Eric Tchetgen Tchetgen, and Aad van der Vaart. Technical report: Higher order influence functions and minimax estimation of nonlinear functionals. arXiv preprint arXiv:1601.05820, 2016.
  33. James M Robins and Ya'acov Ritov. Toward a curse of dimensionality appropriate (CODA) asymptotic theory for semi-parametric models. Statistics in Medicine, 16(3):285–319, 1997.
  34. Andrea Rotnitzky, Ezequiel Smucler, and James M Robins. Characterization of parameters with a mixed bias property. Biometrika, 108(1):231–238, 2021.
  35. Daniel O Scharfstein, Andrea Rotnitzky, and James M Robins. Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association, 94(448):1096–1120, 1999.
  36. Bowen Shi, Xiaojie Mao, Mochen Yang, and Bo Li. What, why, and how: An empiricist's guide to double/debiased machine learning. Information Systems Research, 2026.
  37. Charles Stein. Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. University of California Press Berkeley, 1956.
  38. Charles J Stone. Optimal rates of convergence for nonparametric estimators. The Annals of Statistics, 8(6):1348–1360, 1980.
  39. Charles J Stone. Optimal global rates of convergence for nonparametric regression. The Annals of Statistics, 10(4):1040–1053, 1982.
  40. Joel A Tropp. An introduction to matrix concentration inequalities. Foundations and Trends in Machine Learning, 8(1-2):1–230, 2015.
  41. Abraham Wald. On the principles of statistical inference. University of Notre Dame, 1941.
  42. Abraham Wald. Statistical decision functions which minimize the maximum risk. Annals of Mathematics, 46(2):265–280, 1945.
  43. Abraham Wald. An essentially complete class of admissible decision functions. The Annals of Mathematical Statistics, 18(4):549–555, 1947.
  44. Siqi Xu, Lin Liu, and Zhonghua Liu. DeepMed: Semiparametric causal mediation analysis with debiased deep learning. In Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 28238–28251, 2022.
  45. Mengchu Zheng, Matteo Bonvini, and Zijian Guo. Perturbed double machine learning: Nonstandard inference beyond the parametric length. arXiv preprint arXiv:2511.01222, 2025.