Cross-Fitting and Fast Remainder Rates for Semiparametric Estimation
14 Pith papers cite this work. Polarity classification is still indexing.
abstract
There are many interesting and widely used estimators of a functional with finite semiparametric variance bound that depend on nonparametric estimators of nuisance functions. We use cross-fitting (i.e., sample splitting) to construct novel estimators with fast remainder rates. We give cross-fit doubly robust estimators that use separate subsamples to estimate different nuisance functions. We obtain general, precise results for regression spline estimation of average linear functionals of conditional expectations with a finite semiparametric variance bound. We show that a cross-fit doubly robust spline regression estimator of the expected conditional covariance is semiparametric efficient under minimal conditions. Cross-fit doubly robust estimators of other average linear functionals of a conditional expectation are shown to have the fastest known remainder rates for the Haar basis or under certain smoothness conditions. Surprisingly, the cross-fit plug-in estimator also has nearly the fastest known remainder rate, but its remainder converges to zero more slowly than that of the cross-fit doubly robust estimator. As specific examples we consider the expected conditional covariance, the mean with randomly missing data, and a weighted average derivative.
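As an illustration of the idea in the abstract, the following is a minimal sketch of a cross-fit doubly robust estimator of the expected conditional covariance E[Cov(A, Y | X)], in which the two nuisance regressions E[A | X] and E[Y | X] are fit on separate subsamples and the average is taken on a third. It is not the paper's implementation: the three-fold rotation, the piecewise-constant (Haar-type) regression on [0, 1], the simulated data, and all function names (`bin_means`, `cross_fit_ecc`, `simulate`) are assumptions made for this example.

```python
import numpy as np


def bin_means(x_train, v_train, x_eval, n_bins):
    """Piecewise-constant fit of E[v | x] for x in [0, 1] (illustrative nuisance estimator)."""
    idx_train = np.minimum((x_train * n_bins).astype(int), n_bins - 1)
    idx_eval = np.minimum((x_eval * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(idx_train, weights=v_train, minlength=n_bins)
    counts = np.bincount(idx_train, minlength=n_bins)
    # Empty bins fall back to the overall training mean.
    means = np.where(counts > 0, sums / np.maximum(counts, 1), v_train.mean())
    return means[idx_eval]


def cross_fit_ecc(x, a, y, n_bins=8, seed=0):
    """Cross-fit doubly robust estimate of E[Cov(A, Y | X)].

    Each nuisance function is estimated on its own subsample, and the
    product of residuals is averaged on a third; the roles are rotated.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), 3)
    products = []
    for k in range(3):
        ev = folds[k]              # evaluation subsample
        fa = folds[(k + 1) % 3]    # subsample used only for E[A | X]
        fy = folds[(k + 2) % 3]    # subsample used only for E[Y | X]
        a_hat = bin_means(x[fa], a[fa], x[ev], n_bins)
        y_hat = bin_means(x[fy], y[fy], x[ev], n_bins)
        products.append((a[ev] - a_hat) * (y[ev] - y_hat))
    return float(np.mean(np.concatenate(products)))


def simulate(n, seed=0):
    """Hypothetical data with Cov(A, Y | X) = 0.5 for every X."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    e1 = rng.normal(size=n)
    e2 = rng.normal(size=n)
    a = np.sin(2 * np.pi * x) + e1
    y = x ** 2 + 0.5 * e1 + e2
    return x, a, y


if __name__ == "__main__":
    x, a, y = simulate(6000, seed=1)
    print(cross_fit_ecc(x, a, y, n_bins=20, seed=1))
```

Because the two regressions are fit on disjoint subsamples, their estimation errors are independent given the covariates, which is the feature of the construction that the abstract credits for the fast remainder rates. In the simulated design the target value is 0.5, so the printed estimate should be close to that.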
verdicts
UNVERDICTED 14
representative citing papers
DML estimators for the quadratic functional and quadratic density integral are asymptotically inadmissible under SA models and dominated by empirical HOIF estimators, while DML remains minimax for expected conditional covariance.
Develops higher-order influence function estimators for implicitly defined parameters in non-separable structural models using U-processes theory.
A new framework enables conditional independence testing for single realizations of nonstationary nonlinear multivariate time series using time-varying nonlinear regression, local long-run covariance estimation, and distribution-uniform Gaussian approximation.
Causal k-Means Clustering applies k-means to estimated counterfactual functions via plug-in and double machine learning bias-corrected estimators to identify subgroups with heterogeneous treatment effects and achieves root-n rates.
The Sinkhorn treatment effect is a new entropic optimal transport measure of divergence between counterfactual distributions that admits first- and second-order pathwise differentiability, debiased estimators, and asymptotically valid tests for distributional treatment effects.
A conditional adaptive perturbation approach enables valid in-sample inference for machine learning-identified subgroups with nonregular boundaries via triple robustness.
Proposes an augmented weighted estimator via kernel functional balancing over a joint RKHS for causal inference with compositional treatments, claiming sqrt(n)-consistency and asymptotic normality around a sample-specific target.
Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.
The IF-LOO variance estimator for covariate-adjusted treatment effects with binary outcomes provides appropriate type I error control in simulations, especially for rare events or small samples, with a closed-form implementation.
UD-DML creates balanced representative subsamples via uniform design in PCA space for efficient double machine learning estimation of average treatment effects on large datasets.
A semi-supervised kernel two-sample test integrates unlabeled covariate data to achieve asymptotic normality under the null, higher power than standard kernel tests, and consistency against fixed and local alternatives.
crossfit is an R package that supplies a general-purpose cross-fitting engine driven by user-specified DAGs of nuisance models with configurable fold allocations and reproducibility features.
PEQ-Net uses policy-aware reparameterization of ICE Q-functions and kernel mean embeddings in a shared encoder, followed by LTMLE, to jointly estimate multiple policies while constraining second-order bias for lower variance.
citing papers explorer
- Learning heterogeneous treatment effects under principal stratification
Proposes a doubly cross-fit doubly robust machine learner for conditional principal causal effects under principal ignorability with odds ratio sensitivity, with limit theory and application to an acute lung injury trial.
- In-Sample Evaluation of Subgroups Identified by Generic Machine Learning
A conditional adaptive perturbation approach enables valid in-sample inference for machine learning-identified subgroups with nonregular boundaries via triple robustness.
- Kernel-Based Functional Balancing for Causal Inference with Compositional Treatments
Proposes an augmented weighted estimator via kernel functional balancing over a joint RKHS for causal inference with compositional treatments, claiming sqrt(n)-consistency and asymptotic normality around a sample-specific target.
- Improving Variance Estimation for Covariate Adjustment with Binary Outcomes
The IF-LOO variance estimator for covariate-adjusted treatment effects with binary outcomes provides appropriate type I error control in simulations, especially for rare events or small samples, with a closed-form implementation.
- UD-DML: Uniform Design Subsampling for Double Machine Learning over Massive Data
UD-DML creates balanced representative subsamples via uniform design in PCA space for efficient double machine learning estimation of average treatment effects on large datasets.