
arxiv: 2102.12034 · v1 · pith:JH36PSG6new · submitted 2021-02-24 · 📊 stat.ME · math.ST · stat.TH

Semiparametric counterfactual density estimation

classification 📊 stat.ME math.ST stat.TH
keywords density models counterfactual distances generic model results available

Causal effects are often characterized with averages, which can give an incomplete picture of the underlying counterfactual distributions. Here we consider estimating the entire counterfactual density and generic functionals thereof. We focus on two kinds of target parameters. The first is a density approximation, defined by a projection onto a finite-dimensional model using a generalized distance metric, which includes f-divergences as well as $L_p$ norms. The second is the distance between counterfactual densities, which can be used as a more nuanced effect measure than the mean difference, and as a tool for model selection. We study nonparametric efficiency bounds for these targets, giving results for smooth but otherwise generic models and distances. Importantly, we show how these bounds connect to means of particular non-trivial functions of counterfactuals, linking the problems of density and mean estimation. We go on to propose doubly robust-style estimators for the density approximations and distances, and study their rates of convergence, showing they can be optimally efficient in large nonparametric models. We also give analogous methods for model selection and aggregation, when many models may be available and of interest. Our results all hold for generic models and distances, but throughout we highlight what happens for particular choices, such as $L_2$ projections on linear models, and KL projections on exponential families. Finally we illustrate by estimating the density of CD4 count among patients with HIV, had all been treated with combination therapy versus zidovudine alone, as well as a density effect. Our results suggest combination therapy may have increased CD4 count most for high-risk patients. Our methods are implemented in the freely available R package npcausal on GitHub.
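The link between density and mean estimation mentioned in the abstract can be illustrated with a minimal sketch. It assumes an $L_2$ projection onto a linear model with an orthonormal basis $b$ on $[0, 1]$, in which case the projection coefficients reduce to the counterfactual means $E[b(Y^a)]$, each of which can be estimated with a standard doubly robust (augmented inverse-probability-weighted) average. Everything here is hypothetical and for illustration only: the function names, the cosine basis, the single binary covariate, and the simulated data are assumptions, and none of it is the npcausal API or the authors' implementation.

```python
import math
import random


def cosine_basis(y, k):
    """First k orthonormal cosine basis functions on [0, 1], evaluated at y."""
    return [1.0] + [math.sqrt(2.0) * math.cos(math.pi * j * y) for j in range(1, k)]


def dr_projection_coefs(data, a, k):
    """Doubly robust-style estimate of E[b(Y^a)] for a binary covariate X.

    data: list of (x, treatment, y) tuples with x in {0, 1} and y in [0, 1].
    Nuisance functions are estimated by cell means, which is only reasonable
    here because X is discrete (an assumption of this sketch).
    """
    n = len(data)
    pi, mu = {}, {}
    for x in (0, 1):
        cell = [row for row in data if row[0] == x]
        arm = [row for row in cell if row[1] == a]
        # Propensity score P(A = a | X = x).
        pi[x] = len(arm) / len(cell)
        # Outcome regressions E[b_j(Y) | X = x, A = a], one per basis function.
        sums = [0.0] * k
        for _, _, y in arm:
            for j, bj in enumerate(cosine_basis(y, k)):
                sums[j] += bj
        mu[x] = [s / len(arm) for s in sums]
    coefs = [0.0] * k
    for x, t, y in data:
        b = cosine_basis(y, k)
        for j in range(k):
            # Regression prediction plus inverse-probability-weighted residual.
            correction = (b[j] - mu[x][j]) / pi[x] if t == a else 0.0
            coefs[j] += (mu[x][j] + correction) / n
    return coefs


def density(y, coefs):
    """Projected density b(y)^T beta at the point y."""
    return sum(c * b for c, b in zip(coefs, cosine_basis(y, len(coefs))))


# Simulated data (hypothetical): treatment depends on X, so it is confounded.
random.seed(0)
data, y1_all = [], []
for _ in range(4000):
    x = 1 if random.random() < 0.5 else 0
    t = 1 if random.random() < 0.3 + 0.4 * x else 0
    y1 = random.betavariate(2 + 2 * x, 2)   # potential outcome under treatment
    y0 = random.betavariate(2, 2 + 2 * x)   # potential outcome under control
    data.append((x, t, y1 if t == 1 else y0))
    y1_all.append(y1)

coefs = dr_projection_coefs(data, a=1, k=4)
# Oracle coefficients from the (normally unobserved) potential outcomes.
oracle = [sum(cosine_basis(y, 4)[j] for y in y1_all) / len(y1_all) for j in range(4)]
print([round(density(y, coefs), 3) for y in (0.25, 0.5, 0.75)])
```

With saturated cell-mean nuisance estimates, as used here, the correction terms sum to zero within each covariate cell, so the estimate coincides with the plug-in estimator. The correction term matters once the nuisance functions are smoothed or fit with flexible regression, which is the setting the abstract has in mind.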

This paper has not been read by Pith yet.


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Evaluating causal indirect effects when mediators are left-censored by assay limit of quantification

    stat.ME 2026-05 unverdicted novelty 7.0

    A semi-parametric framework using fractional imputation and EM algorithm for estimating causal direct and indirect effects with left-censored mediators due to assay limits.

  2. Evaluating causal indirect effects when mediators are left-censored by assay limit of quantification

    stat.ME 2026-05 unverdicted novelty 6.0

    Proposes a fractional imputation plus semi-parametric EM framework for estimating natural direct and indirect effects under deterministic left-censoring of the mediator by assay limit of quantification.

  3. A Semi-Supervised Kernel Two-Sample Test

    stat.ML 2026-05 unverdicted novelty 6.0

    A semi-supervised kernel two-sample test integrates unlabeled covariate data to achieve asymptotic normality under the null, higher power than standard kernel tests, and consistency against fixed and local alternatives.