Flexible Nonparametric Inference for Causal Effects under the Front-Door Model

Anna Guo; David Benkeser; Razieh Nabi

arxiv: 2312.10234 · v3 · submitted 2023-12-15 · 📊 stat.ME · stat.ML

Pith reviewed 2026-05-24 05:17 UTC · model grok-4.3

keywords: front-door criterion · causal inference · average treatment effect · targeted minimum loss estimation · nonparametric estimation · machine learning · semiparametric models · identification tests

The pith

One-step and targeted estimators recover average treatment effects under front-door assumptions using machine learning nuisances.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops one-step and targeted minimum loss-based estimators for both the average treatment effect and the average treatment effect on the treated when identification relies on the front-door criterion. These estimators work from multiple parameterizations of the observed data distribution, including versions that skip modeling the mediator density, and integrate with flexible machine learning for the nuisance functions. The authors derive second-order remainder bounds that deliver root-n consistency and asymptotic linearity. They also supply tests for the identification assumptions inside a semiparametric extension that encodes generalized independence constraints and show how those constraints can raise efficiency.

Core claim

Under front-door assumptions, novel one-step and targeted minimum loss-based estimators for the average treatment effect and the average treatment effect on the treated can be built from multiple observed-data parameterizations, some of which avoid modeling the mediator density entirely. The estimators remain compatible with machine-learning nuisance estimation. Root-n consistency and asymptotic linearity are obtained once second-order remainder terms are controlled. The same framework yields doubly robust tests for the identification assumptions inside a semiparametric model that encodes generalized Verma constraints, and those constraints can be exploited to improve estimator efficiency.
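For orientation, the target of these estimators can be written out. The display below is the standard front-door functional, in notation assumed here rather than quoted from the paper ($X$ covariates, $A$ binary treatment, $M$ mediator, $Y$ outcome, matching the symbols in the figure captions). The second display is one way a mediator-density-free parameterization can arise, via Bayes' rule; the paper's exact constructions may differ.

```latex
\[
\psi(a_0) \;=\; \int \sum_{a} E[Y \mid M = m, A = a, X = x]\; p(a \mid x)\; p(m \mid A = a_0, x)\; p(x)\; dm\, dx,
\qquad
\mathrm{ATE} \;=\; \psi(1) - \psi(0).
\]

% Ratios of mediator densities can be traded for treatment probabilities:
\[
\frac{p(m \mid a_0, x)}{p(m \mid a, x)}
\;=\;
\frac{p(a_0 \mid m, x)\; p(a \mid x)}{p(a \mid m, x)\; p(a_0 \mid x)}.
\]
```

The right-hand side of the second display involves only regressions of a binary treatment on $(M, X)$ and on $X$, which tend to be easier to fit with standard classifiers than a conditional density for a continuous or multivariate mediator.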

What carries the argument

One-step and targeted minimum loss-based estimators constructed from multiple parameterizations of the observed data law under the front-door model, together with second-order remainder bounds that guarantee asymptotic linearity.
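A minimal sketch of a one-step estimator may make this concrete. It assumes a toy setting that is not from the paper: binary treatment, mediator, and outcome, no covariates, and saturated cell-mean nuisances in place of machine learning. The data-generating process and the function names `simulate` and `one_step` are hypothetical. The influence function coded below is the covariate-free form of the front-door influence function as it is usually stated in the literature (for example in the generalized front-door work listed in the references); it is an assumption of this sketch, not a transcription of the paper's equations.

```python
import random
from collections import Counter
from math import sqrt


def simulate(n, seed):
    """Toy front-door data: hidden U confounds A and Y; M carries all of A's effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.random() < 0.5                           # unmeasured confounder
        a = int(rng.random() < (0.8 if u else 0.2))      # treatment depends on U
        m = int(rng.random() < (0.9 if a else 0.1))      # mediator depends on A only
        y = int(rng.random() < 0.1 + 0.5 * m + 0.3 * u)  # outcome depends on M and U
        rows.append((a, m, y))
    return rows


def one_step(rows, a0):
    """Return (one-step estimate, plug-in estimate, influence values) for E[Y(a0)]."""
    n = len(rows)
    n_a = Counter(a for a, _, _ in rows)
    n_am = Counter((a, m) for a, m, _ in rows)
    sum_y = Counter()
    for a, m, y in rows:
        sum_y[(a, m)] += y
    # Saturated nuisance estimates (empirical cell frequencies and means).
    p_a = {a: n_a[a] / n for a in (0, 1)}                                  # p(a)
    p_m = {(m, a): n_am[(a, m)] / n_a[a] for a in (0, 1) for m in (0, 1)}  # p(m | a)
    mu = {(m, a): sum_y[(a, m)] / n_am[(a, m)] for a in (0, 1) for m in (0, 1)}  # E[Y | m, a]
    xi = {m: sum(mu[(m, a)] * p_a[a] for a in (0, 1)) for m in (0, 1)}
    gamma = {a: sum(mu[(m, a)] * p_m[(m, a0)] for m in (0, 1)) for a in (0, 1)}
    plug_in = sum(xi[m] * p_m[(m, a0)] for m in (0, 1))
    # Estimated influence function evaluated at each observation.
    phi = [
        p_m[(m, a0)] / p_m[(m, a)] * (y - mu[(m, a)])
        + (a == a0) / p_a[a0] * (xi[m] - plug_in)
        + gamma[a] - plug_in
        for a, m, y in rows
    ]
    return plug_in + sum(phi) / n, plug_in, phi


rows = simulate(20000, seed=1)
psi1, plug1, phi1 = one_step(rows, 1)
psi0, plug0, phi0 = one_step(rows, 0)
ate = psi1 - psi0
diff = [p1 - p0 for p1, p0 in zip(phi1, phi0)]
se = sqrt(sum(d * d for d in diff) / len(rows)) / sqrt(len(rows))
print(f"ATE estimate {ate:.3f}, standard error {se:.4f}, truth 0.400")
```

With saturated nuisances the bias-correction term averages to zero up to rounding, so the one-step estimate coincides with the plug-in; what the influence function adds in this toy case is the standard error. The correction only does real work when the nuisances come from regularized or machine-learning fits.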

If this is right

Root-n consistency and asymptotic linearity hold once the second-order remainder terms vanish at the required rate.
Doubly robust tests can assess the front-door identification assumptions inside the semiparametric extension.
Generalized independence constraints can be used to raise the efficiency of the causal-effect estimators.
The methods apply directly to real data in education and emergency-medicine settings with favorable finite-sample behavior.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The mediator-density-free parameterization may reduce sensitivity when the mediator is high-dimensional or continuous.
The same remainder-bound technique could be adapted to other identification strategies that involve mediators.
Pairing the estimators with the doubly robust tests could produce a practical workflow for checking and then exploiting front-door assumptions in observational studies.
Efficiency gains from the independence constraints suggest the approach may scale to richer semiparametric causal models.

Load-bearing premise

The front-door assumptions must hold exactly: the mediator intercepts every directed path from treatment to outcome and shares no unmeasured confounders with the treatment-outcome pair.

What would settle it

A Monte Carlo experiment in which the front-door assumptions hold by construction and the nuisance estimators meet the stated rate conditions, yet the proposed estimators still fail to attain root-n rates, would falsify the consistency result.
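A rough sketch of such an experiment follows. It is a stand-in, not the paper's simulation design: all variables are binary, there are no covariates, the data-generating process is hypothetical, and the estimator is a simple plug-in of the front-door formula with cell means rather than the paper's one-step or TMLE procedures. Because cell means converge at the parametric rate, the rate conditions hold trivially here, so root-n scaled error should stay roughly flat as n grows.

```python
import random
from collections import Counter
from math import sqrt

# In this toy model A shifts P(M = 1) from 0.1 to 0.9 and M raises P(Y = 1) by 0.5.
TRUE_ATE = 0.5 * (0.9 - 0.1)


def simulate(n, rng):
    """Front-door assumptions hold by construction: U affects A and Y but not M."""
    rows = []
    for _ in range(n):
        u = rng.random() < 0.5
        a = int(rng.random() < (0.8 if u else 0.2))
        m = int(rng.random() < (0.9 if a else 0.1))
        y = int(rng.random() < 0.1 + 0.5 * m + 0.3 * u)
        rows.append((a, m, y))
    return rows


def front_door_ate(rows):
    """Plug-in front-door estimate: sum_m p(m | a0) * sum_a p(a) * E[Y | m, a]."""
    n = len(rows)
    n_a = Counter(a for a, _, _ in rows)
    n_am = Counter((a, m) for a, m, _ in rows)
    sum_y = Counter()
    for a, m, y in rows:
        sum_y[(a, m)] += y

    def psi(a0):
        return sum(
            (n_am[(a0, m)] / n_a[a0])
            * sum((n_a[a] / n) * sum_y[(a, m)] / n_am[(a, m)] for a in (0, 1))
            for m in (0, 1)
        )

    return psi(1) - psi(0)


def naive_ate(rows):
    """Unadjusted difference in outcome means, biased by the hidden confounder."""
    def mean_y(a0):
        ys = [y for a, _, y in rows if a == a0]
        return sum(ys) / len(ys)
    return mean_y(1) - mean_y(0)


rng = random.Random(0)
results = {}
for n in (500, 2000, 8000):
    errors = [front_door_ate(simulate(n, rng)) - TRUE_ATE for _ in range(50)]
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    results[n] = rmse
    print(f"n={n:5d}  RMSE={rmse:.4f}  sqrt(n)*RMSE={sqrt(n) * rmse:.2f}")

naive_bias = naive_ate(simulate(8000, rng)) - TRUE_ATE
print(f"naive difference in means is off by {naive_bias:.3f}")
```

A fuller version of the check would replace the cell means with flexible learners whose convergence rates can be controlled, which is where the second-order remainder conditions become binding.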

Figures

Figures reproduced from arXiv: 2312.10234 by Anna Guo, David Benkeser, Razieh Nabi.

**Figure 2.** Two variations of the front-door graph incorporating an anchor variable

**Figure 3.** (a) An example of an anchor-included front-door graph; (b) The conditional graph corresponding to …

**Figure 4.** (a) Fixing M = m induces the independence Z ⊥ Y^m | X in P(X, Z, A, Y^m); (b) Fixing A = a induces the independence Z ⊥ Y^a | X, M^a in P(X, Z, M^a, Y^a); (c) The graph corresponding to P(X, Z, A, M^a, Y^a).

**Figure 5.** Simulation results validating the √n-consistency behaviors of the ATE estimators, under univariate binary mediator: (left) TMLE; (right) one-step estimator.

**Figure 6.** Simulation results validating the √n-consistency behaviors of the ATE estimators, under univariate continuous mediator: (left) TMLEs; (right) one-step estimators.

**Figure 7.** Simulation results validating the √n-consistency behaviors of the ATE estimators, under bivariate continuous mediators: (left) TMLEs; (right) one-step estimators.

**Figure 8.** Simulation results validating the √n-consistency behaviors of the ATE estimators, under quadrivariate continuous mediators: (left) TMLEs; (right) one-step estimators.

**Figure 9.** Simulation results validating the √n-consistency behaviors of the ATT estimators, under univariate binary mediator: (left) TMLEs; (right) one-step estimators.

**Figure 10.** Simulation results validating the √n-consistency behaviors of the ATT estimators, under univariate continuous mediator: (left) TMLEs; (right) one-step estimators.

**Figure 11.** Simulation results validating the √n-consistency behaviors of the ATT estimators, under bivariate continuous mediators: (left) TMLEs; (right) one-step estimators.

**Figure 12.** Simulation results validating the √n-consistency behaviors of the ATT estimators, under quadrivariate continuous mediators: (left) TMLEs; (right) one-step estimators.

**Figure 13.** DAGs used in simulations on model evaluations: DAG1 and DAG2 correspond to scenarios where …

**Figure 14.** Simulation results demonstrating efficiency gains in ATE estimation when utilizing the Verma …

**Figure 15.** Simulation results demonstrating efficiency gains in ATE estimation when utilizing the Verma …

Abstract

Evaluating causal treatment effects in observational studies requires addressing confounding. While the back-door criterion enables identification through adjustment for observed covariates, it fails in the presence of unmeasured confounding. The front-door criterion offers an alternative by leveraging variables that fully mediate the treatment effect and are unaffected by unmeasured confounders of the treatment-outcome pair. We develop novel one-step and targeted minimum loss-based estimators for both the average treatment effect and the average treatment effect on the treated under front-door assumptions. Our estimators are built on multiple parameterizations of the observed data distribution, including approaches that avoid modeling the mediator density entirely, and are compatible with flexible, machine learning-based nuisance estimation. We establish conditions for root-n consistency and asymptotic linearity by deriving second-order remainder bounds. We also develop flexible tests for assessing identification assumptions, including a doubly robust testing procedure, within a semiparametric extension of the front-door model that encodes generalized (Verma) independence constraints. We further show how these constraints can be leveraged to improve the efficiency of causal effect estimators. Simulation studies confirm favorable finite-sample performance, and real-data applications in education and emergency medicine illustrate the practical utility of our methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper supplies one-step and TMLE estimators for the ATE and ATT under the front-door criterion, built from multiple observed-data parameterizations, some of which skip mediator density modeling.

The main point is that the authors give one-step and targeted minimum loss estimators for average treatment effects under the front-door criterion. These work with machine learning nuisances, come in several observed-data forms including ones that avoid the mediator density, and are paired with a doubly robust test for the identification constraints plus efficiency gains from those constraints. They also derive the second-order remainder bounds needed for root-n consistency and asymptotic linearity. The simulations and two applied examples are included to back the claims.

What is actually new is the set of parameterizations and the testing procedure built on the extended front-door model with Verma constraints. The rest follows standard semiparametric efficiency arguments applied to this functional, which is handled cleanly. The paper does well by making the methods flexible and by showing concrete use in education and emergency medicine data. The technical conditions look standard and the avoidance of mediator modeling is a practical plus when that density is difficult.

Soft spots are limited. The front-door assumptions themselves remain strong and must hold exactly, including the mediator fully intercepting the effect and being free of certain unmeasured confounding. The reported finite-sample results depend on the specific simulation setups, which would need checking in review, but the abstract gives no sign of problems there. The efficiency improvements from the extra constraints are likely modest unless the constraints are strong in the data.

This work is aimed at causal inference researchers who need nonparametric tools when back-door adjustment is blocked by unmeasured confounding. A reader focused on front-door methods will get usable estimators and tests. The paper shows clear engagement with the literature and no internal contradictions, so it deserves a serious referee rather than a desk reject.

Referee Report

2 major / 3 minor

Summary. The paper develops novel one-step and targeted minimum loss-based (TMLE) estimators for the average treatment effect (ATE) and average treatment effect on the treated (ATT) under the front-door identification criterion. Estimators are constructed via multiple observed-data parameterizations, including variants that avoid explicit modeling of the mediator density, and are designed to be compatible with flexible machine-learning nuisance estimators. The authors derive second-order remainder bounds to establish root-n consistency and asymptotic linearity, develop doubly robust tests for the front-door assumptions within a semiparametric extension that incorporates generalized (Verma) independence constraints, and demonstrate efficiency gains from those constraints. Finite-sample performance is assessed via simulations, and practical utility is illustrated with applications to education and emergency-medicine data.

Significance. If the second-order remainder derivations and the double-robustness properties hold, the work supplies practically useful, ML-compatible tools for causal estimation when unmeasured confounding precludes back-door adjustment but the front-door criterion applies. The multiple parameterizations (especially those bypassing the mediator density) and the explicit remainder bounds reduce reliance on strong parametric assumptions and provide verifiable conditions for asymptotic linearity. The accompanying tests for identification assumptions and the efficiency results from the Verma constraints are additional contributions that could be adopted in applied work.

major comments (2)

§4 (asymptotic theory): the second-order remainder bounds are load-bearing for the root-n consistency claim. The manuscript must explicitly verify that the product of nuisance estimation rates remains o_p(n^{-1/2}) for each of the proposed parameterizations, including the versions that avoid modeling the mediator density; without this verification the conditions for asymptotic linearity are not fully established for the ML-compatible estimators.
§5.2 (testing procedure): the doubly robust test for the front-door identification assumptions relies on the semiparametric extension with Verma constraints. The construction of the test statistic and the precise form of double robustness should be stated with an explicit influence-function representation so that readers can confirm the claimed robustness property under the stated model.

minor comments (3)

§3: Notation for the multiple observed-data parameterizations (e.g., the distinct expressions for the efficient influence function) should be introduced with a single consolidated table or display to improve readability across Sections 3 and 4.
§6: The simulation section would benefit from reporting the exact nuisance estimators (e.g., specific ML algorithms and tuning) and the precise sample sizes used for each scenario so that the favorable finite-sample results can be reproduced.
§7: A few typographical inconsistencies appear in the real-data application descriptions (variable names and sample-size reporting); these should be harmonized with the corresponding tables.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and constructive comments on our manuscript. The suggestions regarding explicit verification of rate conditions and the influence-function representation for the test will improve clarity. We address each major comment below.

Referee: §4 (asymptotic theory): the second-order remainder bounds are load-bearing for the root-n consistency claim. The manuscript must explicitly verify that the product of nuisance estimation rates remains o_p(n^{-1/2}) for each of the proposed parameterizations, including the versions that avoid modeling the mediator density; without this verification the conditions for asymptotic linearity are not fully established for the ML-compatible estimators.

Authors: We appreciate the referee's emphasis on making the rate conditions fully explicit. Section 4 derives the second-order remainder bounds for all four observed-data parameterizations (including the two that avoid explicit modeling of the mediator density). Under the standard assumption that each nuisance estimator converges at rate o_p(n^{-1/4}), the product terms are o_p(n^{-1/2}) by construction. To strengthen the presentation, we will add a short dedicated paragraph (or remark) in the revised Section 4 that explicitly verifies the product-rate condition for each parameterization separately. Revision: yes.

Referee: §5.2 (testing procedure): the doubly robust test for the front-door identification assumptions relies on the semiparametric extension with Verma constraints. The construction of the test statistic and the precise form of double robustness should be stated with an explicit influence-function representation so that readers can confirm the claimed robustness property under the stated model.

Authors: We agree that an explicit influence-function representation will make the double-robustness property transparent. In the revised Section 5.2 we will state the influence function of the test statistic and briefly derive how the double robustness follows from the semiparametric model that incorporates the Verma constraints. Revision: yes.
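The product-rate condition raised in the §4 exchange can be illustrated with a short piece of arithmetic. The display is a generic sketch: $\hat\eta_1, \hat\eta_2$ stand for any pair of nuisance estimators appearing in a second-order remainder $R_2$, and the specific pairs and norms in the paper's bounds are not reproduced here.

```latex
\[
|R_2| \;\lesssim\; \|\hat\eta_1 - \eta_1\| \cdot \|\hat\eta_2 - \eta_2\|
\;=\; o_p(n^{-1/4}) \cdot o_p(n^{-1/4}) \;=\; o_p(n^{-1/2}).
\]

% The individual rates need not both be n^{-1/4}; only the product matters:
\[
n^{-1/3} \cdot n^{-1/5} = n^{-8/15} = o(n^{-1/2}),
\qquad
n^{-1/5} \cdot n^{-1/5} = n^{-2/5} \neq o(n^{-1/2}).
\]
```

In words, a slower learner for one nuisance can be compensated by a faster one for its partner, but two slow learners cannot be combined to reach the root-n threshold.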

Circularity Check

0 steps flagged

No significant circularity

The paper constructs one-step and TMLE estimators for ATE/ATT under the standard front-door criterion using multiple observed-data parameterizations (including mediator-density-free forms) and derives explicit second-order remainder bounds to establish root-n consistency and asymptotic linearity. These steps apply standard semiparametric efficiency theory to the front-door model; no equation reduces to a fitted input by construction, no load-bearing self-citation chain is invoked for uniqueness or ansatz, and the identification assumptions are stated as external requirements rather than derived internally. The work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the validity of the front-door identification assumptions and the semiparametric extension that encodes generalized independence constraints; no free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: the front-door assumptions hold, i.e., there exists a mediator that fully mediates the treatment effect on the outcome and is unaffected by unmeasured confounders of the treatment-outcome relationship.
This is the core identification assumption invoked for the causal effects to be identified from the observed data distribution.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages

[1] A. Balke and J. Pearl. Counterfactual probabilities: Computational methods, bounds and applications. In Proceedings of UAI-94, pages 46-54, 1994.
[2] M. F. Bellemare, J. R. Bloem, and N. Wexler. The paper of how: Estimating treatment effects using the front-door criterion. Technical report, Working paper, 2019.
[3] D. Benkeser and M. van der Laan. The highly adaptive lasso estimator. In 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pages 689-696. IEEE, 2016.
[4] R. Bhattacharya and R. Nabi. On testability of the front-door model via Verma constraints. In Uncertainty in Artificial Intelligence, pages 202-212. PMLR, 2022.
[5] R. Bhattacharya, R. Nabi, and I. Shpitser. Semiparametric inference for causal effects in graphical models with hidden variables. Journal of Machine Learning Research, 23:1-76, 2022.
[6] P. J. Bickel, C. A. Klaassen, Y. Ritov, and J. A. Wellner. Efficient and adaptive estimation for semiparametric models, volume 4. Johns Hopkins University Press Baltimore, 1993.
[7] V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey, and J. Robins. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 2017.
[8] I. R. Fulcher, I. Shpitser, S. Marealle, and E. J. Tchetgen Tchetgen. Robust inference on population indirect causal effects: The generalized front-door criterion. Journal of the Royal Statistical Society, Series B, 2019.
[9] A. Glynn and K. Kashin. Front-door versus back-door adjustment with unmeasured confounding: Bias formulas for front-door and hybrid adjustments. In 71st Annual Conference of the Midwest Political Science Association, volume 3, 2013.
[10] A. N. Glynn and K. Kashin. Front-door versus back-door adjustment with unmeasured confounding: Bias formulas for front-door and hybrid adjustments with application to a job training program. Journal of the American Statistical Association, 113(523):1040-1049, 2018.
[11] J. Hahn. On the role of the propensity score in efficient semiparametric estimation of average treatment effects. Econometrica, pages 315-331, 1998.
[12] T. Hayfield and J. S. Racine. Nonparametric econometrics: The np package. Journal of Statistical Software, 27:1-32, 2008.
[13] M. A. Hernán and J. M. Robins. Estimating causal effects from epidemiological data. Journal of Epidemiology & Community Health, 60(7):578-586, 2006.
[14] K. Hirano, G. W. Imbens, and G. Ridder. Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71(4):1161-1189, 2003.
[15] Y. Huang and M. Valtorta. Pearl's calculus of interventions is complete. In Twenty Second Conference On Uncertainty in Artificial Intelligence, 2006.
[16] K. Jorma. Life course 1971-2002 [dataset]. Version 2.0, 2018. Finnish Social Science Data Archive [distributor]. http://urn.fi/urn:nbn:fi:fsd:T-FSD2076
[17] T. Kanamori, S. Hido, and M. Sugiyama. A least-squares approach to direct importance estimation. The Journal of Machine Learning Research, 10:1391-1445, 2009.
[18] E. H. Kennedy. Semiparametric doubly robust targeted double machine learning: a review. arXiv preprint arXiv:2203.06469, 2022.
[19] C. F. Manski. Nonparametric bounds on treatment effects. The American Economic Review, 80(2):319-323, 1990.
[20] J. Neyman. Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. Excerpts reprinted (1990) in English. Statistical Science, 5:463-472, 1923.
[21] J. Pearl. Causal diagrams for empirical research. Biometrika, 82(4):669-688, 1995a.
[22] J. Pearl. Causal diagrams for empirical research. Biometrika, 82(4):669-709, 1995b. URL citeseer.ist.psu.edu/55450.html
[23] J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2 edition, 2009. ISBN 978-0521895606.
[24] T. S. Richardson and J. M. Robins. Single world intervention graphs (SWIGs): A unification of the counterfactual and graphical approaches to causality. 2013.
[25] T. S. Richardson, R. J. Evans, J. M. Robins, and I. Shpitser. Nested Markov properties for acyclic directed mixed graphs. arXiv preprint arXiv:1701.06686, 2017.
[26] J. M. Robins. A new approach to causal inference in mortality studies with sustained exposure periods - application to control of the healthy worker survivor effect. Mathematical Modeling, 7:1393-1512, 1986.
[27] J. M. Robins, A. Rotnitzky, and L. P. Zhao. Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427):846-866, 1994a.
[28] J. M. Robins, A. Rotnitzky, and L. P. Zhao. Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89:846-866, 1994b.
[29] J. M. Robins, A. Rotnitzky, and D. O. Scharfstein. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In Statistical models in epidemiology, the environment, and clinical trials, pages 1-94. Springer, 2000.
[30] P. R. Rosenbaum and D. B. Rubin. The central role of the propensity score in observational studies for causal effects. Biometrika, 70:41-55, 1983.
[31] D. B. Rubin. Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66:688-701, 1974.
[32] D. O. Scharfstein, R. Nabi, E. H. Kennedy, M.-Y. Huang, M. Bonvini, and M. Smid. Semiparametric sensitivity analysis: Unmeasured confounding in observational studies. arXiv preprint arXiv:2104.08300, 2021.
[33] I. Shpitser and J. Pearl. Identification of joint interventional distributions in recursive semi-Markovian causal models. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06). AAAI Press, Palo Alto, 2006.
[34] M. Sugiyama, S. Nakajima, H. Kashima, P. Buenau, and M. Kawanabe. Direct importance estimation with model selection and its application to covariate shift adaptation. Advances in Neural Information Processing Systems, 20, 2007.
[35] M. Sugiyama, M. Kawanabe, and P. L. Chui. Dimensionality reduction for density ratio estimation in high-dimensional spaces. Neural Networks, 23(1):44-59, 2010.
[36] J. Tian and J. Pearl. A general identification condition for causal effects. In Eighteenth National Conference on Artificial Intelligence, pages 567-573, 2002. ISBN 0-262-51129-0.
[37] A. Tsiatis. Semiparametric theory and missing data. Springer Science & Business Media, 2007.
[38] M. J. van der Laan and D. Rubin. Targeted maximum likelihood learning. The International Journal of Biostatistics, 2(1), 2006.
[39] M. J. van der Laan, E. C. Polley, and A. E. Hubbard. Super learner. Statistical Applications in Genetics and Molecular Biology, 6(1), 2007.
[40] M. J. van der Laan, S. Rose, et al. Targeted learning: causal inference for observational and experimental data, volume 4. Springer, 2011.
[41] A. van der Vaart and J. A. Wellner. Empirical processes. In Weak Convergence and Empirical Processes: With Applications to Statistics, pages 127-384. Springer, 2023.
[42] A. W. van der Vaart. Asymptotic Statistics, volume 3. Cambridge University Press, 2000.
[43] T. S. Verma and J. Pearl. Equivalence and synthesis of causal models. Technical Report R-150, Department of Computer Science, University of California, Los Angeles, 1990.
[44] L. Wen, A. L. Sarvet, and M. J. Stensrud. Causal effects of intervening variables in settings with unmeasured confounding. arXiv preprint arXiv:2305.00349, 2023.
[45] M. Yamada, T. Suzuki, T. Kanamori, H. Hachiya, and M. Sugiyama. Relative density-ratio estimation for robust distribution comparison. Neural Computation, 25(5):1324-1370, 2013.
[46] W. Zheng and M. J. van der Laan. Asymptotic theory for cross-validated targeted maximum likelihood estimation. 2010.

[1] [1]

Balke and J

A. Balke and J. Pearl. Counterfactual probabilities: Computational methods, bounds and applications. In Proceedings of UAI-94, pages 46--54, 1994

work page 1994

[2] [2]

M. F. Bellemare, J. R. Bloem, and N. Wexler. The paper of how: Estimating treatment effects using the front-door criterion. Technical report, Working paper, 2019

work page 2019

[3] [3]

Benkeser and M

D. Benkeser and M. Van Der Laan. The highly adaptive lasso estimator. In 2016 IEEE international conference on data science and advanced analytics (DSAA), pages 689--696. IEEE, 2016

work page 2016

[4] [4]

Bhattacharya and R

R. Bhattacharya and R. Nabi. On testability of the front-door model via verma constraints. In Uncertainty in Artificial Intelligence, pages 202--212. PMLR, 2022

work page 2022

[5] [5]

Bhattacharya, R

R. Bhattacharya, R. Nabi, and I. Shpitser. Semiparametric inference for causal effects in graphical models with hidden variables. Journal of Machine Learning Research, 23: 0 1--76, 2022

work page 2022

[6] [6]

P. J. Bickel, C. A. Klaassen, Y. Ritov, and J. A. Wellner. Efficient and adaptive estimation for semiparametric models, volume 4. Johns Hopkins University Press Baltimore, 1993

work page 1993

[7] [7]

Chernozhukov, D

V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey, and J. Robins. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 2017

work page 2017

[8] [8]

I. R. Fulcher, I. Shpitser, S. Marealle, and E. J. Tchetgen Tchetgen . Robust inference on population indirect causal effects: The generalized front-door criterion. Journal of the Royal Statistical Society, Series B, 2019

work page 2019

[9] [9]

Glynn and K

A. Glynn and K. Kashin. Front-door versus back-door adjustment with unmeasured confounding: Bias formulas for front-door and hybrid adjustments. In 71st Annual Conference of the Midwest Political Science Association, volume 3, 2013

work page 2013

[10] [10]

A. N. Glynn and K. Kashin. Front-door versus back-door adjustment with unmeasured confounding: Bias formulas for front-door and hybrid adjustments with application to a job training program. Journal of the American Statistical Association, 113 0 (523): 0 1040--1049, 2018

work page 2018

[11] [11]

J. Hahn. On the role of the propensity score in efficient semiparametric estimation of average treatment effects. Econometrica, pages 315--331, 1998

work page 1998

[12] [12]

Hayfield and J

T. Hayfield and J. S. Racine. Nonparametric econometrics: The np package. Journal of statistical software, 27: 0 1--32, 2008

work page 2008

[13] [13]

M. A. Hern \'a n and J. M. Robins. Estimating causal effects from epidemiological data. Journal of Epidemiology & Community Health, 60 0 (7): 0 578--586, 2006

work page 2006

[14] [14]

Hirano, G

K. Hirano, G. W. Imbens, and G. Ridder. Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71 0 (4): 0 1161--1189, 2003

work page 2003

[15] [15]

Huang and M

Y. Huang and M. Valtorta. Pearl's calculus of interventions is complete. In Twenty Second Conference On Uncertainty in Artificial Intelligence, 2006

work page 2006

[16] [16]

K. Jorma. Life course 1971-2002 [dataset]. version 2.0, 2018. Finnish Social Science Data Archive [distributor]. http://urn.fi/urn:nbn:fi:fsd:T-FSD2076

work page 1971

[17] [17]

Kanamori, S

T. Kanamori, S. Hido, and M. Sugiyama. A least-squares approach to direct importance estimation. The Journal of Machine Learning Research, 10: 0 1391--1445, 2009

work page 2009

[18] [18]

E. H. Kennedy. Semiparametric doubly robust targeted double machine learning: a review. arXiv preprint arXiv:2203.06469, 2022

work page arXiv 2022

[19] [19]

C. F. Manski. Nonparametric bounds on treatment effects. The American Economic Review, 80 0 (2): 0 319--323, 1990

work page 1990

[20] [20]

J. Neyman. Sur les applications de la thar des probabilities aux experiences agaricales: Essay des principle. excerpts reprinted (1990) in E nglish. Statistical Science, 5: 0 463--472, 1923

work page 1990

[21] [21]

J. Pearl. Causal diagrams for empirical research. Biometrika, 82 0 (4): 0 669--688, 1995 a

work page 1995

[22] [22]

J. Pearl. Causal diagrams for empirical research. Biometrika, 82 0 (4): 0 669--709, 1995 b . URL citeseer.ist.psu.edu/55450.html

[23]

J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2nd edition, 2009. ISBN 978-0521895606

[24]

T. S. Richardson and J. M. Robins. Single world intervention graphs (SWIGs): A unification of the counterfactual and graphical approaches to causality. 2013

[25]

T. S. Richardson, R. J. Evans, J. M. Robins, and I. Shpitser. Nested Markov properties for acyclic directed mixed graphs. arXiv preprint arXiv:1701.06686, 2017

[26]

J. M. Robins. A new approach to causal inference in mortality studies with sustained exposure periods -- application to control of the healthy worker survivor effect. Mathematical Modeling, 7: 1393--1512, 1986

[27]

J. M. Robins, A. Rotnitzky, and L. P. Zhao. Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427): 846--866, 1994a

[28]

J. M. Robins, A. Rotnitzky, and L. P. Zhao. Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89: 846--866, 1994b

[29]

J. M. Robins, A. Rotnitzky, and D. O. Scharfstein. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In Statistical models in epidemiology, the environment, and clinical trials, pages 1--94. Springer, 2000

[30]

P. R. Rosenbaum and D. B. Rubin. The central role of the propensity score in observational studies for causal effects. Biometrika, 70: 41--55, 1983

[31]

D. B. Rubin. Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66: 688--701, 1974

[32]

D. O. Scharfstein, R. Nabi, E. H. Kennedy, M.-Y. Huang, M. Bonvini, and M. Smid. Semiparametric sensitivity analysis: Unmeasured confounding in observational studies. arXiv preprint arXiv:2104.08300, 2021

[33]

I. Shpitser and J. Pearl. Identification of joint interventional distributions in recursive semi-Markovian causal models. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06). AAAI Press, Palo Alto, 2006

[34]

M. Sugiyama, S. Nakajima, H. Kashima, P. Buenau, and M. Kawanabe. Direct importance estimation with model selection and its application to covariate shift adaptation. Advances in neural information processing systems, 20, 2007

[35]

M. Sugiyama, M. Kawanabe, and P. L. Chui. Dimensionality reduction for density ratio estimation in high-dimensional spaces. Neural Networks, 23(1): 44--59, 2010

[36]

J. Tian and J. Pearl. A general identification condition for causal effects. In Eighteenth National Conference on Artificial Intelligence, pages 567--573, 2002. ISBN 0-262-51129-0

[37]

A. Tsiatis. Semiparametric theory and missing data. Springer Science & Business Media, 2007

[38]

M. J. van der Laan and D. Rubin. Targeted maximum likelihood learning. The International Journal of Biostatistics, 2(1), 2006

[39]

M. J. Van der Laan, E. C. Polley, and A. E. Hubbard. Super learner. Statistical applications in genetics and molecular biology, 6(1), 2007

[40]

M. J. van der Laan, S. Rose, et al. Targeted learning: causal inference for observational and experimental data, volume 4. Springer, 2011

[41]

A. van der Vaart and J. A. Wellner. Empirical processes. In Weak Convergence and Empirical Processes: With Applications to Statistics, pages 127--384. Springer, 2023

[42]

A. W. van der Vaart. Asymptotic Statistics, volume 3. Cambridge University Press, 2000

[43]

T. S. Verma and J. Pearl. Equivalence and synthesis of causal models. Technical Report R-150, Department of Computer Science, University of California, Los Angeles, 1990

[44]

L. Wen, A. L. Sarvet, and M. J. Stensrud. Causal effects of intervening variables in settings with unmeasured confounding. arXiv preprint arXiv:2305.00349, 2023

[45]

M. Yamada, T. Suzuki, T. Kanamori, H. Hachiya, and M. Sugiyama. Relative density-ratio estimation for robust distribution comparison. Neural computation, 25(5): 1324--1370, 2013

[46]

W. Zheng and M. J. Van Der Laan. Asymptotic theory for cross-validated targeted maximum likelihood estimation. 2010
