Efficient Estimation of Average Treatment Effect on the Treated under Endogenous Treatment Assignment

Jiawei Shan; Jiwei Zhao; Menggang Yu; Trinetri Ghosh

arxiv: 2307.01908 · v2 · submitted 2023-07-04 · 📊 stat.ME

Efficient Estimation of Average Treatment Effect on the Treated under Endogenous Treatment Assignment

Trinetri Ghosh , Jiawei Shan , Menggang Yu , Jiwei Zhao This is my paper

Pith reviewed 2026-05-24 08:17 UTC · model grok-4.3

classification 📊 stat.ME

keywords average treatment effect on the treatedendogenous treatment assignmentshadow variablessemiparametric efficiency boundcausal inferenceefficient estimation

0 comments

The pith

Shadow variables unrelated to treatment but linked to outcomes identify the ATT under endogenous assignment and support an estimator attaining the semiparametric efficiency bound.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the average treatment effect on the treated remains identifiable even when treatment assignment depends on unmeasured factors. Identification rests on the existence of shadow variables that share no direct link with treatment choice yet correlate with the outcomes. From there the authors examine the geometry of the likelihood to locate the lowest attainable variance for estimators of this quantity and build an estimator that reaches the bound. This matters for policy questions in which treatments are chosen rather than randomly assigned, because it supplies both a way to recover the effect and a guarantee that the estimator uses the data as efficiently as possible. The work supplies asymptotic theory for the estimator together with simulation checks and a real-data illustration.

Core claim

By introducing shadow variables that are unrelated to the treatment assignment but related to the outcomes, the average treatment effect on the treated becomes identifiable under endogenous selection. Characterizing the geometric structure of the likelihood then produces the semiparametric efficiency bound for ATT estimation, and an estimator is constructed that attains this bound.

What carries the argument

Shadow variables (variables independent of treatment assignment yet associated with outcomes), which supply the identifying variation and allow derivation of the efficiency bound from the likelihood geometry.

If this is right

The ATT is identified without requiring unconfoundedness or an instrument directly linked to treatment.
An estimator exists that is asymptotically efficient for the ATT, attaining the semiparametric bound.
The estimator satisfies consistency and asymptotic normality under the stated conditions.
Finite-sample behavior is reliable in simulations and in the motivating empirical example.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar shadow-variable arguments might identify other policy-relevant quantities such as the average treatment effect on the untreated.
Applications will require subject-matter knowledge to locate variables that are plausibly unrelated to treatment choice.
The approach could be compared with instrumental-variable methods when the latter rely on weak instruments.
Sensitivity checks that vary the assumed strength of the shadow-variable outcome association would be a natural practical extension.

Load-bearing premise

Valid shadow variables exist that bear no relation to treatment assignment yet relate to the outcomes of interest.

What would settle it

In data generated exactly from a model satisfying the shadow-variable independence and relevance conditions, the proposed estimator fails to match the derived efficiency bound in large samples.

read the original abstract

In this paper, we consider estimation of average treatment effect on the treated (ATT), an interpretable and relevant causal estimand to policy makers when treatment assignment is endogenous. By considering shadow variables that are unrelated to the treatment assignment but related to the outcomes of interest, we establish identification of the ATT. Then we focus on efficient estimation of the ATT by characterizing the geometric structure of the likelihood, deriving the semiparametric efficiency bound for ATT estimation and proposing an estimator that can achieve this bound. We rigorously establish the theoretical results of the proposed estimator. The finite sample performance of the proposed estimator is studied through comprehensive simulation studies as well as an application to our motivating study.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper uses shadow variables to identify ATT under endogenous treatment and derives the semiparametric efficiency bound, but the approach stands or falls on whether those variables can be found and justified in practice.

read the letter

The main contribution here is the use of shadow variables—variables independent of treatment assignment but associated with the outcome—to identify the ATT when assignment is endogenous. They then characterize the tangent space of the observed-data likelihood to get the efficiency bound and build an estimator that attains it. This is framed as new for the ATT specifically rather than the ATE. The simulations and the application to their motivating example give some sense of finite-sample behavior, which is a plus for a methods paper. The theoretical claims are stated directly, and the geometric likelihood argument appears standard rather than circular. The citation pattern follows the usual semiparametric causal inference references without obvious gaps or self-referential loops. The central soft spot is the identifying assumption itself. Shadow variables must satisfy the independence-from-treatment and dependence-on-outcome conditions, and the abstract treats their existence as given. In applications this could prove as hard to defend as a valid instrument, and the paper would be stronger with more concrete guidance on finding or testing them. Without the full derivations it is also unclear whether any regularity conditions are left implicit in the efficiency bound. This is aimed at statisticians working on causal methods with endogeneity who care about the treated subpopulation. A reader already comfortable with semiparametric efficiency arguments would get the most from it. I would send it to peer review; the topic is relevant and the strategy is coherent enough that referees can usefully check the technical details and the strength of the identifying conditions.

Referee Report

2 major / 2 minor

Summary. The paper proposes identifying the average treatment effect on the treated (ATT) under endogenous treatment assignment via shadow variables that are independent of treatment but associated with outcomes. It then characterizes the geometric structure of the observed-data likelihood to derive the semiparametric efficiency bound for ATT and constructs an estimator attaining the bound, with accompanying asymptotic theory, simulation studies, and a real-data application.

Significance. If the shadow-variable conditions hold and the efficiency-bound derivation is correct, the work supplies a principled semiparametric route to ATT estimation in settings where standard unconfoundedness fails, which is relevant for policy evaluation. The explicit tangent-space argument and attainment result would be a useful addition to the literature on efficient estimation under partial identification or auxiliary variables.

major comments (2)

[§3.2] §3.2, identification result: the paper states that shadow variables Z satisfy Z ⊥ A | X but Z ⊥̸ Y | A,X; however, the precise conditional independence statements used to recover the ATT functional from the observed-data distribution are not stated as a formal theorem with all regularity conditions, making it difficult to verify whether the mapping from the shadow-variable model to the ATT parameter is one-to-one without additional assumptions on the support of Z.
[§4.3] §4.3, efficiency bound derivation: the tangent-space projection that yields the efficient influence function for ATT appears to rely on the score for the conditional density of Y given A,X,Z; it is unclear whether the resulting EIF remains valid when the shadow variables are high-dimensional or when the conditional independence Z ⊥ A | X is only approximate, which would affect the claim that the proposed estimator attains the bound under the stated model.

minor comments (2)

Notation for the observed-data likelihood and the nuisance functions (e.g., propensity score, outcome regressions) is introduced piecemeal; a single table collecting all symbols and their definitions would improve readability.
[§5] The simulation section reports bias and RMSE but does not include coverage probabilities for the proposed confidence intervals; adding these would strengthen the finite-sample evidence.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments. We address each major comment below and indicate planned revisions to the manuscript.

read point-by-point responses

Referee: [§3.2] §3.2, identification result: the paper states that shadow variables Z satisfy Z ⊥ A | X but Z ⊥̸ Y | A,X; however, the precise conditional independence statements used to recover the ATT functional from the observed-data distribution are not stated as a formal theorem with all regularity conditions, making it difficult to verify whether the mapping from the shadow-variable model to the ATT parameter is one-to-one without additional assumptions on the support of Z.

Authors: We agree that a formal statement would enhance verifiability. In the revised manuscript we will restate the identification result as a formal theorem that explicitly lists the conditional independence conditions (Z ⊥ A | X and Z ⊥̸ Y | A, X), all regularity conditions, and the required support assumptions on Z that guarantee the mapping from the observed-data distribution to the ATT is one-to-one. revision: yes
Referee: [§4.3] §4.3, efficiency bound derivation: the tangent-space projection that yields the efficient influence function for ATT appears to rely on the score for the conditional density of Y given A,X,Z; it is unclear whether the resulting EIF remains valid when the shadow variables are high-dimensional or when the conditional independence Z ⊥ A | X is only approximate, which would affect the claim that the proposed estimator attains the bound under the stated model.

Authors: The EIF and efficiency bound are derived under the exact model in which Z ⊥ A | X holds precisely and Z is of fixed, low dimension. The tangent-space argument is therefore valid inside this model. When Z is high-dimensional or the independence holds only approximately, the stated EIF does not apply and additional techniques would be required. We will insert a clarifying remark in §4.3 that explicitly delimits the scope of the result to the exact, low-dimensional model. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper's central chain proceeds from an external identifying assumption (existence of shadow variables independent of treatment but dependent on outcomes) to identification of ATT, followed by a standard semiparametric construction: characterization of the tangent space of the observed-data likelihood, derivation of the efficiency bound, and construction of an estimator attaining it. No step reduces by definition to its own fitted parameters, renames a known result as a new derivation, or relies on a load-bearing self-citation whose content is itself unverified within the paper. The geometric argument for the efficiency bound is presented as following from the likelihood structure under the stated assumptions and is therefore independent of the target estimand by construction. This is the normal, non-circular case for a semiparametric efficiency analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests primarily on the domain assumption of valid shadow variables for identification; no free parameters or invented entities beyond the shadow variables concept are evident from the abstract.

axioms (1)

domain assumption Existence of shadow variables unrelated to treatment assignment but related to outcomes of interest
Invoked to establish identification of the ATT under endogenous treatment (abstract).

invented entities (1)

shadow variables no independent evidence
purpose: Enable identification of ATT when treatment assignment is endogenous
New concept introduced in the paper for this identification purpose.

pith-pipeline@v0.9.0 · 5644 in / 1192 out tokens · 27202 ms · 2026-05-24T08:17:09.158070+00:00 · methodology

Efficient Estimation of Average Treatment Effect on the Treated under Endogenous Treatment Assignment

Core claim

What carries the argument

If this is right

Where Pith is reading between the lines

Load-bearing premise

What would settle it

discussion (0)