Instrumental Variable Analysis Without Structural Equations

Arthur Gretton; Aurélien Bibaut; Dimitri Meunier; Houssam Zenati; Nathan Kallus; Zikai Shen

arXiv: 2604.24660 · v1 · submitted 2026-04-27 · stat.ML · math.ST · stat.ME · stat.TH

Pith reviewed 2026-05-07 17:50 UTC · model grok-4.3

keywords: instrumental variables · debiased inference · inverse problems · least squares · structural models · causal inference · misspecification · robust estimation

The pith

Debiased inference on least-squares solutions to inverse problems remains valid for instrumental variables even when structural equations hold only approximately.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets inference on the least-squares solution to an inverse problem rather than on quantities that require an exact solution to a structural equation. This least-squares target is always defined inside the statistical model and equals the usual estimand whenever an exact solution happens to exist. Traditional IV methods can lose validity when the motivating structural model is misspecified, but the new procedure separates the motivational role of structural models from the statistical validity of the inference. Readers would care because many applied datasets satisfy structural models only approximately yet still require reliable uncertainty statements.

Core claim

We consider debiased inference on least-squares solutions to inverse problems as a way to avoid having to assume exact solutions exist. Such assumptions are substantive and not innocuous and their failure may well imperil inference when we impose them on the statistical model. Our approach instead allows us to conduct inference on a quantity that is defined regardless of solutions existing and coincides with the usual estimands when they do. For the case of instrumental variables, this means we can motivate the analysis with structural models but these do not need to hold exactly for the inferential procedure to remain valid.

What carries the argument

Debiased inference targeting the least-squares solution to the inverse problem, which stays well-defined in the statistical model without requiring exact structural solutions.
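
As a notational sketch of that target, the nonparametric IV problem and its least-squares relaxation might be written as follows. The symbols T, r, h and the choice of L^2 spaces are assumptions introduced here for illustration; they are not taken from the paper.

```latex
% Assumed notation: T is the conditional-expectation operator,
% r is the reduced-form regression of the outcome on the instrument.
\[
  (Th)(Z) = \mathbb{E}[\,h(X) \mid Z\,], \qquad r(Z) = \mathbb{E}[\,Y \mid Z\,].
\]
% Structural equation: an exact solution may fail to exist.
\[
  Th = r .
\]
% Least-squares relaxation: defined through the statistical model alone.
\[
  h^\star \in \arg\min_{h \in L^2(X)} \;
  \mathbb{E}\big[\,\big(r(Z) - (Th)(Z)\big)^2\,\big].
\]
```

If some h_0 satisfies T h_0 = r, the minimum value is zero and the minimizers are exactly the solutions of the structural equation, which is the sense in which the least-squares target coincides with the usual estimand.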

If this is right

Inference procedures remain valid under approximate rather than exact structural assumptions.
The target quantity coincides with traditional IV estimands whenever the structural model holds exactly.
Structural models can be used for motivation and interpretation without endangering statistical validity.
The same debiased framework applies to other inverse problems where exact solutions are not guaranteed.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may simplify robustness checks by letting analysts focus on coverage for the statistical target instead of structural parameters.
It could extend naturally to other settings that currently rely on exact solutions to ill-posed inverse problems.
Practitioners might combine the method with partial-identification bounds when even the least-squares target is only partially identified.

Load-bearing premise

The least-squares solution to the inverse problem is a well-defined and estimable target, and the debiased inference procedure works without needing additional exact structural constraints.

What would settle it

Generate data from a distribution with no exact structural-equation solution, run the debiased procedure, and check whether the resulting confidence intervals achieve nominal coverage for the known least-squares target.
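
A minimal sketch of such a check, under loudly stated assumptions: the design (a four-level instrument, the arrays A and G, the confounder u) is hypothetical, and the estimator is a simple plug-in with an influence-function standard error for a scalar linear target. It is a stand-in for illustration, not the paper's debiased procedure.

```python
import numpy as np

# Hypothetical design: Z uniform on {0, 1, 2, 3}, scalar treatment X.
A = np.array([0.5, 1.0, 1.5, 2.0])  # E[X | Z = z]
G = A ** 2                          # E[Y | Z = z]; not proportional to A,
                                    # so E[Y - theta * X | Z] = 0 has no solution
# Least-squares target: argmin over theta of E[(E[Y | Z] - theta * E[X | Z])^2].
THETA_STAR = np.sum(A * G) / np.sum(A ** 2)


def simulate(n, rng):
    """Draw (Z, X, Y) with an unobserved confounder u shared by X and Y."""
    z = rng.integers(0, len(A), size=n)
    u = rng.normal(size=n)
    x = A[z] + u + rng.normal(size=n)
    y = G[z] + u + rng.normal(size=n)
    return z, x, y


def ls_iv(z, x, y):
    """Plug-in estimate of THETA_STAR and an influence-function standard error."""
    k = len(A)
    counts = np.bincount(z, minlength=k)
    mx = (np.bincount(z, weights=x, minlength=k) / counts)[z]  # cell means of X
    my = (np.bincount(z, weights=y, minlength=k) / counts)[z]  # cell means of Y
    d = np.mean(mx ** 2)
    theta = np.mean(mx * my) / d
    # First term: the usual IV score. Second term: a correction that vanishes
    # when E[Y | Z] = theta * E[X | Z] holds exactly.
    phi = (mx * (y - theta * x) + (my - theta * mx) * (x - mx)) / d
    se = phi.std(ddof=1) / np.sqrt(len(y))
    return theta, se


def coverage(n=2000, reps=400, seed=0):
    """Fraction of nominal 95% Wald intervals that contain THETA_STAR."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        theta, se = ls_iv(*simulate(n, rng))
        hits += abs(theta - THETA_STAR) <= 1.96 * se
    return hits / reps


if __name__ == "__main__":
    print(f"target = {THETA_STAR:.4f}, coverage = {coverage():.3f}")
```

Coverage far below 95% in a design like this would count against the claim for this simplified estimator; coverage near 95% is only consistent with it, since a single hypothetical design cannot establish validity in general.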

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes debiased inference targeting the least-squares solution to an inverse problem (e.g., the population minimizer of E[(Y - Xθ)^2] or its IV analogue) rather than requiring an exact structural solution to exist. The central claim is that this target is always well-defined in the statistical model, coincides with classical IV estimands when exact solutions exist, and permits valid inference even when structural equations fail to hold exactly.

Significance. If the asymptotic normality and consistency results hold under the stated conditions, the approach would allow IV analyses to be motivated by structural models without requiring those models to be exactly correct, reducing the risk that misspecification invalidates inference. This is a substantive relaxation of a common assumption in causal inference and could broaden the applicability of IV methods in settings where exact linearity or invertibility is implausible.

major comments (2)

[Section 2 (target definition) and main theorem] The definition of the target θ* as arg min_θ E[(Y - Xθ)^2] (or IV analogue) does not specify a selection rule when the relevant design operator has a non-trivial kernel. Section 2 and the statement of the main result (around the debiased estimator) implicitly treat θ* as unique for the influence function derivation and asymptotic normality to be well-defined; without an explicit minimum-norm or other tie-breaker, the target is not uniquely identified in the statistical model and the validity claim without structural constraints does not follow.
[Introduction and Section 4 (equivalence)] The claim that the procedure 'coincides with the usual estimands when they do' (abstract and introduction) requires showing that, under the additional assumption that an exact solution exists, the least-squares target equals the structural parameter and the debiased estimator recovers the same asymptotic distribution as standard IV estimators. No such equivalence derivation is provided, leaving open whether the relaxation preserves the classical properties when the stronger assumption holds.
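
The non-uniqueness raised in the first major comment can be seen in a finite-sample toy example. The sketch below uses a made-up rank-deficient design (the matrix X and vector y are illustrative, not from the paper) and shows that the least-squares argmin is a whole line, while the minimum-norm rule picks a single point on it.

```python
import numpy as np

# Toy rank-deficient design: the two columns are identical, so the kernel
# of X is spanned by (1, -1) and the least-squares argmin is not unique.
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([1.0, 2.0, 2.0])


def sse(theta):
    """Sum of squared residuals for a candidate coefficient vector."""
    return float(np.sum((y - X @ theta) ** 2))


# Minimum-norm least-squares solution via the Moore-Penrose pseudoinverse.
theta_min_norm = np.linalg.pinv(X) @ y

# np.linalg.lstsq also returns the smallest-norm minimizer when X is rank-deficient.
theta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

# Shifting along the kernel leaves the fit unchanged but increases the norm.
theta_other = theta_min_norm + 0.7 * np.array([1.0, -1.0])

if __name__ == "__main__":
    print(theta_min_norm, sse(theta_min_norm), np.linalg.norm(theta_min_norm))
    print(theta_other, sse(theta_other), np.linalg.norm(theta_other))
```

Both candidates attain the same residual sum of squares, so without a tie-breaker such as the minimum-norm rule the "least-squares solution" does not name a single parameter value.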

minor comments (2)

[Section 1] Notation for the population objective and the inverse problem could be introduced with an explicit displayed equation early in the paper to improve readability.
[Abstract / empirical section] The abstract states the motivation clearly but the paper would benefit from a short simulation or numerical example illustrating behavior when the structural equation fails but the LS target remains estimable.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments, which help clarify key aspects of our target definition and the relationship to classical IV estimands. We respond point by point to the major comments below and will make the indicated revisions to the manuscript.

Referee: [Section 2 (target definition) and main theorem] The definition of the target θ* as arg min_θ E[(Y - Xθ)^2] (or IV analogue) does not specify a selection rule when the relevant design operator has a non-trivial kernel. Section 2 and the statement of the main result (around the debiased estimator) implicitly treat θ* as unique for the influence function derivation and asymptotic normality to be well-defined; without an explicit minimum-norm or other tie-breaker, the target is not uniquely identified in the statistical model and the validity claim without structural constraints does not follow.

Authors: We agree that the argmin set may not be a singleton when the design operator has a non-trivial kernel, and that uniqueness is needed for the influence function and asymptotic normality statements. In the revised manuscript we will explicitly define θ* as the minimum-norm solution (the unique element of minimal Euclidean norm within the argmin set). This is a standard and natural selection rule for inverse problems; it preserves the interpretation as the population least-squares projection while ensuring the target is uniquely identified in the statistical model. We will update the definition in Section 2, restate the main theorem for this choice of θ*, and confirm that the debiased estimator remains asymptotically normal for the minimum-norm target. This clarification does not restrict the scope of the paper but makes the claims rigorous.
revision: yes

Referee: [Introduction and Section 4 (equivalence)] The claim that the procedure 'coincides with the usual estimands when they do' (abstract and introduction) requires showing that, under the additional assumption that an exact solution exists, the least-squares target equals the structural parameter and the debiased estimator recovers the same asymptotic distribution as standard IV estimators. No such equivalence derivation is provided, leaving open whether the relaxation preserves the classical properties when the stronger assumption holds.

Authors: We thank the referee for noting the absence of an explicit equivalence derivation. When an exact structural solution θ exists (i.e., the conditional moment condition E[Y − Xθ | Z] = 0 holds with equality), this θ is necessarily a minimizer of the population least-squares objective of the inverse problem, that is, the IV analogue E[(E[Y − Xθ | Z])^2] of E[(Y − Xθ)^2], because it drives that objective to zero. Under the usual full-column-rank condition on the projected design E[X | Z], it is the unique minimizer, so the least-squares target θ* coincides with the structural parameter. In this case our debiased estimator reduces to the classical IV estimator (e.g., two-stage least squares) and therefore inherits the same asymptotic distribution. We will add a short proposition in Section 4 that formally derives this equivalence, including the matching of influence functions and asymptotic variances under the exact-solution assumption. This addition will confirm that the proposed relaxation is conservative: it recovers the classical results whenever the stronger structural assumptions hold.
revision: yes
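
The target-level part of this equivalence can be sketched in a few lines for the linear case. The notation below (Q, m_X, θ_0) is assumed here for illustration, and the argument covers only the coincidence of targets, not the claimed reduction of the estimator to two-stage least squares.

```latex
% Assumed notation: projected least-squares objective and first-stage regression.
\[
  Q(\theta) = \mathbb{E}\Big[\big(\mathbb{E}[\,Y - X^\top\theta \mid Z\,]\big)^2\Big],
  \qquad m_X(Z) = \mathbb{E}[\,X \mid Z\,].
\]
% Suppose an exact structural solution theta_0 exists:
\[
  \mathbb{E}[\,Y - X^\top\theta_0 \mid Z\,] = 0 \quad \text{a.s.}
  \;\Longrightarrow\; Q(\theta_0) = 0 = \min_\theta Q(\theta).
\]
% For any other theta, the conditional residual is -m_X(Z)^T (theta - theta_0), so
\[
  Q(\theta) - Q(\theta_0)
  = (\theta - \theta_0)^\top\,
    \mathbb{E}\big[\,m_X(Z)\,m_X(Z)^\top\big]\,
    (\theta - \theta_0).
\]
```

The minimizer is therefore unique, and equal to θ_0, exactly when E[m_X(Z) m_X(Z)^T] is positive definite; otherwise a selection rule such as the minimum-norm choice is still needed.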

Circularity Check

0 steps flagged

No significant circularity; target and estimator defined independently

The paper defines its target as the population least-squares solution to the inverse problem (always well-defined in the statistical model) and constructs a debiased estimator to conduct inference on that quantity. This target coincides with classical IV estimands only when exact solutions exist but is motivated and estimated without requiring those exact solutions. No quoted step reduces the claimed result to a fitted parameter renamed as a prediction, a self-definitional loop, or a load-bearing self-citation whose validity is assumed rather than independently verified. The derivation remains self-contained against the stated statistical model and regularity conditions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the domain assumption that a least-squares solution exists and is the appropriate target for inference; no free parameters or new entities are introduced in the abstract.

axioms (1)

domain assumption: A least-squares solution to the inverse problem is well-defined in the statistical model under consideration.
This defines the target quantity on which inference is performed regardless of whether an exact solution exists.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Semiparametric Efficient Bilevel Gradient Estimation
stat.ML · 2026-05 · unverdicted · novelty 7.0

Introduces a cross-fitted orthogonal hypergradient estimator derived from the efficient influence function that achieves asymptotic normality and uniform control for bilevel gradient estimation under quadratic losses.
