Generated outcomes as generated regressors: Equivalences in recursive causal estimation

Rahul Singh; Wisse Rutgers

arxiv: 2606.29009 · v1 · pith:C5UIYMR3new · submitted 2026-06-27 · 📊 stat.ME · econ.EM

Pith reviewed 2026-06-30 08:41 UTC · model grok-4.3

keywords: recursive estimation, causal inference, generated regressors, time-varying treatments, mediation analysis, doubly robust estimation, balancing weights, finite-sample equivalence

The pith

When every stage uses ordinary least squares, three recursive causal estimators coincide in any finite sample.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Time-varying treatment effects, surrogate-identified effects, and mediation effects can each be expressed as chains of regressions in which the fitted values from one stage serve as the outcome or regressor for the next. The paper compares the recursive plug-in estimator, the recursive balancing-weight estimator, and the recursive doubly robust estimator in this setting. When ordinary least squares is applied at every stage, the three estimators produce identical numerical results whether or not the regressions are correctly specified. This establishes a finite-sample equivalence between estimation that recursively regresses generated outcomes and estimation that recursively balances generated regressors. Under ridge penalization the doubly robust version reduces to a backward recursion that blends penalized and OLS fits at each stage, with the OLS weight decaying geometrically as the number of stages grows.

Core claim

Recursive causal estimation for time-varying treatments, surrogates, and mediation is achieved by writing each effect as a sequence of regressions whose outputs serve as inputs to the next. When each stage uses OLS, the recursive plug-in estimator, the recursive balancing estimator, and the recursive doubly robust estimator are numerically identical in every finite sample, correct specification or not. This establishes that estimation via generated outcomes is equivalent to estimation via generated regressors. Under ridge penalties the doubly robust form becomes a geometric blend of penalized and OLS fits whose OLS weight shrinks with the length of the chain. For general convex penalties an identity holds at each stage.

What carries the argument

Recursive regressions in which the predicted values from one stage become the generated outcomes or regressors for the subsequent stage.

If this is right

The three recursive estimators are identical under OLS fitting in every finite sample.
Regressing generated outcomes is numerically equivalent to balancing generated regressors.
Under ridge penalization the doubly robust estimator is a backward recursion of stage-wise blends whose OLS weight decays geometrically with the number of periods.
For any convex penalty an identity relating the estimators holds at each stage.
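
The first two consequences above can be checked numerically in a small two-period example. The sketch below is an illustration under stated assumptions, not the paper's construction: the simulated design, the feature maps `phi1` and `phi2`, the static regime that sets both treatments to 1, and the minimum-norm form of the exact balancing weights are all hypothetical choices made here. The outcome is deliberately nonlinear, so every linear regression is misspecified.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical two-period data; the outcome is nonlinear on purpose,
# so the linear stage regressions below are misspecified.
x1 = rng.normal(size=n)
d1 = rng.binomial(1, 0.5, size=n).astype(float)
x2 = 0.5 * x1 + d1 + rng.normal(size=n)
d2 = rng.binomial(1, 0.5, size=n).astype(float)
y = np.sin(x1) + x2 ** 2 + d1 * d2 + rng.normal(size=n)

def phi1(x1, d1):
    # Assumed period-1 features: intercept, covariate, treatment.
    return np.column_stack([np.ones_like(x1), x1, d1])

def phi2(x1, d1, x2, d2):
    # Assumed period-2 features: full history, entering linearly.
    return np.column_stack([np.ones_like(x1), x1, d1, x2, d2])

def ols(F, target):
    return np.linalg.lstsq(F, target, rcond=None)[0]

ones = np.ones(n)
F1, F1_cf = phi1(x1, d1), phi1(x1, ones)                  # observed vs. D1 set to 1
F2, F2_cf = phi2(x1, d1, x2, d2), phi2(x1, d1, x2, ones)  # observed vs. D2 set to 1

# Recursive plug-in: stage-2 fitted values become the stage-1 outcome.
g2 = ols(F2, y)
y_gen = F2_cf @ g2                      # generated outcome
g1 = ols(F1, y_gen)
theta_plugin = np.mean(F1_cf @ g1)

# Recursive balancing: stage-1 weights define the stage-2 balance target.
m1 = F1_cf.mean(axis=0)
w1 = F1 @ np.linalg.solve(F1.T @ F1, m1)    # exact balance: F1.T @ w1 == m1
m2 = F2_cf.T @ w1                           # generated regressor target
w2 = F2 @ np.linalg.solve(F2.T @ F2, m2)    # exact balance: F2.T @ w2 == m2
theta_balance = w2 @ y

# Recursive doubly robust: plug-in plus weighted residuals from each stage.
theta_dr = theta_plugin + w1 @ (y_gen - F1 @ g1) + w2 @ (y - F2 @ g2)

print(theta_plugin, theta_balance, theta_dr)
```

Under these definitions both correction terms in `theta_dr` vanish because OLS residuals are orthogonal to the columns of the design matrix, and `w2 @ y` collapses to the plug-in value through the two balance conditions.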

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The finite-sample identity may allow practitioners to choose the computationally simplest of the three forms without changing the answer.
Geometric decay of the OLS weight implies that penalization dominates in very long chains even if the initial stages use OLS.
The same equivalence structure could be checked for other loss functions or for non-linear link functions at individual stages.

Load-bearing premise

Time-varying treatment effects, surrogate-identified treatment effects, and mediation effects can all be written as recursive regressions in which each regression's predicted values become generated outcomes for the next regression.

What would settle it

A concrete numerical example in which the OLS-fitted recursive plug-in, balancing, and doubly robust estimators produce different values on the same finite data set drawn from a recursive causal model.

Original abstract

Time-varying treatment effects, surrogate-identified treatment effects, and mediation effects can all be written as recursive regressions, in which each regression's predicted values become generated outcomes for the next regression. We study how standard causal estimators behave in this setting. Formally, we compare the recursive plug-in, recursive balancing weight, and recursive doubly robust estimators. When every stage is fitted by ordinary least squares (OLS), the three recursive estimators coincide in any finite sample, whether or not the models are correctly specified. As such, estimation by recursively regressing generated outcomes is numerically equivalent to estimation by recursively balancing generated regressors. Under ridge penalisation for the balancing weights, the doubly robust estimator is a backward recursion of stage-wise blends of penalised and OLS regressions. The weight on the recursive OLS regression decays geometrically in the number of time periods. Therefore, the intuition from the cross-sectional setting, where the bias correction moves the estimator towards OLS, applies less and less as the number of time periods increases. For general convex penalties, we derive an identity at each stage.
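
The cross-sectional intuition mentioned in the abstract, where the bias correction moves the estimator towards OLS, can be sketched in a single-stage example. Everything named below is an assumption for illustration: the simulated data, the target vector `m`, the penalty `lam`, and the ridge form of the balancing weights, `X (X'X + lam I)^{-1} m`. The check confirms that a doubly robust estimate built from a ridge regression and ridge weights equals the target applied to a blend of the ridge and OLS coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, lam = 150, 4, 5.0

# Hypothetical data with a nonlinear outcome and an arbitrary target vector m.
X = rng.normal(size=(n, p))
y = np.tanh(X[:, 0]) + X[:, 1] ** 2 + rng.normal(size=n)
m = rng.normal(size=p)

G = X.T @ X
G_pen = G + lam * np.eye(p)

beta_ols = np.linalg.solve(G, X.T @ y)
beta_ridge = np.linalg.solve(G_pen, X.T @ y)

# Assumed ridge balancing weights: minimise ||X'w - m||^2 + lam * ||w||^2.
w = X @ np.linalg.solve(G_pen, m)

# Doubly robust estimate: ridge plug-in plus weighted ridge residuals.
theta_dr = m @ beta_ridge + w @ (y - X @ beta_ridge)

# Blend form: A puts weight on OLS, (I - A) on the ridge fit.
A = np.linalg.solve(G_pen, G)
beta_blend = (np.eye(p) - A) @ beta_ridge + A @ beta_ols
theta_blend = m @ beta_blend

# Eigenvalues of A are s / (s + lam) for each eigenvalue s of X'X.
s = np.linalg.eigvalsh(G)
ols_share = s / (s + lam)

print(theta_dr, theta_blend, ols_share)
```

This single-stage check does not reproduce the multi-period backward recursion or the geometric decay described in the abstract. It only shows why each blend weight on OLS lies strictly between 0 and 1 when `lam > 0`.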

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Under OLS the three recursive estimators coincide exactly in finite samples; the ridge case adds a geometric decay identity.

The main takeaway is that when every stage is OLS, the recursive plug-in, balancing-weight, and doubly-robust estimators are numerically identical in any finite sample. This follows straight from the normal equations applied recursively and holds whether or not the models are correct.
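
A single-stage version of the normal-equations argument can be written out as a sketch. The notation is assumed here for illustration and is not taken from the paper: $X$ is a design matrix with full column rank, $y$ the outcome vector, $m$ the vector of target feature means, $\hat\beta$ the OLS coefficient, and $\hat w$ the minimum-norm weights that balance exactly.

```latex
\begin{align*}
X^\top\bigl(y - X\hat\beta\bigr) &= 0
  && \text{(normal equations, } \hat\beta = (X^\top X)^{-1}X^\top y\text{)} \\
X^\top \hat w &= m
  && \text{(exact balance, } \hat w = X(X^\top X)^{-1} m\text{)} \\
\hat\theta_{\mathrm{BW}} = \hat w^\top y
  &= m^\top (X^\top X)^{-1} X^\top y = m^\top\hat\beta = \hat\theta_{\mathrm{PI}} \\
\hat\theta_{\mathrm{DR}} = m^\top\hat\beta + \hat w^\top\bigl(y - X\hat\beta\bigr)
  &= \hat\theta_{\mathrm{PI}} + m^\top (X^\top X)^{-1} X^\top\bigl(y - X\hat\beta\bigr)
   = \hat\theta_{\mathrm{PI}}
\end{align*}
```

No step uses a model for $y$, which is why the identity holds under misspecification. The recursive case would apply the same two facts stage by stage.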

The paper frames time-varying treatment effects, surrogate identification, and mediation as recursive regressions in which fitted values become the next-stage outcomes. It then compares the three estimator families in that setup and derives the OLS coincidence plus the ridge blending identity, where the weight on the OLS regression decays geometrically with the number of stages. For general convex penalties it gives a per-stage identity.

What is actually new is the explicit finite-sample numerical equivalence under OLS together with the geometric decay observation. These are not direct restatements of earlier generated-regressor results.

The derivations are algebraic, so the claims stand or fall on whether the lemmas check out without hidden regularity conditions. The modeling premise that these causal quantities can be written as recursive regressions with generated outcomes is definitional rather than tested, which limits the scope but is stated up front. No simulations or numerical checks appear in the abstract, which is typical for this style of paper but leaves the practical reach open.

This is for econometricians who work with dynamic or mediated causal effects and want to know the exact relationships among these recursive estimators. A reader implementing or analyzing such methods would get direct value from the identities. It deserves a serious referee because the results are precise and could simplify both coding and theory in this sub-area.

Referee Report

0 major / 3 minor

Summary. The manuscript claims that time-varying treatment effects, surrogate-identified treatment effects, and mediation effects can be expressed as recursive regressions in which each stage's fitted values become generated outcomes or regressors for the next. It compares the recursive plug-in, recursive balancing-weight, and recursive doubly-robust estimators. When every stage is estimated by OLS, the three estimators are numerically identical in any finite sample, whether or not the models are correctly specified. Under ridge penalization of the balancing weights, the doubly-robust estimator equals a backward recursion of stage-wise blends of penalized and OLS fits, with the OLS weight decaying geometrically in the number of periods. An identity is stated for each stage under general convex penalties.

Significance. If the algebraic identities hold, the paper unifies three families of causal estimators in recursive longitudinal and mediation settings by showing that OLS-based recursion renders plug-in and balancing approaches numerically equivalent without any correct-specification requirement. The geometric-decay result under ridge penalization supplies a concrete, testable prediction about how bias-correction strength changes with horizon length. The work is credited for deriving these finite-sample identities directly from the normal equations and convex-optimization properties rather than from probabilistic assumptions.

minor comments (3)

[Abstract] The phrase 'an identity at each stage' for general convex penalties is stated without indicating what quantities are being equated; a one-sentence clarification would improve readability.
Notation for the horizon length (T versus stage index k) is used inconsistently across the recursive definitions; a single global symbol would reduce confusion.
[§3] The manuscript invokes the normal equations of OLS at each recursion but does not display the explicit induction step that confirms the generated regressor remains in the column space of the subsequent design matrix; adding this short verification would make the argument self-contained.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript and for recommending minor revision. No specific major comments or criticisms were provided in the report.

Circularity Check

0 steps flagged

No significant circularity

The paper's central claim is an algebraic identity derived from the normal equations of OLS applied recursively to generated outcomes and regressors. The three estimators coincide in finite samples by direct application of residual orthogonality to the column space at each stage, without any self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. The derivation rests only on OLS algebra and convex-optimization properties and holds regardless of correct specification, matching the default expectation of no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on standard properties of OLS and convex optimization together with the domain assumption that the target causal quantities admit a recursive regression representation. No free parameters, ad-hoc axioms, or new postulated entities are introduced in the abstract.

axioms (2)

standard math: Ordinary least squares satisfies its normal equations in finite samples regardless of model correctness.
Invoked to establish numerical identity of the three estimators.
domain assumption: The causal quantities of interest admit an exact recursive regression representation.
Stated explicitly in the first sentence of the abstract as the modeling premise.

pith-pipeline@v0.9.1-grok · 5710 in / 1363 out tokens · 39499 ms · 2026-06-30T08:41:39.281020+00:00 · methodology

