arxiv: 2606.29009 · v1 · pith:C5UIYMR3new · submitted 2026-06-27 · 📊 stat.ME · econ.EM

Generated outcomes as generated regressors: Equivalences in recursive causal estimation

Pith reviewed 2026-06-30 08:41 UTC · model grok-4.3

classification 📊 stat.ME econ.EM
keywords: recursive estimation, causal inference, generated regressors, time-varying treatments, mediation analysis, doubly robust estimation, balancing weights, finite-sample equivalence

The pith

When every stage uses ordinary least squares, three recursive causal estimators coincide in any finite sample.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Time-varying treatment effects, surrogate-identified effects, and mediation effects can each be expressed as chains of regressions in which the fitted values from one stage serve as the outcome or regressor for the next. The paper compares the recursive plug-in estimator, the recursive balancing-weight estimator, and the recursive doubly robust estimator in this setting. When ordinary least squares is applied at every stage, the three estimators produce identical numerical results whether or not the regressions are correctly specified. This establishes a finite-sample equivalence between estimation that recursively regresses generated outcomes and estimation that recursively balances generated regressors. Under ridge penalization the doubly robust version reduces to a backward recursion that blends penalized and OLS fits at each stage, with the OLS weight decaying geometrically as the number of stages grows.

Core claim

Recursive causal estimation for time-varying treatments, surrogates, and mediation proceeds by writing each effect as a sequence of regressions whose outputs serve as inputs to the next. When each stage uses OLS, the recursive plug-in estimator, the recursive balancing estimator, and the recursive doubly robust estimator are numerically identical in every finite sample, whether or not the regressions are correctly specified. Estimation via generated outcomes is therefore numerically equivalent to estimation via generated regressors. Under ridge penalties the doubly robust form becomes a backward recursion of stage-wise blends of penalized and OLS fits, with an OLS weight that decays geometrically in the length of the chain. For general convex penalties an identity relating the estimators holds at each stage.

What carries the argument

Recursive regressions in which the predicted values from one stage become the generated outcomes or regressors for the subsequent stage.

If this is right

  • The three recursive estimators are identical under OLS fitting in every finite sample.
  • Regressing generated outcomes is numerically equivalent to balancing generated regressors.
  • Under ridge penalization the doubly robust estimator is a backward recursion of stage-wise blends whose OLS weight decays geometrically with the number of periods.
  • For any convex penalty an identity relating the estimators holds at each stage.
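
The ridge bullet has a single-stage analogue that follows from direct algebra and can be checked numerically. The sketch below is not the paper's multi-stage recursion. It assumes one stage only, ridge balancing weights of the form w = X(X'X + delta I)^{-1} x_target, a ridge outcome fit, and simulated data; the names `x_target`, `lam`, and `delta` are hypothetical. Under those assumptions the doubly robust estimate equals a matrix-weighted blend of the penalized and OLS coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 150, 4

# Simulated single-stage data (assumption: not taken from the paper).
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(size=n)
x_target = rng.normal(size=p)   # hypothetical target covariate mean

lam, delta = 5.0, 2.0           # hypothetical penalties: outcome fit, weights
G = X.T @ X
I = np.eye(p)

beta_ols = np.linalg.solve(G, X.T @ y)
beta_ridge = np.linalg.solve(G + lam * I, X.T @ y)

# Ridge balancing weights: minimize ||X'w - x_target||^2 + delta * ||w||^2.
w = X @ np.linalg.solve(G + delta * I, x_target)

# Doubly robust estimate: penalized plug-in plus weighted residual correction.
dr = x_target @ beta_ridge + w @ (y - X @ beta_ridge)

# The same number written as a blend of penalized and OLS coefficients.
A = np.linalg.solve(G + delta * I, G)          # matrix weight placed on OLS
beta_blend = (I - A) @ beta_ridge + A @ beta_ols
blend = x_target @ beta_blend

print(dr, blend)
```

As `delta` shrinks toward zero, `A` approaches the identity and the blend approaches the OLS plug-in, which is the cross-sectional intuition the abstract says weakens as the number of periods grows.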

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The finite-sample identity may allow practitioners to choose the computationally simplest of the three forms without changing the answer.
  • Geometric decay of the OLS weight implies that penalization dominates in very long chains even if the initial stages use OLS.
  • The same equivalence structure could be checked for other loss functions or for non-linear link functions at individual stages.

Load-bearing premise

Time-varying treatment effects, surrogate-identified treatment effects, and mediation effects can all be written as recursive regressions in which each regression's predicted values become generated outcomes for the next regression.

What would settle it

A concrete numerical example in which the OLS-fitted recursive plug-in, balancing, and doubly robust estimators produce different values on the same finite data set drawn from a recursive causal model.
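
A check of this kind can be sketched for a two-period case. The code below is an illustrative construction, not the paper's own definitions: it assumes linear feature maps (`phi1`, `phi2`, both hypothetical), minimum-norm exact balancing weights at each stage, and simulated data with a deliberately nonlinear outcome so that both regressions are misspecified. It computes the three estimates of the mean outcome under treatment in both periods and compares them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated two-period data (assumption: not taken from the paper).
x1 = rng.normal(size=n)
d1 = rng.integers(0, 2, size=n).astype(float)
x2 = 0.5 * x1 + d1 + rng.normal(size=n)
d2 = rng.integers(0, 2, size=n).astype(float)
y = np.sin(x1) + x2**2 + d1 * d2 + rng.normal(size=n)

def phi1(x1, d1):
    # Hypothetical stage-1 features.
    return np.column_stack([np.ones_like(x1), x1, d1])

def phi2(x1, d1, x2, d2):
    # Hypothetical stage-2 features.
    return np.column_stack([np.ones_like(x1), x1, d1, x2, d2])

P1, P2 = phi1(x1, d1), phi2(x1, d1, x2, d2)
P1_cf = phi1(x1, np.ones(n))            # stage-1 features with d1 set to 1
P2_cf = phi2(x1, d1, x2, np.ones(n))    # stage-2 features with d2 set to 1

# Recursive plug-in: regress y, then regress the generated outcome.
b2 = np.linalg.lstsq(P2, y, rcond=None)[0]
gen_y = P2_cf @ b2                      # generated outcome for stage 1
b1 = np.linalg.lstsq(P1, gen_y, rcond=None)[0]
plug_in = np.mean(P1_cf @ b1)

# Recursive balancing: stage-1 weights balance toward the counterfactual
# feature mean; stage-2 weights balance toward the w1-weighted features.
w1 = P1 @ np.linalg.solve(P1.T @ P1, P1_cf.mean(axis=0))
w2 = P2 @ np.linalg.solve(P2.T @ P2, P2_cf.T @ w1)
balancing = w2 @ y

# Recursive doubly robust: plug-in plus weighted residuals from each stage.
dr = plug_in + w1 @ (gen_y - P1 @ b1) + w2 @ (y - P2 @ b2)

print(plug_in, balancing, dr)
```

Under these assumptions the three printed values agree up to floating-point error, because both residual terms vanish by the OLS normal equations. A data set on which they differed, with estimators defined as in the paper, would be the counterexample described above.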

Original abstract

Time-varying treatment effects, surrogate-identified treatment effects, and mediation effects can all be written as recursive regressions, in which each regression's predicted values become generated outcomes for the next regression. We study how standard causal estimators behave in this setting. Formally, we compare the recursive plug-in, recursive balancing weight, and recursive doubly robust estimators. When every stage is fitted by ordinary least squares (OLS), the three recursive estimators coincide in any finite sample, whether or not the models are correctly specified. As such, estimation by recursively regressing generated outcomes is numerically equivalent to estimation by recursively balancing generated regressors. Under ridge penalisation for the balancing weights, the doubly robust estimator is a backward recursion of stage-wise blends of penalised and OLS regressions. The weight on the recursive OLS regression decays geometrically in the number of time periods. Therefore, the intuition from the cross-sectional setting, where the bias correction moves the estimator towards OLS, applies less and less as the number of time periods increases. For general convex penalties, we derive an identity at each stage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript claims that time-varying treatment effects, surrogate-identified treatment effects, and mediation effects can be expressed as recursive regressions in which each stage's fitted values become generated outcomes or regressors for the next. It compares the recursive plug-in, recursive balancing-weight, and recursive doubly-robust estimators. When every stage is estimated by OLS, the three estimators are numerically identical in any finite sample, whether or not the models are correctly specified. Under ridge penalization of the balancing weights, the doubly-robust estimator equals a backward recursion of stage-wise blends of penalized and OLS fits, with the OLS weight decaying geometrically in the number of periods. An identity is stated for each stage under general convex penalties.

Significance. If the algebraic identities hold, the paper unifies three families of causal estimators in recursive longitudinal and mediation settings by showing that OLS-based recursion renders plug-in and balancing approaches numerically equivalent without any correct-specification requirement. The geometric-decay result under ridge penalization supplies a concrete, testable prediction about how the strength of the bias correction changes with horizon length. A strength of the work is that it derives these finite-sample identities directly from the normal equations and convex-optimization properties rather than from probabilistic assumptions.

minor comments (3)
  1. [Abstract] The phrase 'an identity at each stage' for general convex penalties is stated without indicating which quantities are being equated; a one-sentence clarification would improve readability.
  2. Notation for the horizon length (T versus stage index k) is used inconsistently across the recursive definitions; a single global symbol would reduce confusion.
  3. [§3] The manuscript invokes the normal equations of OLS at each recursion but does not display the explicit induction step that confirms the generated regressor remains in the column space of the subsequent design matrix; adding this short verification would make the argument self-contained.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript and for recommending minor revision. No specific major comments or criticisms were provided in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper's central claim is an algebraic identity derived from the normal equations of OLS applied recursively to generated outcomes and regressors. The three estimators coincide in finite samples by direct application of residual orthogonality to the column space at each stage, without any self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. The derivation is self-contained in convex optimization properties and holds regardless of correct specification, matching the default expectation of no circularity.
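
The orthogonality argument can be written out for a single stage. The notation here is an assumption introduced for illustration, not the paper's: $X$ is an $n \times p$ design matrix with full column rank, $Y$ the outcome vector, $\bar{x}$ the target feature vector, $\hat\beta$ the OLS coefficient, and $w$ the minimum-norm weights satisfying $X^\top w = \bar{x}$.

```latex
\begin{align*}
\hat\beta &= (X^\top X)^{-1} X^\top Y,
\qquad
w = X (X^\top X)^{-1} \bar{x}, \\
\hat\theta_{\mathrm{PI}} &= \bar{x}^\top \hat\beta, \\
\hat\theta_{\mathrm{BW}} &= w^\top Y
  = \bar{x}^\top (X^\top X)^{-1} X^\top Y
  = \hat\theta_{\mathrm{PI}}, \\
\hat\theta_{\mathrm{DR}} &= \hat\theta_{\mathrm{PI}} + w^\top (Y - X\hat\beta)
  = \hat\theta_{\mathrm{PI}}
  + \bar{x}^\top (X^\top X)^{-1} \underbrace{X^\top (Y - X\hat\beta)}_{=\,0}
  = \hat\theta_{\mathrm{PI}}.
\end{align*}
```

The underbraced term is zero by the normal equations, which hold in any finite sample whether or not the linear model is correct. The recursive claim applies the same step at each stage, with the previous stage's fitted values in place of $Y$.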

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on standard properties of OLS and convex optimization together with the domain assumption that the target causal quantities admit a recursive regression representation. No free parameters, ad-hoc axioms, or new postulated entities are introduced in the abstract.

axioms (2)
  • standard math: Ordinary least squares satisfies its normal equations in finite samples regardless of model correctness.
    Invoked to establish numerical identity of the three estimators.
  • domain assumption: The causal quantities of interest admit an exact recursive regression representation.
    Stated explicitly in the first sentence of the abstract as the modeling premise.

pith-pipeline@v0.9.1-grok · 5710 in / 1363 out tokens · 39499 ms · 2026-06-30T08:41:39.281020+00:00 · methodology

