Double Robust Weighted Regression with Missing Confounders

Hua Shen; Md. Shaddam Hossain Bagmar

arxiv: 2604.20630 · v1 · submitted 2026-04-22 · 📊 stat.ME

Pith reviewed 2026-05-09 23:42 UTC · model grok-4.3

keywords: missing confounders, doubly robust estimation, causal inference, propensity score, weighted least squares, observational studies, missing data

The pith

A new weighted least squares estimator stays consistent for causal effects with missing confounders whenever at least one of the treatment or outcome models is correct.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Missing confounders weaken identification of causal effects and make estimates sensitive to model errors in observational data. Most existing methods within the missing-indicator approach require a single working model to be exactly right and lose consistency otherwise. The paper introduces the MI-WOLS estimator, which builds propensity-score weights into the outcome regression to enforce covariate balance even when some confounders are unobserved. Under the paper's assumptions, this produces double robustness: the estimator recovers the causal effect correctly if either the treatment model or the outcome model is correctly specified. The result gives analysts a practical way to obtain reliable effect estimates without needing every confounder to be fully observed.
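The construction described in this summary can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the weight |A − π̂| is borrowed from the dWOLS literature (Wallace and Moodie, 2015) as one choice known to satisfy a balancing condition, and the function names (`fit_logistic`, `wls`, `mi_wols`), the simulated data, and every parameter value other than the true effect of −2.35 quoted in the figure captions are hypothetical. The outcome model is deliberately misspecified (the confounder enters quadratically but is fitted linearly), while the treatment model is correct and depends on the confounder only where it is observed.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(Z, a, iters=25):
    """Logistic regression by Newton-Raphson (no regularisation)."""
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = expit(Z @ beta)
        grad = Z.T @ (a - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def wls(D, y, w):
    """Weighted least squares, computed by rescaling rows by sqrt(w)."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)
    return coef

def mi_wols(y, a, x_obs):
    """Treatment coefficient from a missing-indicator weighted OLS fit.

    x_obs holds the confounder with NaN where it is missing.
    """
    r = (~np.isnan(x_obs)).astype(float)        # 1 = observed, 0 = missing
    x_fill = np.where(r == 1.0, x_obs, 0.0)     # zero-fill missing values
    Z = np.column_stack([np.ones_like(r), r, x_fill])
    pi_hat = expit(Z @ fit_logistic(Z, a))      # propensity given (R, R*X)
    w = np.abs(a - pi_hat)                      # assumed balancing weight
    D = np.column_stack([Z, a])                 # linear outcome working model
    return wls(D, y, w)[-1]

# Hypothetical data: treatment depends on X only when X is observed.
rng = np.random.default_rng(0)
n, psi_true = 50_000, -2.35
x = rng.normal(size=n)
r = rng.binomial(1, 0.7, size=n)
a = rng.binomial(1, expit(-0.5 + r * x)).astype(float)
y = 1.0 + psi_true * a + x**2 + rng.normal(size=n)   # quadratic in X
x_obs = np.where(r == 1, x, np.nan)

psi_hat = mi_wols(y, a, x_obs)
print(f"MI-WOLS sketch estimate: {psi_hat:.3f} (truth {psi_true})")
```

The design choice worth noting is that the same covariate set (intercept, indicator, zero-filled confounder) feeds both the propensity model and the outcome regression, which is what lets the weights balance those covariates across treatment groups.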

Core claim

Within the missing-indicator framework, the MI-WOLS estimator uses propensity-score-based regression weights that satisfy a covariate-balancing condition in the presence of confounder missingness. Under the missingness-strongly-ignorable treatment allocation (mSITA) assumption, together with either a Conditionally Independent Treatment (CIT) or a Conditionally Independent Outcome (CIO) structure, the MI-WOLS estimator is consistent when at least one of the treatment and outcome models is correctly specified.

What carries the argument

Propensity score based regression weights that enforce covariate balance inside a missing-indicator weighted ordinary least squares regression.
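This page does not print the balancing condition itself. As a hedged illustration, the form used in the dWOLS literature the paper draws on (Wallace and Moodie, 2015) can be written with missing-indicator covariates; the notation below (H, π, w) is an assumption introduced for this sketch and may differ from the paper's.

```latex
% Assumed notation: A = binary treatment, X = confounder,
% R = 1 if X is observed, H = (1, R, R X) the missing-indicator covariates,
% \pi(H) = P(A = 1 \mid H) the propensity score.
\[
  \pi(H)\, w(1, H) \;=\; \{1 - \pi(H)\}\, w(0, H).
\]
% One weight that satisfies the condition:
\[
  w(A, H) = \lvert A - \pi(H) \rvert
  \quad\Longrightarrow\quad
  \pi(H)\,\{1 - \pi(H)\} \;=\; \{1 - \pi(H)\}\,\pi(H).
\]
```

Under such weights the covariates H have the same weighted distribution in the treated and untreated groups, which is the sense in which the regression is "balanced".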

If this is right

Simulation studies show negligible bias, accurate sandwich variance estimates, and near-nominal coverage across varied data-generating processes.
The estimator applies directly to real data such as kidney function outcomes and yields interpretable results.
It supplies a flexible doubly robust option that avoids the single-model fragility of prior missing-indicator methods.
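For readers unfamiliar with sandwich variance estimates in a weighted regression, the following is a minimal sketch of the generic HC0-type formula. It is an assumption-laden simplification: the function name `wls_sandwich` is hypothetical, the weights are treated as known (so uncertainty from estimating the propensity score is ignored), and the paper's own variance estimator may well differ.

```python
import numpy as np

def wls_sandwich(D, y, w):
    """Weighted least squares with an HC0-type sandwich covariance.

    Assumption: weights w are treated as fixed, so variability from
    estimating the propensity score is not propagated.
    """
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    bread = np.linalg.inv(D.T @ (D * w[:, None]))      # (D' W D)^{-1}
    coef = bread @ (D.T @ (w * y))                     # WLS coefficients
    resid = y - D @ coef
    meat = D.T @ (D * ((w * resid) ** 2)[:, None])     # sum w_i^2 e_i^2 d_i d_i'
    cov = bread @ meat @ bread
    return coef, cov

# Toy usage on hypothetical data with heteroskedastic noise.
rng = np.random.default_rng(1)
n = 500
a = rng.binomial(1, 0.5, size=n).astype(float)
D = np.column_stack([np.ones(n), a])
y = 1.0 - 2.35 * a + rng.normal(scale=1.0 + a, size=n)
w = np.full(n, 0.5)          # |A - 0.5| under a known 50/50 allocation
coef, cov = wls_sandwich(D, y, w)
se = np.sqrt(np.diag(cov))
print("treatment coefficient:", coef[1], "sandwich SE:", se[1])
```

Comparing such an analytical standard error with the empirical standard deviation of estimates across simulation replicates is the kind of ratio the Figure 4 caption refers to.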

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The weighting construction may reduce the need for complete-case analysis or expensive full imputation in large observational datasets.
Similar balancing weights could be adapted to other regression or matching procedures that currently handle missing confounders only singly robustly.
Performance under high rates of missingness or when both models are mildly misspecified remains an open empirical question.

Load-bearing premise

Missingness is strongly ignorable for treatment allocation and either the treatment assignment or the outcome depends on the observed variables through one of the two specified conditional independence structures.

What would settle it

Bias in MI-WOLS estimates that does not shrink as the sample size grows, in a setting where exactly one of the two models is correctly specified and both the missingness-strongly-ignorable assumption and one of the conditional independence structures hold, would falsify the consistency claim.

Figures

Figures reproduced from arXiv: 2604.20630 by Hua Shen, Md. Shaddam Hossain Bagmar.

**Figure 2.** Box plots of estimated treatment effects from the MI-WOLS estimator across different identifiability assumptions, including: (i) when the mSITA assumption holds and does not hold, and (ii) when either the CIT or CIO assumption holds, with a true effect of −2.35.

**Figure 3.** Box plots of estimated treatment effects from the MI-WOLS estimator across different identifiability assumptions, including: (i) when the mSITA assumption holds and does not hold, and (ii) when both or neither of the CIT and CIO assumptions hold, with a true effect of −2.35. Figures 2 and 3 present boxplots of the MI-WOLS estimator across all simulation scenarios. The x-axis displays the four weighting s…

**Figure 4.** Bar plots of the ratio of analytical to empirical standard errors (ASE/…

Original abstract

Missing confounders are common in observational studies and present fundamental challenges for causal effect estimation by weakening identification and increasing sensitivity to model misspecification. Within the missing-indicator framework, existing methods rely on a single working model and achieve consistency only when that model is correctly specified, and are therefore singly robust. In this article, we develop a doubly robust missing indicator weighted ordinary least squares (MI-WOLS) estimator with partially observed confounders. The MI-WOLS estimator incorporates the treatment assignment mechanism, commonly known as the propensity score model, into the weighting structure of the outcome regression. Building on the missing-indicator framework, we define propensity score based regression weights that satisfy a covariate-balancing condition in the presence of confounder missingness. Under the missingness-strongly-ignorable treatment allocation assumption and assuming either a Conditionally Independent Treatment or Conditionally Independent Outcome structure, the MI-WOLS estimator is consistent when at least the treatment or the outcome model is correctly specified. Simulation studies support the theoretical robustness of the MI-WOLS estimator, demonstrating negligible bias, accurate sandwich-based variance estimation, and near-nominal coverage probability across a wide range of data-generating scenarios. An illustrative application to kidney function outcomes further demonstrates the interpretability and practical feasibility of the method, offering a flexible, doubly robust alternative to existing singly robust estimators.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper gives a clean doubly robust estimator for causal regression when confounders are missing by combining missing indicators with propensity score weights.

The punchline is that the authors built a missing-indicator weighted OLS estimator (MI-WOLS) that stays consistent for the causal effect if either the treatment model or the outcome model is correct, even with partially observed confounders. They do this by folding the propensity score into the weights so the regression satisfies a covariate-balancing condition inside the missing-indicator framework. That is the actual new piece: prior missing-indicator methods were singly robust, and this one adds the second layer of protection without leaving the missing-indicator setup.

The theory follows directly from the missingness-strongly-ignorable treatment allocation assumption plus either the CIT or CIO structure, and the stress-test confirms the consistency argument has no gaps. Simulations show low bias, sandwich variance that works, and coverage near nominal across the scenarios they tried. The kidney-function application is a straightforward real-data check that the method runs and gives interpretable numbers.

The assumptions are the main soft spot. Missingness-strongly-ignorable treatment allocation plus the CIT/CIO requirement are strong and not always easy to justify in practice; if neither holds the double-robustness guarantee disappears. The paper could also have shown more head-to-head comparisons against multiple imputation or other existing doubly robust approaches under missing confounders, though the current simulations are already supportive.

This is a method paper aimed at people who do causal inference on observational data with incomplete covariates. A reader who already works with propensity scores and missing-data problems will get immediate value from the estimator and the proof. It is coherent on its own terms and the evidence matches the claims, so it deserves a serious referee rather than a desk reject.

Referee Report

0 major / 3 minor

Summary. The manuscript develops a doubly robust missing-indicator weighted ordinary least squares (MI-WOLS) estimator for causal effect estimation with partially observed confounders. Within the missing-indicator framework, the estimator incorporates propensity-score-based weights that satisfy a covariate-balancing condition; under the missingness-strongly-ignorable treatment allocation assumption together with either the Conditionally Independent Treatment or Conditionally Independent Outcome structure, the estimator is consistent for the target parameter whenever at least one of the treatment or outcome models is correctly specified. Consistency is supported by simulation studies showing negligible bias, accurate sandwich variance estimates, and near-nominal coverage, plus an illustrative kidney-function application.

Significance. If the double-robustness result holds, the work supplies a practical, flexible alternative to existing singly robust missing-indicator methods, directly addressing sensitivity to model misspecification when confounders are missing. The explicit construction of balancing weights and the provision of reproducible simulation evidence constitute clear strengths.

minor comments (3)

[Abstract] The terms 'missingness-strongly-ignorable treatment allocation assumption' and 'Conditionally Independent Treatment or Conditionally Independent Outcome structure' are introduced without even a one-sentence gloss; a brief parenthetical definition or forward reference would improve readability.
The manuscript would benefit from an explicit statement (perhaps as a numbered proposition) of the precise conditions under which the MI-WOLS estimator is consistent, including the role of the missingness indicator in the balancing weights.
Simulation section: the data-generating processes for the CIT and CIO scenarios should be described with sufficient detail (e.g., exact functional forms and parameter values) to allow exact replication of the reported bias and coverage results.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our manuscript and for recommending minor revision. The description of the MI-WOLS estimator, its double robustness under the stated assumptions, and the supporting simulation and application results accurately reflects the contribution. As no specific major comments were raised, we provide no point-by-point responses.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

The MI-WOLS estimator is explicitly constructed by embedding the propensity-score model into the weighting structure of the outcome regression under the missing-indicator framework, with the double-robustness consistency result following directly from the stated missingness-strongly-ignorable treatment allocation assumption plus either the CIT or CIO structure. The proof shows unbiasedness of the estimating equation when at least one model is correct, without reducing any claimed prediction or first-principles result to a fitted quantity by construction. No load-bearing self-citations, uniqueness theorems, or ansatzes imported from prior author work appear in the derivation chain; the argument remains independent of the target consistency claim.
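The two routes to an unbiased estimating equation can be made concrete with the generic weighted least squares normal equations. This is a sketch of the standard dWOLS-style argument under assumed notation (H, π, w, g are introduced here for illustration), not a reproduction of the paper's proof.

```latex
% Assumed notation: A = binary treatment, H = (1, R, R X) missing-indicator
% covariates including an intercept, \pi(H) = P(A = 1 \mid H),
% w(A, H) = regression weight, \psi = treatment effect.
\[
  \sum_{i=1}^{n} w(A_i, H_i)
  \begin{pmatrix} H_i \\ A_i \end{pmatrix}
  \bigl( Y_i - H_i^{\top}\beta - \psi A_i \bigr) = 0 .
\]
% Route 1 (outcome model correct):
% E[\, Y - H^{\top}\beta_0 - \psi_0 A \mid A, H \,] = 0, so the equation has
% mean zero at (\beta_0, \psi_0) for any weight that is a function of (A, H).
%
% Route 2 (treatment model correct, weights balancing): for any function g,
\[
  E\bigl[\, w(A, H)\,(2A - 1)\, g(H) \,\bigr]
  = E\bigl[\, g(H)\,\{ \pi(H)\, w(1, H) - (1 - \pi(H))\, w(0, H) \} \,\bigr]
  = 0 ,
\]
% so the part of the outcome that the working model in H fails to capture
% is orthogonal to the treatment contrast in the weighted sample.
```

In Route 2 the misspecification error is a function of H alone, which is why balancing H across treatment groups is enough to protect the coefficient on A.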

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The double-robustness property rests on two domain assumptions about missingness and conditional independence structures; no free parameters or invented entities are described in the abstract.

axioms (2)

domain assumption: Missingness is strongly ignorable for treatment allocation.
This assumption is invoked to support identification of causal effects despite missing confounders.
domain assumption: Either the Conditionally Independent Treatment or the Conditionally Independent Outcome structure holds.
This structure is required for the double-robustness consistency result to apply.

pith-pipeline@v0.9.0 · 5529 in / 1450 out tokens · 71537 ms · 2026-05-09T23:42:54.180672+00:00 · methodology

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

[1] Bagmar, M. S. H. and Shen, H. (2022). Causal inference with missingness in confounder. Journal of Statistical Computation and Simulation, pages 1–14.

[2] Bian, Z., Moodie, E. E., Shortreed, S. M., Lambert, S. D., and Bhatnagar, S. (2024). Variable selection for individualised treatment rules with discrete outcomes. Journal of the Royal Statistical Society Series C: Applied Statistics, 73(2):298–313.

[3] Blake, H. A., Leyrat, C., Mansfield, K. E., Seaman, S., Tomlinson, L. A., Carpenter, J., and Williamson, E. J. (2020a). Propensity scores using missingness pattern information: a practical guide. Statistics in Medicine, 39(11):1641–1657.

[4] Blake, H. A., Leyrat, C., Mansfield, K. E., Tomlinson, L. A., Carpenter, J., and Williamson, E. J. (2020b). Estimating treatment effects with partially observed covariates using outcome regression with missing indicators. Biometrical Journal, 62(2):428–443.

[5] Chakraborty, B. and Moodie, E. E. (2013). Statistical methods for dynamic treatment regimes, volume 2. Springer.

[6] Chakraborty, B. and Murphy, S. A. (2014). Dynamic treatment regimes. Annual Review of Statistics and Its Application, 1(1):447–464.

[7] Daniels, M. J., Linero, A., and Roy, J. (2023). Bayesian nonparametrics for causal inference and missing data. Chapman and Hall/CRC.

[8] Freedman, D. A. (2006). On the so-called “Huber sandwich estimator” and “robust standard errors”. The American Statistician, 60(4):299–302.

[9] Funk, M. J., Westreich, D., Wiesen, C., Stürmer, T., Brookhart, M. A., and Davidian, M. (2011). Doubly robust estimation of causal effects. American Journal of Epidemiology, 173(7):761–767.

[10] Glass, T. A., Goodman, S. N., Hernán, M. A., and Samet, J. M. (2013). Causal inference in public health. Annual Review of Public Health, 34(1):61–75.

[11] Jiang, C., Thompson, M., and Wallace, M. (2024). Estimating dynamic treatment regimes for ordinal outcomes with household interference: Application in household smoking cessation. Statistical Methods in Medical Research, 33(6):981–995.

[12] Jiang, C., Wallace, M., and Thompson, M. (2022). Doubly-robust dynamic treatment regimen estimation for binary outcomes. arXiv preprint arXiv:2203.08269.

[13] Kang, J. D., Schafer, J. L., et al. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science, 22(4):523–539.

[14] Kurz, C. F. (2022). Augmented inverse probability weighting and the double robustness property. Medical Decision Making, 42(2):156–167.

[15] Li, L., Shen, C., Li, X., and Robins, J. M. (2013). On weighting approaches for missing data. Statistical Methods in Medical Research, 22(1):14–30.

[16] Mayer, I., Sverdrup, E., Gauss, T., Moyer, J.-D., Wager, S., and Josse, J. (2020). Doubly robust treatment effect estimation with missing attributes. The Annals of Applied Statistics, 14(3):1409–1431.

[17] Nadi, A. A. and Wallace, M. (2025). Recent advances in doubly-robust weighted ordinary least squares techniques for dynamic treatment regime estimation. arXiv preprint arXiv:2501.18819.

[18] Pedersen, A. B., Mikkelsen, E. M., Cronin-Fenton, D., Kristensen, N. R., Pham, T. M., Pedersen, L., and Petersen, I. (2017). Missing data and multiple imputation in clinical epidemiological research. Clinical Epidemiology, pages 157–166.

[19] Robins, J. M. (1989). The analysis of randomized and non-randomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. Health Service Research Methodology: a focus on AIDS, pages 113–159.

[20] Robins, J. M. (2000). Marginal structural models versus structural nested models as tools for causal inference. In Statistical Models in Epidemiology, the Environment, and Clinical Trials, pages 95–133. Springer.

[21] Schulz, J. and Moodie, E. E. (2021). Doubly robust estimation of optimal dosing strategies. Journal of the American Statistical Association, 116(533):256–268.

[22] Simoneau, G., Moodie, E. E., Nijjar, J. S., Platt, R. W., Investigators, S. E. R. A. I. C., et al. (2020). Estimating optimal dynamic treatment regimes with survival outcomes. Journal of the American Statistical Association, 115(531):1531–1539.

[23] Tilden, E. L. and Snowden, J. M. (2018). The causal inference framework: a primer on concepts and methods for improving the study of well-woman childbearing processes. Journal of Midwifery and Women’s Health, 63(6):700–709.

[24] Vansteelandt, S. and Joffe, M. (2014). Structural nested models and g-estimation: The partially realized promise. Statistical Science, 29(4):707–731.

[25] Wallace, M. P. and Moodie, E. E. (2015). Doubly-robust dynamic treatment regimen estimation via weighted least squares. Biometrics, 71(3):636–644.

[26] Weaver, J., Voss, E. A., Cafri, G., Beyrau, K., Nashleanas, M., and Suruki, R. (2024). The necessity of validity diagnostics when drawing causal inferences from observational data: lessons from a multi-database evaluation of the risk of non-infectious uveitis among patients exposed to Remicade®. BMC Medical Research Methodology, 24(1):322.

[27] Weisberg, H. I. (2011). Bias and causation: Models and judgment for valid comparisons. John Wiley & Sons.

[28] Zhang, Z., Yi, D., and Fan, Y. (2022). Doubly robust estimation of optimal dynamic treatment regimes with multicategory treatments and survival outcomes. Statistics in Medicine, 41(24):4903–4923.
