
arxiv: 1906.12100 · v1 · pith:37YSB7IUnew · submitted 2019-06-28 · 📊 stat.ME

Formulating causal questions and principled statistical answers

Pith reviewed 2026-05-25 14:08 UTC · model grok-4.3

classification 📊 stat.ME
keywords causal inference · potential outcomes · point exposure · no unmeasured confounding · instrumental variables · propensity score · simulation study

The pith

Causal effects for point exposures are defined using potential outcomes, with estimators grouped by no unmeasured confounding or instrumental variable assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This tutorial explains how to formulate causal questions about a baseline exposure and a later outcome by being specific about the exposure levels and target populations involved. It uses the potential outcomes framework to define causal effects and classifies estimation methods into those relying on the no unmeasured confounding assumption, such as outcome regression and propensity score methods, and those using an instrumental variable with added assumptions. The paper illustrates these ideas with a simulation learner that generates data mimicking the effect of breastfeeding interventions on a child's later development and provides true causal effect values for comparison. R code for data generation and analysis, and SAS and Stata code for analysis, are made available.

Core claim

Using the potential outcomes framework, causal effects are defined for specific exposure levels in defined populations, and estimation approaches are classified according to whether they invoke the no unmeasured confounding assumption (including outcome regression and propensity score-based methods) or an instrumental variable with added assumptions.

What carries the argument

The potential outcomes framework, which assigns to each unit the outcome that would be observed under each possible exposure level.
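The idea can be made concrete with a small sketch. The code below is not taken from the paper or from its published code; the function name, the outcome scale, and the constant effect of 5 are all hypothetical choices made for illustration. It gives every unit both potential outcomes, reveals only the one matching the exposure actually received, and compares the true average effect (known by construction) with the difference in observed means under randomisation.

```python
import random

def simulate_potential_outcomes(n=10_000, seed=1):
    """Toy data in which every unit carries both potential outcomes."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(100.0, 10.0)          # outcome if unexposed (hypothetical scale)
        y1 = y0 + 5.0                        # outcome if exposed (assumed constant effect of 5)
        a = 1 if rng.random() < 0.5 else 0   # exposure received; randomised in this sketch
        y = y1 if a == 1 else y0             # consistency: the observed outcome is Y(a)
        units.append({"y0": y0, "y1": y1, "a": a, "y": y})
    return units

def mean(values):
    values = list(values)
    return sum(values) / len(values)

units = simulate_potential_outcomes()

# True average causal effect: needs both potential outcomes, so it is only
# computable in a simulation.
true_ate = mean(u["y1"] - u["y0"] for u in units)

# Estimate from observed data only: difference in means between exposure groups.
estimate = (mean(u["y"] for u in units if u["a"] == 1)
            - mean(u["y"] for u in units if u["a"] == 0))

print(f"true ATE: {true_ate:.3f}, difference in observed means: {estimate:.3f}")
```

Because exposure is randomised here, the two numbers should agree up to sampling error; in real data only `a` and `y` would be available.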

If this is right

  • Causal questions become well-defined when exposure levels and relevant populations are explicitly stated.
  • Outcome regression and propensity score methods apply under the no unmeasured confounding assumption.
  • Instrumental variable methods can be used when that assumption fails but valid instruments exist.
  • Simulation learners with known true effects allow direct evaluation of method performance and pitfalls.
  • Public code enables readers to replicate and adapt the analyses to their data.
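As a rough illustration of the second and fourth points, the sketch below compares a naive contrast with two estimators that rely on no unmeasured confounding. It is a hypothetical toy example, not the paper's simulation learner: the data-generating values, the single binary confounder `l`, and all function names are assumptions. Outcome regression is reduced to standardisation over strata of `l`, and the propensity score method shown is inverse probability weighting with a stratum-wise propensity estimate.

```python
import random

def simulate(n=50_000, seed=2):
    """Observational toy data with one measured binary confounder L."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = 1 if rng.random() < 0.4 else 0           # measured confounder
        p_a = 0.7 if l == 1 else 0.3                 # exposure probability depends on L
        a = 1 if rng.random() < p_a else 0
        y = 2.0 * a + 3.0 * l + rng.gauss(0.0, 1.0)  # assumed true effect of A is 2
        rows.append((l, a, y))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def naive(rows):
    """Unadjusted difference in means; confounded by L."""
    return (mean(y for _, a, y in rows if a == 1)
            - mean(y for _, a, y in rows if a == 0))

def standardisation(rows):
    """Saturated outcome regression: stratum effects averaged over P(L)."""
    effect = 0.0
    for level in (0, 1):
        p_l = mean(1.0 if l == level else 0.0 for l, _, _ in rows)
        m1 = mean(y for l, a, y in rows if l == level and a == 1)
        m0 = mean(y for l, a, y in rows if l == level and a == 0)
        effect += (m1 - m0) * p_l
    return effect

def ipw(rows):
    """Inverse probability weighting with the propensity estimated within L."""
    ps = {level: mean(a for l, a, _ in rows if l == level) for level in (0, 1)}
    n = len(rows)
    mu1 = sum(a * y / ps[l] for l, a, y in rows) / n
    mu0 = sum((1 - a) * y / (1.0 - ps[l]) for l, a, y in rows) / n
    return mu1 - mu0

rows = simulate()
est_naive, est_std, est_ipw = naive(rows), standardisation(rows), ipw(rows)
print(f"naive: {est_naive:.3f}  standardisation: {est_std:.3f}  IPW: {est_ipw:.3f}")
```

With a single binary confounder and saturated models the two adjusted estimators coincide numerically; with continuous covariates and parametric models they generally differ, which is where model misspecification and positivity problems become visible.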

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The emphasis on precise question formulation may reduce misuse of causal methods in applied research.
  • This classification could guide extensions to settings with time-varying exposures or multiple outcomes.
  • Simulation-based teaching tools like the one described might improve training in causal inference.
  • Readers could test the methods on their own observational datasets to see if results align with simulation insights.

Load-bearing premise

That the generated data in the simulation learner sufficiently captures the structure of real observational or randomized studies for the methods and pitfalls to generalize.

What would settle it

An empirical study with known true causal effects from randomization where the no-confounding methods and IV methods yield estimates that contradict the simulation-based expectations for the same data structure.

Original abstract

Although review papers on causal inference methods are now available, there is a lack of introductory overviews on what they can render and on the guiding criteria for choosing one particular method. This tutorial gives an overview in situations where an exposure of interest is set at a chosen baseline ('point exposure') and the target outcome arises at a later time point. We first phrase relevant causal questions and make a case for being specific about the possible exposure levels involved and the populations for which the question is relevant. Using the potential outcomes framework, we describe principled definitions of causal effects and of estimation approaches classified according to whether they invoke the no unmeasured confounding assumption (including outcome regression and propensity score-based methods) or an instrumental variable with added assumptions. We discuss challenges and potential pitfalls and illustrate application using a 'simulation learner' that mimics the effect of various breastfeeding interventions on a child's later development. This involves a typical simulation component with generated exposure, covariate, and outcome data that mimic those from an observational or randomised intervention study. The simulation learner further generates various (linked) exposure types with a set of possible values per observation unit, from which observed as well as potential outcome data are generated. It thus provides true values of several causal effects. R code for data generation and analysis is available on www.ofcaus.org, where SAS and Stata code for analysis is also provided.
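The instrumental variable route mentioned in the abstract can be sketched in the same spirit. The example below is a hypothetical toy setting, not the simulation learner from www.ofcaus.org: a randomised programme offer `z` acts as the instrument, uptake `a` is only possible when the programme is offered, and an unmeasured variable `u` confounds uptake and outcome. The added assumptions are that `z` affects the outcome only through `a` and that the effect of `a` is constant (assumed to be 2); all names and parameter values are illustrative.

```python
import random

def simulate_iv(n=50_000, seed=3):
    """Randomised offer Z, confounded uptake A, outcome Y; U is unmeasured."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = 1 if rng.random() < 0.5 else 0          # unmeasured confounder
        z = 1 if rng.random() < 0.5 else 0          # randomised offer (instrument)
        p_uptake = 0.8 if u == 1 else 0.4           # uptake depends on U
        a = 1 if (z == 1 and rng.random() < p_uptake) else 0  # no uptake without an offer
        y = 2.0 * a + 3.0 * u + rng.gauss(0.0, 1.0) # Z affects Y only through A
        rows.append((z, a, y))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def as_treated(rows):
    """Naive contrast by uptake; biased because U is not adjusted for."""
    return (mean(y for _, a, y in rows if a == 1)
            - mean(y for _, a, y in rows if a == 0))

def wald(rows):
    """Ratio of the effect of Z on Y to the effect of Z on A."""
    dy = (mean(y for z, _, y in rows if z == 1)
          - mean(y for z, _, y in rows if z == 0))
    da = (mean(a for z, a, _ in rows if z == 1)
          - mean(a for z, a, _ in rows if z == 0))
    return dy / da

rows_iv = simulate_iv()
est_as_treated, est_wald = as_treated(rows_iv), wald(rows_iv)
print(f"as-treated: {est_as_treated:.3f}  Wald IV: {est_wald:.3f}")
```

The Wald ratio recovers the assumed effect here only because the effect is constant across units; with heterogeneous effects it would, under monotonicity, target the effect among those whose uptake responds to the offer.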

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper is an expository tutorial on formulating causal questions for point exposures using the potential outcomes framework. It defines causal effects, classifies estimation approaches according to reliance on the no unmeasured confounding (NUC) assumption (outcome regression and propensity-score methods) versus instrumental-variable methods with added assumptions, discusses pitfalls, and illustrates the material via a simulation learner that generates linked exposure types, covariates, and outcomes mimicking a breastfeeding intervention study on child development, with true causal effects available for comparison and R/SAS/Stata code provided.

Significance. If the simulation learner's generative model is representative, the tutorial could provide a useful entry point for applied researchers by linking precise causal question formulation to method selection and by supplying reproducible examples that demonstrate method performance against known truths.

major comments (1)
  1. [Simulation learner] Simulation learner section: the claim that the generated data 'mimic' observational or randomised studies and thereby illustrate broadly applicable pitfalls rests on the unvalidated assumption that the specific confounding structures, positivity properties, and exposure-type linkages are representative; no sensitivity analyses over simulation parameters or external validation against real-study distributions are reported, which is load-bearing for the tutorial's guidance value.
minor comments (2)
  1. [Abstract/Introduction] The abstract and introduction should include a short explicit statement of the target audience (e.g., applied statisticians with basic regression knowledge) to help readers assess fit.
  2. [Potential outcomes framework] Notation for potential outcomes and exposure levels is introduced clearly but would benefit from a single consolidated table of symbols and assumptions early in the text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their positive assessment and recommendation for minor revision. We address the single major comment below.

Point-by-point responses
  1. Referee: [Simulation learner] Simulation learner section: the claim that the generated data 'mimic' observational or randomised studies and thereby illustrate broadly applicable pitfalls rests on the unvalidated assumption that the specific confounding structures, positivity properties, and exposure-type linkages are representative; no sensitivity analyses over simulation parameters or external validation against real-study distributions are reported, which is load-bearing for the tutorial's guidance value.

    Authors: We agree that the simulation learner is constructed from a specific data-generating process chosen to reflect features of a breastfeeding intervention study and that no sensitivity analyses or external validation against real-study distributions are reported. The simulation is presented as an illustrative example that allows readers to compare estimated effects against known truths while demonstrating the formulation of causal questions, the role of the no unmeasured confounding assumption, and common pitfalls such as positivity violations. We will revise the relevant sections to state explicitly that the example is intended to be didactic rather than representative of all observational or randomized settings, and we will add a note directing readers to the provided code so they may alter parameters if desired.

    Revision: yes

Circularity Check

0 steps flagged

Expository tutorial with no derivations or fitted predictions

Full rationale

The paper is a review/tutorial that phrases causal questions, defines effects via the potential outcomes framework, classifies estimation approaches (NUC-based vs. IV), and illustrates pitfalls with a simulation learner that generates data and true effects. No equations, parameters, or predictions are fitted to data and then re-presented as independent results. The simulation is described as external code (R/SAS/Stata on ofcaus.org) that provides ground truth by construction, but the paper makes no claim that its own statistical results derive from or reduce to those inputs. No self-citation chains or uniqueness theorems are invoked as load-bearing. The central contribution is expository classification and guidance, which remains independent of the specific simulation example.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

As a tutorial the paper rests on standard statistical assumptions already present in the causal-inference literature rather than introducing new free parameters, axioms, or entities.

axioms (1)
  • domain assumption: No unmeasured confounding or instrumental-variable assumptions are sufficient for identification in the settings considered.
    Invoked when classifying estimation approaches in the abstract.

pith-pipeline@v0.9.0 · 5804 in / 1152 out tokens · 21059 ms · 2026-05-25T14:08:15.408821+00:00 · methodology

