Formulating causal questions and principled statistical answers
Pith reviewed 2026-05-25 14:08 UTC · model grok-4.3
The pith
Causal effects for point exposures are defined using potential outcomes, with estimators grouped by no unmeasured confounding or instrumental variable assumptions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using the potential outcomes framework, causal effects are defined for specific exposure levels in defined populations, and estimation approaches are classified according to whether they invoke the no unmeasured confounding assumption (including outcome regression and propensity score-based methods) or an instrumental variable with added assumptions.
What carries the argument
The potential outcomes framework, which assigns to each unit the outcome that would be observed under each possible exposure level.
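To make this definition concrete, here is a minimal Python sketch. It is not taken from the paper's R code: the data-generating values (a baseline mean of 100, an average unit-level effect of 5) and all names such as `simulate_units` are hypothetical choices for illustration. It assumes a randomised binary exposure and shows that each unit carries two potential outcomes, of which only one is ever observed.

```python
import random
import statistics

def simulate_units(n, seed=0):
    """Generate n units, each with both potential outcomes (hypothetical toy model)."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(100.0, 10.0)         # potential outcome if unexposed
        y1 = y0 + rng.gauss(5.0, 2.0)       # potential outcome if exposed; effect varies by unit
        a = 1 if rng.random() < 0.5 else 0  # randomised exposure, independent of (y0, y1)
        y = y1 if a == 1 else y0            # observed outcome is the potential outcome for the received level
        units.append({"y0": y0, "y1": y1, "a": a, "y": y})
    return units

units = simulate_units(20000)

# The true average causal effect needs both potential outcomes, which only a simulation provides.
true_ate = statistics.mean(u["y1"] - u["y0"] for u in units)

# With real data only (a, y) are observed; under randomisation the difference in means targets the same quantity.
diff_means = (statistics.mean(u["y"] for u in units if u["a"] == 1)
              - statistics.mean(u["y"] for u in units if u["a"] == 0))

print(f"true ATE: {true_ate:.2f}, difference in observed means: {diff_means:.2f}")
```

Because the exposure here is assigned independently of the potential outcomes, the two printed numbers should agree up to sampling error.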
If this is right
- Causal questions become well-defined when exposure levels and relevant populations are explicitly stated.
- Outcome regression and propensity score methods apply under the no unmeasured confounding assumption.
- Instrumental variable methods can be used when that assumption fails but valid instruments exist.
- Simulation learners with known true effects allow direct evaluation of method performance and pitfalls.
- Public code enables readers to replicate and adapt the analyses to their data.
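The points above on no-unmeasured-confounding methods and on evaluation against known truths can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's simulation learner: one measured binary confounder L, a constant true effect of 2, and hypothetical names throughout. It compares a naive contrast, standardisation of stratum-specific means (an outcome regression saturated in L), and inverse probability weighting.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed constant effect, chosen for illustration

def simulate(n, seed=1):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = 1 if rng.random() < 0.5 else 0       # measured binary confounder L
        p_a = 0.8 if l == 1 else 0.2             # exposure probability depends on L
        a = 1 if rng.random() < p_a else 0
        y0 = 10.0 * l + rng.gauss(0.0, 1.0)      # L also raises the outcome
        y1 = y0 + TRUE_EFFECT
        rows.append((l, a, y1 if a == 1 else y0))
    return rows

def mean_y(rows, a, l=None):
    """Mean observed outcome among units with exposure a (optionally within stratum l)."""
    return statistics.mean(y for (li, ai, y) in rows if ai == a and (l is None or li == l))

rows = simulate(20000)
n = len(rows)

# Naive contrast: ignores L and is confounded.
naive = mean_y(rows, 1) - mean_y(rows, 0)

# Outcome regression saturated in L: average stratum-specific contrasts over the distribution of L.
p_l1 = sum(l for (l, _, _) in rows) / n
standardised = sum(w * (mean_y(rows, 1, l) - mean_y(rows, 0, l))
                   for l, w in ((0, 1 - p_l1), (1, p_l1)))

# Inverse probability weighting with the propensity score estimated within each stratum of L.
ps = {l: sum(a for (li, a, _) in rows if li == l) / sum(1 for (li, _, _) in rows if li == l)
      for l in (0, 1)}
ipw = (sum(y / ps[l] for (l, a, y) in rows if a == 1) / n
       - sum(y / (1 - ps[l]) for (l, a, y) in rows if a == 0) / n)

print(f"truth={TRUE_EFFECT:.2f} naive={naive:.2f} standardised={standardised:.2f} ipw={ipw:.2f}")
```

In this toy setting the propensity model is saturated in a single binary covariate, so the weighted and standardised estimates coincide up to floating-point error. With richer covariates or parametric models they generally differ.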
Where Pith is reading between the lines
- The emphasis on precise question formulation may reduce misuse of causal methods in applied research.
- This classification could guide extensions to settings with time-varying exposures or multiple outcomes.
- Simulation-based teaching tools like the one described might improve training in causal inference.
- Readers could test the methods on their own observational datasets to see if results align with simulation insights.
Load-bearing premise
That the generated data in the simulation learner sufficiently captures the structure of real observational or randomized studies for the methods and pitfalls to generalize.
What would settle it
An empirical study with known true causal effects from randomization where the no-confounding methods and IV methods yield estimates that contradict the simulation-based expectations for the same data structure.
Original abstract
Although review papers on causal inference methods are now available, there is a lack of introductory overviews on what they can render and on the guiding criteria for choosing one particular method. This tutorial gives an overview in situations where an exposure of interest is set at a chosen baseline ('point exposure') and the target outcome arises at a later time point. We first phrase relevant causal questions and make a case for being specific about the possible exposure levels involved and the populations for which the question is relevant. Using the potential outcomes framework, we describe principled definitions of causal effects and of estimation approaches classified according to whether they invoke the no unmeasured confounding assumption (including outcome regression and propensity score-based methods) or an instrumental variable with added assumptions. We discuss challenges and potential pitfalls and illustrate application using a 'simulation learner' that mimics the effect of various breastfeeding interventions on a child's later development. This involves a typical simulation component with generated exposure, covariate, and outcome data that mimic those from an observational or randomised intervention study. The simulation learner further generates various (linked) exposure types with a set of possible values per observation unit, from which observed as well as potential outcome data are generated. It thus provides true values of several causal effects. R code for data generation and analysis is available on www.ofcaus.org, where SAS and Stata code for analysis is also provided.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper is an expository tutorial on formulating causal questions for point exposures using the potential outcomes framework. It defines causal effects, classifies estimation approaches according to reliance on the no unmeasured confounding (NUC) assumption (outcome regression and propensity-score methods) versus instrumental-variable methods with added assumptions, discusses pitfalls, and illustrates the material via a simulation learner that generates linked exposure types, covariates, and outcomes mimicking a breastfeeding intervention study on child development, with true causal effects available for comparison and R/SAS/Stata code provided.
Significance. If the simulation learner's generative model is representative, the tutorial could provide a useful entry point for applied researchers by linking precise causal question formulation to method selection and by supplying reproducible examples that demonstrate method performance against known truths.
major comments (1)
- [Simulation learner] Simulation learner section: the claim that the generated data 'mimic' observational or randomised studies and thereby illustrate broadly applicable pitfalls rests on the unvalidated assumption that the specific confounding structures, positivity properties, and exposure-type linkages are representative; no sensitivity analyses over simulation parameters or external validation against real-study distributions are reported, which is load-bearing for the tutorial's guidance value.
minor comments (2)
- [Abstract/Introduction] The abstract and introduction should include a short explicit statement of the target audience (e.g., applied statisticians with basic regression knowledge) to help readers assess fit.
- [Potential outcomes framework] Notation for potential outcomes and exposure levels is introduced clearly but would benefit from a single consolidated table of symbols and assumptions early in the text.
Simulated Author's Rebuttal
We thank the referee for their positive assessment and recommendation for minor revision. We address the single major comment below.
Point-by-point responses
- Referee: [Simulation learner] Simulation learner section: the claim that the generated data 'mimic' observational or randomised studies and thereby illustrate broadly applicable pitfalls rests on the unvalidated assumption that the specific confounding structures, positivity properties, and exposure-type linkages are representative; no sensitivity analyses over simulation parameters or external validation against real-study distributions are reported, which is load-bearing for the tutorial's guidance value.
Authors: We agree that the simulation learner is constructed from a specific data-generating process chosen to reflect features of a breastfeeding intervention study and that no sensitivity analyses or external validation against real-study distributions are reported. The simulation is presented as an illustrative example that allows readers to compare estimated effects against known truths while demonstrating the formulation of causal questions, the role of the no unmeasured confounding assumption, and common pitfalls such as positivity violations. We will revise the relevant sections to state explicitly that the example is intended to be didactic rather than representative of all observational or randomized settings, and we will add a note directing readers to the provided code so they may alter parameters if desired.
Revision: yes
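The kind of parameter variation discussed in this exchange could look roughly like the following Python sketch. It does not use the paper's code or its data-generating process: the single binary confounder L, the true effect of 2, the `strength` parameter, and the function name `naive_bias` are all assumptions made for illustration. It sweeps how strongly L drives exposure and reports both the bias of the unadjusted contrast and the smallest exposure probability, which approaches zero (a near-violation of positivity) as the strength grows.

```python
import random
import statistics

def naive_bias(strength, n=20000, seed=3):
    """Bias of the unadjusted contrast when L shifts P(A=1) by +/- strength/2 (toy model)."""
    rng = random.Random(seed)
    exposed, unexposed = [], []
    for _ in range(n):
        l = 1 if rng.random() < 0.5 else 0
        p_a = 0.5 + strength * (l - 0.5)                # 0.5 - s/2 if L=0, 0.5 + s/2 if L=1
        a = 1 if rng.random() < p_a else 0
        y = 2.0 * a + 10.0 * l + rng.gauss(0.0, 1.0)    # assumed true effect of 2
        (exposed if a == 1 else unexposed).append(y)
    return statistics.mean(exposed) - statistics.mean(unexposed) - 2.0

strengths = [0.0, 0.3, 0.6, 0.9]
biases = [naive_bias(s) for s in strengths]

for s, b in zip(strengths, biases):
    print(f"strength={s:.1f}  min P(A=1|L)={0.5 - s / 2:.2f}  naive bias={b:.2f}")
```

In this toy model the expected bias of the naive contrast is ten times the strength parameter, so the sweep makes the dependence of conclusions on the chosen simulation settings directly visible.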
Circularity Check
Expository tutorial with no derivations or fitted predictions
Full rationale
The paper is a review/tutorial that phrases causal questions, defines effects via the potential outcomes framework, classifies estimation approaches (NUC-based vs. IV), and illustrates pitfalls with a simulation learner that generates data and true effects. No equations, parameters, or predictions are fitted to data and then re-presented as independent results. The simulation is described as external code (R/SAS/Stata on ofcaus.org) that provides ground truth by construction, but the paper makes no claim that its own statistical results derive from or reduce to those inputs. No self-citation chains or uniqueness theorems are invoked as load-bearing. The central contribution is expository classification and guidance, which remains independent of the specific simulation example.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: either no unmeasured confounding or the instrumental-variable assumptions suffice for identification in the settings considered.
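To illustrate the instrumental-variable branch of this assumption, the following Python sketch computes the standard Wald ratio estimator in a toy setting. Everything about the data-generating process is an assumption of this example rather than the paper's: a randomised binary instrument Z, an unmeasured confounder U, a constant exposure effect of 2, no direct effect of Z on the outcome, and the hypothetical function name `simulate_iv`.

```python
import random
import statistics

def simulate_iv(n, seed=2):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0          # randomised instrument (for example, an offer)
        u = rng.gauss(0.0, 1.0)                     # unmeasured confounder
        # Uptake depends on z and u; z can only increase uptake for a given unit.
        a = 1 if (u + 1.5 * z + rng.gauss(0.0, 1.0)) > 0.75 else 0
        y = 2.0 * a + 3.0 * u + rng.gauss(0.0, 1.0)  # z affects y only through a
        rows.append((z, a, y))
    return rows

rows = simulate_iv(100000)

def mean_where(rows, index, key, value):
    """Mean of column `index` among rows whose column `key` equals `value`."""
    return statistics.mean(r[index] for r in rows if r[key] == value)

# Naive exposed-versus-unexposed contrast: biased because u drives both a and y.
naive = mean_where(rows, 2, 1, 1) - mean_where(rows, 2, 1, 0)

# Wald estimator: effect of z on y divided by effect of z on a.
itt_y = mean_where(rows, 2, 0, 1) - mean_where(rows, 2, 0, 0)
itt_a = mean_where(rows, 1, 0, 1) - mean_where(rows, 1, 0, 0)
wald = itt_y / itt_a

print(f"naive={naive:.2f}  wald={wald:.2f}  (assumed true effect 2.00)")
```

The naive contrast is far from 2 because U is not adjusted for, while the Wald ratio recovers the assumed effect up to sampling error. With effects that vary across units, the ratio would instead target an effect among those whose exposure responds to the instrument.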