Generalizing causal inferences from randomized trials: counterfactual and graphical identification

Issa J. Dahabreh; James M. Robins; Miguel A. Hernán; Sebastien J-P.A. Haneuse

arxiv: 1906.10792 · v1 · pith:ETAEMKAAnew · submitted 2019-06-26 · 📊 stat.ME · stat.AP

Pith reviewed 2026-05-25 15:52 UTC · model grok-4.3

classification: stat.ME, stat.AP

keywords: generalizability, causal inference, randomized trials, counterfactual models, graphical models, g-formula, inverse probability weighting, target population

The pith

Counterfactual and graphical models identify conditions for generalizing randomized trial results to target populations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines when average treatment effects estimated in a randomized trial can be extended to the full population of individuals eligible for that trial. It models trial engagement as a factor that may share causes with the outcome or affect the outcome directly. Generalization holds only under specific conditions captured in counterfactual distributions or causal graphs. The authors interpret the required generalization step as the result of a hypothetical intervention that expands trial engagement to everyone eligible. Identification proceeds via g-formula computation or inverse probability weighting and extends to settings with time-varying treatments and censoring.

Core claim

We use counterfactual and graphical causal models to examine under what conditions we can generalize causal inferences from a randomized trial to the target population of trial-eligible individuals. We offer an interpretation of generalizability analyses using the notion of a hypothetical intervention to scale-up trial engagement to the target population. We consider the interpretation of generalizability analyses when trial engagement does or does not directly affect the outcome, highlight connections with censoring in longitudinal studies, and discuss identification of the distribution of counterfactual outcomes via g-formula computation and inverse probability weighting.

What carries the argument

The hypothetical intervention to scale-up trial engagement to the target population, represented in counterfactual outcomes and directed acyclic graphs.

If this is right

Generalization is feasible when trial engagement shares no unmeasured common causes with the outcome and does not directly affect the outcome.
The distribution of counterfactual outcomes under treatment in the target population is identified by the g-formula or by inverse probability weighting.
The same identification strategies apply when extending the methods to time-varying treatments, non-adherence, and censoring.
Connections between trial engagement and censoring allow the framework to address loss to follow-up in longitudinal studies.
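
The two identification routes named above can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data-generating process, the single binary covariate `x`, and the function names are hypothetical, and the symbols S (engagement), Z (treatment), and X (covariate) only loosely follow the labels in the paper's Figure 4. The sketch assumes that engagement depends on X alone and has no direct effect on the outcome.

```python
import random


def simulate(n, seed=0):
    """Simulate n trial-eligible individuals as (x, s, z, y) tuples.

    Hypothetical data-generating process, not taken from the paper.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)                   # baseline covariate X
        s = int(rng.random() < (0.6 if x else 0.2))   # engagement S depends on X only
        if s:
            z = int(rng.random() < 0.5)               # Z randomized among participants
            y = 1 + 2 * x + z * (1 + x) + rng.gauss(0, 1)
        else:
            z, y = None, None                         # no trial data for non-participants
        rows.append((x, s, z, y))
    return rows


def stratum_counts(rows, z):
    """Counts per covariate level: all eligible, participants, participants in arm z."""
    counts = {}
    for x, s, zi, _ in rows:
        c = counts.setdefault(x, {"all": 0, "s": 0, "sz": 0})
        c["all"] += 1
        c["s"] += s
        c["sz"] += int(s == 1 and zi == z)
    return counts


def g_formula(rows, z):
    """Standardize arm-z participant means to the covariate distribution of all eligible."""
    counts = stratum_counts(rows, z)
    n = len(rows)
    est = 0.0
    for x, c in counts.items():
        ys = [y for xi, s, zi, y in rows if xi == x and s == 1 and zi == z]
        est += (c["all"] / n) * (sum(ys) / len(ys))
    return est


def ipw(rows, z):
    """Weight arm-z participants by 1 / (Pr[S=1|X] * Pr[Z=z|X,S=1]); average over all eligible."""
    counts = stratum_counts(rows, z)
    total = 0.0
    for x, s, zi, y in rows:
        if s == 1 and zi == z:
            p_s = counts[x]["s"] / counts[x]["all"]   # estimated Pr[S=1 | X=x]
            p_z = counts[x]["sz"] / counts[x]["s"]    # estimated Pr[Z=z | X=x, S=1]
            total += y / (p_s * p_z)
    return total / len(rows)


rows = simulate(20000, seed=1)
ate_g = g_formula(rows, 1) - g_formula(rows, 0)
ate_ipw = ipw(rows, 1) - ipw(rows, 0)
trial = [r for r in rows if r[1] == 1]
ate_trial = (sum(r[3] for r in trial if r[2] == 1) / sum(1 for r in trial if r[2] == 1)
             - sum(r[3] for r in trial if r[2] == 0) / sum(1 for r in trial if r[2] == 0))
print(f"g-formula: {ate_g:.3f}  IPW: {ate_ipw:.3f}  trial-only: {ate_trial:.3f}")
```

In this simulated setup the effect is larger when X = 1 and such individuals are over-represented among participants, so the trial-only contrast targets 1.75 while the target-population effect is 1.5 by construction. With a discrete covariate and empirical frequencies, the two estimators coincide up to rounding, which is a property of this nonparametric sketch rather than of model-based estimators in general.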

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same scale-up logic could be applied to observational data to assess transportability of effects across populations.
Trial protocols could be redesigned to collect data on factors that drive engagement, making the no-new-confounders assumption easier to check.
Policy decisions that rely on trial results for broad populations would need separate verification that the scale-up assumption holds in practice.

Load-bearing premise

Trial engagement can be conceptualized as a modifiable intervention whose scale-up to the target population does not introduce new unmeasured common causes with the outcome beyond those already represented in the graphs or counterfactuals.

What would settle it

Empirical evidence that expanding trial engagement to the full eligible population creates new unmeasured factors that jointly affect both engagement and the outcome would show the generalization conditions do not hold.

Figures

Figures reproduced from arXiv: 1906.10792 by Issa J. Dahabreh, James M. Robins, Miguel A. Hernán, Sebastien J-P.A. Haneuse.

**Figure 4.** SWIG for joint intervention on S and Z, in the absence of trial engagement effects. Node labels recovered from the image: X; R | r=1; S^{r=1} | s=1; Z^{r=1,s=1} | z; Y^{z}. [PITH_FULL_IMAGE:figures/full_fig_p023_4.png]

Original abstract

When engagement with a randomized trial is driven by factors that affect the outcome or when trial engagement directly affects the outcome independent of treatment, the average treatment effect among trial participants is unlikely to generalize to a target population. In this paper, we use counterfactual and graphical causal models to examine under what conditions we can generalize causal inferences from a randomized trial to the target population of trial-eligible individuals. We offer an interpretation of generalizability analyses using the notion of a hypothetical intervention to "scale-up" trial engagement to the target population. We consider the interpretation of generalizability analyses when trial engagement does or does not directly affect the outcome, highlight connections with censoring in longitudinal studies, and discuss identification of the distribution of counterfactual outcomes via g-formula computation and inverse probability weighting. Last, we show how the methods can be extended to address time-varying treatments, non-adherence, and censoring.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper recasts generalizability as identifying effects under a scale-up intervention on trial participation, using g-formula and IPW, but the assumption that this intervention adds no new unmeasured confounding paths is the main point that needs checking.

The core contribution here is framing generalization from a trial to the target population as the result of a hypothetical intervention that scales trial engagement up to everyone eligible. They then apply standard counterfactual identification via the g-formula and inverse probability weighting, with graphical models to state the conditions. They also note the analogy to censoring and sketch extensions to time-varying treatments and non-adherence. That intervention reading is a useful way to organize the problem and connects cleanly to existing tools without inventing new machinery.

The abstract shows they consider both cases where participation does and does not directly affect the outcome, which is a reasonable step. The graphical approach helps make the required assumptions explicit rather than leaving them implicit. The stress-test point about the scale-up potentially opening new confounding paths with the outcome is worth pressing, because if engagement is tied to unmeasured factors that also influence the outcome, the identification formulas could miss those paths even if the measured covariates look sufficient. Without the full derivations and any worked examples in the manuscript, it is hard to see exactly how they close those paths or whether positivity holds under the scale-up.

This is a targeted methodological paper for people already comfortable with transportability and g-methods. It organizes existing ideas around a practical problem rather than delivering a large new result, but the framing is clear enough that applied researchers could pick it up. The work is coherent on its own terms and engages the literature honestly, so it deserves a serious referee even if revisions will be needed on the assumption details and examples.

Referee Report

2 major / 3 minor

Summary. The paper develops a framework for generalizing causal effects estimated in a randomized trial to a target population of trial-eligible individuals. It interprets generalizability as the result of a hypothetical intervention that scales trial engagement to the full target population, uses counterfactual and graphical models to characterize identifying conditions (including when engagement affects the outcome), draws parallels to censoring, derives identification via the g-formula and inverse-probability weighting, and extends the approach to time-varying treatments, non-adherence, and censoring.

Significance. If the identification results hold under the stated assumptions, the work supplies a coherent counterfactual-graphical account of generalizability that unifies existing approaches and clarifies the role of trial engagement. The explicit links to censoring and the extensions to longitudinal settings are practically useful for applied researchers.

major comments (2)

[Graphical identification section (around the scale-up intervention)] The central identification claim rests on the assumption that scaling trial engagement does not introduce new unmeasured common causes with the outcome. The manuscript should state the precise graphical or counterfactual conditions that rule out such paths (e.g., in the section presenting the SWIG or the g-formula derivation) and show that they are implied by the trial design and measured covariates.
[Section discussing direct effects of engagement] When trial engagement is allowed to affect the outcome directly, the target quantity is no longer the standard average treatment effect; the paper should supply an explicit expression for the intervened distribution and verify that the g-formula and IPW estimators recover it under the stated assumptions.

minor comments (3)

Add a dedicated table or list that enumerates all identifying assumptions (positivity, consistency, no unmeasured confounding after scaling, etc.) with references to the corresponding equations or graphs.
Clarify notation for the target-population counterfactuals versus the trial-population quantities; inconsistent use of subscripts or superscripts appears in the abstract and early sections.
The connection to censoring is conceptually helpful but would benefit from a short worked numerical example showing how the IPW weights differ from standard censoring weights.
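
A worked example of the kind requested in the last minor comment might look like the sketch below. The stratum probabilities are hypothetical numbers chosen for illustration, and the function names are not from the paper. The sketch assumes the standard forms: a participation weight of 1 / (Pr[S=1|X] · Pr[Z=z|X,S=1]) and an inverse probability of censoring weight of 1 / Pr[C=0|X,S=1,Z=z], multiplied together when both mechanisms are present.

```python
def participation_weight(p_engage, p_treat):
    """Weight for a participant in arm z: 1 / (Pr[S=1|X] * Pr[Z=z|X,S=1])."""
    return 1.0 / (p_engage * p_treat)


def censoring_weight(p_uncensored):
    """Standard inverse probability of censoring weight: 1 / Pr[C=0|X,S=1,Z=z]."""
    return 1.0 / p_uncensored


def combined_weight(p_engage, p_treat, p_uncensored):
    """Product of the two weights, for uncensored participants in arm z."""
    return participation_weight(p_engage, p_treat) * censoring_weight(p_uncensored)


# Hypothetical stratum-specific probabilities (illustrative only).
strata = {
    "X=0": {"p_engage": 0.2, "p_treat": 0.5, "p_uncensored": 0.9},
    "X=1": {"p_engage": 0.6, "p_treat": 0.5, "p_uncensored": 0.8},
}

for label, p in strata.items():
    w_part = participation_weight(p["p_engage"], p["p_treat"])
    w_cens = censoring_weight(p["p_uncensored"])
    w_both = combined_weight(p["p_engage"], p["p_treat"], p["p_uncensored"])
    print(f"{label}: participation {w_part:.2f}, censoring {w_cens:.2f}, combined {w_both:.2f}")
```

The structural parallel is that both weights are inverse probabilities of being observed. The difference in this sketch is the reference group: the participation weight makes participants stand in for all eligible individuals, whereas the censoring weight makes uncensored participants stand in for all participants.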

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and constructive comments on our manuscript. The suggestions help clarify key identification assumptions and target quantities. We address each major comment below and will make the requested additions in the revised version.

Referee: [Graphical identification section (around the scale-up intervention)] The central identification claim rests on the assumption that scaling trial engagement does not introduce new unmeasured common causes with the outcome. The manuscript should state the precise graphical or counterfactual conditions that rule out such paths (e.g., in the section presenting the SWIG or the g-formula derivation) and show that they are implied by the trial design and measured covariates.

Authors: We agree that the no-new-confounding condition for the scale-up intervention should be stated explicitly. In the revised manuscript we will add, in the graphical identification section immediately following the SWIG presentation, the precise counterfactual assumption: Y(a, e=1) ⊥ E* | L, where E* denotes the hypothetical scaled engagement and L are the measured covariates. We will then show that this independence is implied by (i) randomization of treatment within the trial, (ii) the assumption that all common causes of engagement and outcome are captured in L, and (iii) the trial design ensuring no post-randomization variables open new back-door paths under the scale-up. revision: yes
Referee: [Section discussing direct effects of engagement] When trial engagement is allowed to affect the outcome directly, the target quantity is no longer the standard average treatment effect; the paper should supply an explicit expression for the intervened distribution and verify that the g-formula and IPW estimators recover it under the stated assumptions.

Authors: We will revise the section on direct effects of engagement to supply the explicit target distribution P(Y(a, E*=1)) under the scale-up intervention, distinguishing the case in which engagement has a direct effect on the outcome from the case in which it does not. We will then derive the corresponding g-formula and IPW expressions and verify algebraically that both estimators recover this intervened distribution when the stated conditional exchangeability and positivity assumptions hold (including the version that conditions on engagement when a direct effect is present). revision: yes
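
The algebraic check described in this response can be sketched as follows, in the rebuttal's notation. Using A for the observed treatment and E for observed engagement is an assumption made here for illustration (the paper's own figure uses Z and S), as is the positivity condition Pr[E=1, A=a | L] > 0 on the support of L in the target population. The outer expectation is taken over the covariate distribution of all trial-eligible individuals. The display shows only that the weighted mean equals the g-formula functional; linking either one to the counterfactual distribution still requires the exchangeability and consistency conditions discussed above.

```latex
\begin{align*}
\mathrm{E}\!\left[ \frac{I(E=1, A=a)\, Y}{\Pr[E=1 \mid L]\,\Pr[A=a \mid L, E=1]} \right]
&= \mathrm{E}\!\left[ \frac{\mathrm{E}\big[ I(E=1, A=a)\, Y \mid L \big]}{\Pr[E=1, A=a \mid L]} \right] \\
&= \mathrm{E}\!\left[ \frac{\mathrm{E}[Y \mid L, E=1, A=a]\,\Pr[E=1, A=a \mid L]}{\Pr[E=1, A=a \mid L]} \right] \\
&= \mathrm{E}\big[ \mathrm{E}[Y \mid L, E=1, A=a] \big].
\end{align*}
```

The first step uses iterated expectations and the factorization Pr[E=1 | L] Pr[A=a | L, E=1] = Pr[E=1, A=a | L]; the second rewrites the inner expectation of the indicator-weighted outcome as a conditional mean times the joint probability.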

Circularity Check

0 steps flagged

No significant circularity; identification rests on standard counterfactual and graphical assumptions

full rationale

The paper derives conditions for generalizing trial results to a target population by defining a hypothetical scale-up intervention on trial engagement and applying standard g-formula and IPW identification results under explicit counterfactual and graphical assumptions. No step equates a claimed prediction or identification result to a fitted parameter or self-referential definition by construction. No load-bearing self-citation chain is invoked to justify uniqueness or an ansatz; the central claims remain independent of the authors' prior work and are falsifiable against external causal identification benchmarks. The derivation chain is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work relies on standard causal inference domain assumptions (consistency, positivity, exchangeability) applied to the new generalizability setting; no free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: Standard causal assumptions, including consistency, positivity, and conditional exchangeability, hold for both the trial and the target population under the graphical model.
Invoked implicitly when using counterfactuals and graphs for identification of generalized effects.
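
Of the assumptions listed, positivity is the one that can be partly inspected in data. A minimal sketch of such a check, assuming discrete covariate levels and hypothetical records of the form (covariate level, engagement indicator, assigned arm), is given below; the function name and record layout are illustrative and not from the paper.

```python
from collections import Counter


def positivity_violations(rows, arms=(0, 1)):
    """Return (x, z) pairs where covariate level x occurs among the eligible
    but no trial participant with that level was assigned to arm z."""
    levels = {x for x, _, _ in rows}
    seen = Counter((x, z) for x, s, z in rows if s == 1)
    return sorted((x, z) for x in levels for z in arms if seen[(x, z)] == 0)


# Hypothetical records: (covariate level, engaged in trial, assigned arm or None).
example = [(0, 1, 0), (0, 1, 1), (1, 1, 1), (1, 0, None), (2, 0, None)]
print(positivity_violations(example))
```

An empty result does not establish positivity in the population; it only shows that no violation is visible in the sample at the chosen covariate coarsening.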

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Paper passage: "We use counterfactual and graphical causal models to examine under what conditions we can generalize causal inferences from a randomized trial to the target population"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "identification of the distribution of counterfactual outcomes via g-formula computation and inverse probability weighting"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Constructing external comparator groups via transportability in mean or in effect measure
stat.ME · 2026-04 · unverdicted · novelty 6.0
Proposes semiparametric efficient augmented weighting estimators for causal effects under transportability of means or effect measures when appending external comparators to an index trial.

Identification strategies for combining an experimental study with external data
stat.ME · 2024-06 · unverdicted · novelty 4.0
The paper formalizes identification strategies for potential outcome means and average treatment effects when merging experimental studies with external data sources.

Reference graph

Works this paper leans on

55 extracted references · 55 canonical work pages · cited by 2 Pith papers

[1]

Using grou p data to treat indi- viduals: understanding heterogeneous treatment eﬀects in the a ge of precision medicine and patient-centred evidence

Issa J Dahabreh, Rodney Hayward, and David M Kent. Using grou p data to treat indi- viduals: understanding heterogeneous treatment eﬀects in the a ge of precision medicine and patient-centred evidence. International Journal of Epidemiology , 45(6):2184–2193, 2016

work page 2016
[2]

Generalizing evidence from randomized clinical trials to target populations: the ACTG 320 trial

Stephen R Cole and Elizabeth A Stuart. Generalizing evidence from randomized clinical trials to target populations: the ACTG 320 trial. American Journal of Epidemiology , 172(1):107–115, 2010

work page 2010
[3]

Generalizing from unre presentative exper- iments: a stratiﬁed propensity score approach

Colm O’Muircheartaigh and Larry V Hedges. Generalizing from unre presentative exper- iments: a stratiﬁed propensity score approach. Journal of the Royal Statistical Society. Series C (Applied Statistics) , 63(2):195–210, 2014

work page 2014
[4]

Improving generalizations from experiments us ing propensity score subclassiﬁcation assumptions, properties, and contexts

Elizabeth Tipton. Improving generalizations from experiments us ing propensity score subclassiﬁcation assumptions, properties, and contexts. Journal of Educational and Behavioral Statistics, 38(3):239–266, 2012

work page 2012
[5]

Sample selection in randomized experiments : A new method using propensity score stratiﬁed sampling

Elizabeth Tipton, Larry Hedges, Michael Vaden-Kiernan, Geoﬀr ey Borman, Kate Sul- livan, and Sarah Caverly. Sample selection in randomized experiments : A new method using propensity score stratiﬁed sampling. Journal of Research on Educational Eﬀec- tiveness, 7(1):114–135, 2014

work page 2014
[6]

New metho ds for treatment eﬀect calibration, with applications to non-inferiority trials

Zhiwei Zhang, Lei Nie, Guoxing Soon, and Zonghui Hu. New metho ds for treatment eﬀect calibration, with applications to non-inferiority trials. Biometrics, 72(1):20–29, 2016

work page 2016
[7]

Generalizing evidence from randomized trials using inverse probability of sampling w eights

Ashley L Buchanan, Michael G Hudgens, Stephen R Cole, Katie R Mo llan, Paul E Sax, Eric S Daar, Adaora A Adimora, Joseph J Eron, and Michael J Mugave ro. Generalizing evidence from randomized trials using inverse probability of sampling w eights. Journal of the Royal Statistical Society. Series A (Statistics in So ciety), 181(4):1193–1209, 2018. 22

work page 2018
[8]

Generalizing causal inferences from individua ls in randomized trials to all trial-eligible individuals

Issa J Dahabreh, Sarah E Robertson, Eric J Tchetgen Tchetge n, Elizabeth A Stuart, and Miguel A Hern´ an. Generalizing causal inferences from individua ls in randomized trials to all trial-eligible individuals. Biometrics, 2018

work page 2018
[9]

A multiphase design strategy for dealing with partici- pation bias

Sebastien Haneuse and J Chen. A multiphase design strategy for dealing with partici- pation bias. Biometrics, 67(1):309–318, 2011

work page 2011
[10]

Chris A Rogers, Richard Welbourn, James Byrne, Jenny L Donov an, Barnaby C Reeves, Sarah Wordsworth, Robert Andrews, Janice L Thompson, Paul Ro derick, David Mahon, et al. The by-band study: gastric bypass or adjustable gastric ba nd surgery to treat morbid obesity: study protocol for a multi-centre randomised con trolled trial with an internal pilot phas...

work page 2014
[11]

Perils and potentials of self-selected entry to epidemiological studies and surveys

MA Hern´ an. Discussion of “Perils and potentials of self-selected entry to epidemiological studies and surveys”. Journal of the Royal Statistical Society. Series A (Statist ics in Society), 179(2):346–347, 2016

work page 2016
[12]

trial eﬀect

David A Braunholtz, Sarah JL Edwards, and Richard J Lilford. Ar e randomized clinical trials good for us (in the short term)? Evidence for a “trial eﬀect” . Journal of Clinical Epidemiology, 54(3):217–224, 2001

work page 2001
[13]

Comparison of outcomes in cancer patients treated within and outside clinical tr ials: conceptual framework and structured review

Jeﬀrey M Peppercorn, Jane C Weeks, E Francis Cook, and Stev en Joﬀe. Comparison of outcomes in cancer patients treated within and outside clinical tr ials: conceptual framework and structured review. The Lancet, 363(9405):263–270, 2004

work page 2004
[14]

Hawthorne Revisited: Management and the Worker, Its Critic s, and Developments in Human Relations in Industry

Henry A Landsberger. Hawthorne Revisited: Management and the Worker, Its Critic s, and Developments in Human Relations in Industry. Cornell Studies in Industrial and Labor Relations. Cornell University, Ithaca, NY, 1958

work page 1958
[15]

Randomization and social policy evaluation

James J Heckman. Randomization and social policy evaluation. Te chnical Report 107, National Bureau of Economic Research, Cambridge, Mass., USA, 19 91. 23

work page
[16]

Single world interventio n graphs: a primer

Thomas S Richardson and James M Robins. Single world interventio n graphs: a primer. In Second UAI workshop on causal structure learning, Bellevue , Washington , 2013

work page 2013
[17]

Single world interventio n graphs (SWIGs): A uniﬁcation of the counterfactual and graphical approaches to causality

Thomas S Richardson and James M Robins. Single world interventio n graphs (SWIGs): A uniﬁcation of the counterfactual and graphical approaches to causality. Technical Report 128, Center for Statistics and the Social Sciences, Univer sity of Washington, 2013

work page 2013
[18]

Causality

Judea Pearl. Causality. Cambridge University Press, Cambridge, UK, 2nd edition, 2009

work page 2009
[19]

Causation, prediction, and search

Peter Spirtes, Clark N Glymour, Richard Scheines, David Hecker man, Christopher Meek, Gregory Cooper, and Thomas Richardson. Causation, prediction, and search . MIT press, 2000

work page 2000
[20]

Causal inference withou t counterfactuals: comment

James M Robins and Sander Greenland. Causal inference withou t counterfactuals: comment. Journal of the American Statistical Association , 95(450):431–435, 2000

work page 2000
[21]

Estimating causal eﬀects of treatments in rand omized and nonran- domized studies

Donald B Rubin. Estimating causal eﬀects of treatments in rand omized and nonran- domized studies. Journal of Educational Psychology , 66(5):688, 1974

work page 1974
[22]

D oes informed consent inﬂuence therapeutic outcome? a clinical trial of the hypn otic activity of placebo in patients admitted to hospital

R Dahan, C Caulin, L Figea, JA Kanis, F Caulin, and JM Segrestaa. D oes informed consent inﬂuence therapeutic outcome? a clinical trial of the hypn otic activity of placebo in patients admitted to hospital. Br Med J (Clin Res Ed) , 293(6543):363–364, 1986

work page 1986
[23]

Does water kill? a call for less casual causal in ferences

Miguel A Hern´ an. Does water kill? a call for less casual causal in ferences. Annals of Epidemiology, 26(10):674–680, 2016

work page 2016
[24]

Statistics and causal inference

Paul W Holland. Statistics and causal inference. Journal of the American Statistical Association, 81(396):945–960, 1986

work page 1986
[25]

Do treatment protocols im prove end results? a study of survival of patients with multiple myeloma in ﬁnland

Sakari Karjalainen and Ilmari Palva. Do treatment protocols im prove end results? a study of survival of patients with multiple myeloma in ﬁnland. BMJ, 299(6707):1069– 1072, 1989. 24

work page 1989
[26]

Compound treatments and transportability of causal inference

Miguel A Hern´ an and Tyler J VanderWeele. Compound treatments and transportability of causal inference. Epidemiology (Cambridge, Mass.) , 22(3):368, 2011

work page 2011
[27]

Causal inference (forthcoming)

Miguel A Hern´ an and James M Robins. Causal inference (forthcoming) . Chapman & Hall/CRC, Boca Raton, FL, 2019

work page 2019
[28]

Randomization analysis of experime ntal data: the Fisher randomization test

Donald B Rubin. Discussion of “Randomization analysis of experime ntal data: the Fisher randomization test”. Journal of the American Statistical Association , 75(371):591–593, 1980

work page 1980
[29]

Reﬂections stimulated by the comments of Shadis h (2010) and West and Thoemmes (2010)

Donald B Rubin. Reﬂections stimulated by the comments of Shadis h (2010) and West and Thoemmes (2010). Psychological Methods, 15(1):38–46, 2010

work page 2010
[30]

Concerning the consistency assumption in causal inference

Tyler J VanderWeele. Concerning the consistency assumption in causal inference. Epi- demiology, 20(6):880–883, 2009

work page 2009
[31]

The rela- tionship of surgeon and hospital volume to outcome after gastric b ypass surgery in pennsylvania: a 3-year summary

Anita Courcoulas, Matthew Schuchert, Guido Gatti, and James Luketich. The rela- tionship of surgeon and hospital volume to outcome after gastric b ypass surgery in pennsylvania: a 3-year summary. Surgery, 134(4):613–621, 2003

work page 2003
[32]

Charac- terizing the performance and outcomes of obesity surgery in califo rnia

Jerome H Liu, David Zingmond, David A Etzioni, Jessica B O’Connell, e t al. Charac- terizing the performance and outcomes of obesity surgery in califo rnia. The American Surgeon, 69(10):823, 2003

work page 2003
[33]

The relationship between hospital volume and outcome in bariatric surgery at academic medical centers

Ninh T Nguyen, Mahbod Paya, C Melinda Stevens, Shahrzad Mava ndadi, Kambiz Zain- abadi, and Samuel E Wilson. The relationship between hospital volume and outcome in bariatric surgery at academic medical centers. Annals of Surgery , 240(4):586, 2004

work page 2004
[34]

Causal diagrams for interference

Elizabeth L Ogburn, Tyler J VanderWeele, et al. Causal diagrams for interference. Statistical science, 29(4):559–578, 2014

work page 2014
[35]

Understanding and misun derstanding random- ized controlled trials

Angus Deaton and Nancy Cartwright. Understanding and misun derstanding random- ized controlled trials. Social Science & Medicine (1982) , 210:2–21, 2018. 25

work page 1982
[36]

Using implementation intentions prompts to enhance inﬂuen za vaccination rates

Katherine L Milkman, John Beshears, James J Choi, David Laibson , and Brigitte C Madrian. Using implementation intentions prompts to enhance inﬂuen za vaccination rates. Proceedings of the National Academy of Sciences , 108(26):10415–10420, 2011

work page 2011
[37]

Invited commentary: e very good randomization deserves observation

Daniel Westreich and Jessie K Edwards. Invited commentary: e very good randomization deserves observation. American Journal of Epidemiology , 182(10):857–860, 2015

work page 2015
[38]

Association, causation, and marginal structu ral models

James M Robins. Association, causation, and marginal structu ral models. Synthese, 121(1-2):151–179, 1999

work page 1999
[39]

Marginal struc- tural models and causal inference in epidemiology

James M Robins, Miguel Angel Hern´ an, and Babette Brumback . Marginal struc- tural models and causal inference in epidemiology. Epidemiology (Cambridge, Mass.) , 11(5):550–560, 2000

work page 2000
[40]

Marginal structural models versus structur al nested models as tools for causal inference

James M Robins. Marginal structural models versus structur al nested models as tools for causal inference. In Statistical models in epidemiology, the environment, and c linical trials, pages 95–133. Springer, 2000

work page 2000
[41]

Extending inferences f rom a randomized trial to a target population

Issa J Dahabreh and Miguel A Hern´ an. Extending inferences f rom a randomized trial to a target population. European Journal of Epidemiology , pages 1–4, 2019

work page 2019
[42]

From SATE to PATT: combining experimental with observational studies to est imate population treatment eﬀects

Erin Hartman, Richard Grieve, Roland Ramsahai, and Jasjeet S S ekhon. From SATE to PATT: combining experimental with observational studies to est imate population treatment eﬀects. Journal of the Royal Statistical Society Series A (Statisti cs in Society), 10:1111, 2013

work page 2013
[43]

All generalizations are dangerous, even this o ne

Laura B Balzer. “All generalizations are dangerous, even this o ne.”—Alexandre Dumas. Epidemiology, 28(4):562–566, 2017

work page 2017
[44]

Perils and potentials of self-selec ted entry to epidemiological studies and surveys

Niels Keiding and Thomas A Louis. Perils and potentials of self-selec ted entry to epidemiological studies and surveys. Journal of the Royal Statistical Society. Series A (Statistics in Society) , 179(2):319–376, 2016. 26

work page 2016
[45]

Estimating treatment eﬀect via simple cross desig n synthesis

Eloise E Kaizar. Estimating treatment eﬀect via simple cross desig n synthesis. Statistics in Medicine , 30(25):2986–3009, 2011

work page 2011
[46]

Robust estimation of en couragement design intervention eﬀects transported across sites

Kara E Rudolph and Mark J van der Laan. Robust estimation of en couragement design intervention eﬀects transported across sites. Journal of the Royal Statistical Society. Series B (Statistical Methodology) , 79(5):1509–1525, 2017

work page 2017
[47]

Transportability of causal an d statistical relations: A formal approach

Judea Pearl and Elias Bareinboim. Transportability of causal an d statistical relations: A formal approach. In Data Mining Workshops (ICDMW), 2011 IEEE 11th Internationa l Conference on, pages 540–547. IEEE, 2011

work page 2011
[48]

Transportability of causal eﬀe cts: Completeness results

Elias Bareinboim and Judea Pearl. Transportability of causal eﬀe cts: Completeness results. In AAAI, pages 698–704, 2012

work page 2012
[49]

External validity: from do-ca lculus to transporta- bility across populations

Judea Pearl and Elias Bareinboim. External validity: from do-ca lculus to transporta- bility across populations. Statistical Science, 29(4):579–595, 2014

work page 2014
[50]

Use of electronic healthcare records in large- scale simple randomized trials at the point of care for the documenta tion of value-based medicine

T-P Staa, O Klungel, and L Smeeth. Use of electronic healthcare records in large- scale simple randomized trials at the point of care for the documenta tion of value-based medicine. Journal of Internal Medicine , 275(6):562–569, 2014

work page 2014
[51]

The opportunities and challenges of pragmatic poin t-of-care ran- domised trials using routinely collected electronic records: evaluatio ns of two exemplar trials

Tjeerd-Pieter van Staa, Lisa Dyson, Gerard McCann, Shivani Padmanabhan, Rabah Belatri, Ben Goldacre, Jackie Cassell, Munir Pirmohamed, David Torge rson, Sarah Ronaldson, et al. The opportunities and challenges of pragmatic poin t-of-care ran- domised trials using routinely collected electronic records: evaluatio ns of two exemplar trials. Health Technolog...

work page 2014
[52]

Randomized, controlled trials in health insur ance systems

Niteesh K Choudhry. Randomized, controlled trials in health insur ance systems. New England Journal of Medicine , 377(10):957–964, 2017. 27

[53] James M Robins. A new approach to causal inference in mortality studies with a sustained exposure period – application to control of the healthy worker survivor effect. Mathematical Modelling, 7(9):1393–1512, 1986.

[54] Heejung Bang and James M Robins. Doubly robust estimation in missing data and causal inference models. Biometrics, 61(4):962–973, 2005.

generalizability conceptual, Date: 27/06/2019 00.45.32 Revision: 31.0

Appendix A Brief overview of Single World Intervention Graphs (SWIGs)

Starting with a causal DAG about the factual (i.e., observable, even if un...

as follows: E[ Pr[Y ≤ y | X, R = 1, S = 1, Z = z] ] = E[ Pr[Y ≤ y | X, S = 1, Z = z] Pr[R = 1, S = 1, Z = z | X] / Pr[R = 1, S = 1, Z = z | X] ] = E[ E[ I(Y ≤ y, R = 1, S = 1, Z = z) / (Pr[R = 1, S = 1 | X] Pr[ ...
