Heterogeneous Treatment Effects and Causal Mechanisms

Jiawei Fu; Tara Slough

arXiv: 2404.01566 · v4 · submitted 2024-04-02 · econ.EM · stat.ME

Pith reviewed 2026-05-24 02:14 UTC · model grok-4.3

keywords: heterogeneous treatment effects, causal mechanisms, exclusion restrictions, identification, research design, average treatment effects, mechanism evaluation

The pith

Detecting heterogeneous treatment effects supports mechanism activation only under exclusion assumptions not guaranteed by standard causal designs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Researchers often look for heterogeneous treatment effects across covariates to determine which mechanisms produce observed causal effects. This paper shows that such detection cannot confirm an active mechanism without additional exclusion restrictions on how covariates relate to the mechanism. These restrictions are not automatically provided by designs that identify average treatment effects. When the restrictions hold, the presence of HTEs indicates an active mechanism, but their absence supplies no information either way. The resulting framework supplies rules for when HTE findings can be interpreted as mechanism evidence and how studies should be designed with this limit in mind.

Core claim

The dominant approach of using heterogeneous treatment effects to evaluate mechanisms cannot provide evidence of mechanism activation without additional, generally implicit, exclusion assumptions. Even when these assumptions are satisfied, the presence of HTEs supports the inference that a mechanism is active but the absence of HTEs is generally uninformative about mechanism activation.

What carries the argument

A framework that connects observed heterogeneous treatment effects to mechanism activation through explicit exclusion restrictions on pre-treatment covariates.
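
One way to make this connection concrete is the standard mediation decomposition of a conditional average treatment effect. The notation below is an assumption for illustration: a single mediator $M$, potential outcomes $Y_i(t, m)$, and the labels AIE and ADE for the average indirect and direct effects. The paper's own assumptions and definitions may be stated differently.

```latex
\begin{align}
\mathrm{CATE}(x)
  &= \mathbb{E}\left[ Y_i(1, M_i(1)) - Y_i(0, M_i(0)) \mid X_i = x \right] \\
  &= \underbrace{\mathbb{E}\left[ Y_i(1, M_i(1)) - Y_i(1, M_i(0)) \mid X_i = x \right]}_{\mathrm{AIE}(x)}
   + \underbrace{\mathbb{E}\left[ Y_i(1, M_i(0)) - Y_i(0, M_i(0)) \mid X_i = x \right]}_{\mathrm{ADE}(x)}, \\
\mathrm{CATE}(x) - \mathrm{CATE}(x')
  &= \left[ \mathrm{AIE}(x) - \mathrm{AIE}(x') \right]
   + \left[ \mathrm{ADE}(x) - \mathrm{ADE}(x') \right].
\end{align}
```

In this sketch, if an exclusion-type restriction guarantees $\mathrm{ADE}(x) = \mathrm{ADE}(x')$, a nonzero difference in CATEs implies $\mathrm{AIE}(x) \neq \mathrm{AIE}(x')$, so the indirect effect is nonzero for at least one covariate value. A zero difference is still compatible with $\mathrm{AIE}(x) = \mathrm{AIE}(x') \neq 0$. Without the restriction, the second bracket alone can generate heterogeneity.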

If this is right

Researchers must explicitly justify exclusion restrictions before claiming that HTEs test mechanisms.
Presence of HTEs can confirm an active mechanism once the restrictions are stated and defended.
Absence of HTEs cannot be interpreted as evidence that a mechanism is inactive.
Standard identification strategies for average effects leave mechanism claims under-identified without further assumptions.
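
The first and fourth points can be illustrated with a toy calculation. The sketch below is not taken from the paper: the linear model, the function names (`cate`, `hte`), and all numbers are hypothetical. It compares one scenario in which the mechanism is inactive but the covariate moderates the direct effect with another in which the direct effect is constant and the mechanism is active.

```python
# Toy potential-outcomes model (hypothetical, not the paper's):
#   M(t) = t_to_m(x) * t                      mediator under treatment t
#   Y(t) = direct(x) * t + m_to_y(x) * M(t)   outcome under treatment t


def cate(x, direct, t_to_m, m_to_y):
    """Treatment effect Y(1) - Y(0) at covariate value x in the toy model."""
    def outcome(t):
        mediator = t_to_m(x) * t
        return direct(x) * t + m_to_y(x) * mediator
    return outcome(1) - outcome(0)


def hte(direct, t_to_m, m_to_y):
    """Difference in CATEs between covariate values x = 1 and x = 0."""
    return cate(1, direct, t_to_m, m_to_y) - cate(0, direct, t_to_m, m_to_y)


# Scenario A: treatment never moves the mediator (mechanism inactive),
# but the covariate moderates the direct effect (exclusion violated).
hte_mechanism_inactive = hte(
    direct=lambda x: 1.0 + 0.5 * x,
    t_to_m=lambda x: 0.0,
    m_to_y=lambda x: 2.0,
)

# Scenario B: the direct effect is the same for every x (exclusion holds),
# and the covariate moderates how strongly treatment moves the mediator.
hte_mechanism_active = hte(
    direct=lambda x: 1.0,
    t_to_m=lambda x: 1.0 * x,
    m_to_y=lambda x: 2.0,
)

print(hte_mechanism_inactive, hte_mechanism_active)
```

Both scenarios produce a nonzero difference in CATEs, so in this toy setting the heterogeneity by itself cannot distinguish an active mechanism from a moderated direct effect. Only the added restriction on the direct effect separates the two.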

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Existing studies that treat HTE patterns as direct mechanism tests may need to re-evaluate their conclusions under this stricter standard.
New designs could focus on verifying the exclusion restrictions themselves rather than relying on HTE detection alone.
Complementary methods such as direct manipulation of the mechanism or structured mediation tests become more necessary to isolate mechanisms.

Load-bearing premise

The connection between observed heterogeneous treatment effects and mechanism activation depends on exclusion restrictions that standard designs for average treatment effects do not supply.

What would settle it

An empirical demonstration that heterogeneous treatment effects indicate mechanism activation in a design where the required exclusion restrictions are known to be violated would falsify the necessity of those restrictions.

Figures

Figures reproduced from arXiv: 2404.01566 by Jiawei Fu, Tara Slough.

**Figure 1.** Assumption 1 rules out the blue dashed path. Assumption 2 rules out both of the red dot-dashed paths. All black solid paths are permissible under Assumptions 1 and 2. […] Rather, the difference in CATEs identifies a difference in conditional AIEs. Thus, identification of this difference is not sufficient to identify indirect effects, as is the goal in (standard) mediation analysis. However, it is stra…

**Figure 2.** The two panels depict the causal structure of two MDVs for mechanism …

**Figure 3.** Four theoretical accounts of how partisan alignment (or bias) and information relate to …

Original abstract

The credibility revolution advances the use of research designs that permit identification and estimation of causal effects. However, understanding which mechanisms produce measured causal effects remains a challenge. The dominant current approach to the quantitative evaluation of mechanisms relies on the detection of heterogeneous treatment effects (HTEs) with respect to pre-treatment covariates. This paper develops a framework to understand when the existence of such heterogeneous treatment effects can support inferences about the activation of a mechanism. We show first that this design cannot provide evidence of mechanism activation without additional, generally implicit, exclusion assumptions. Further, even when these assumptions are satisfied, the presence of HTEs supports the inference that mechanism is active but the absence of HTEs is generally uninformative about mechanism activation. We provide novel guidance for interpretation and research design in light of these findings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows that positive HTEs can support mechanism claims only with extra exclusion restrictions, while null HTEs are generally uninformative even when those hold.

The main thing to know is that this paper separates the informational content of positive versus null heterogeneous treatment effects when people try to use them as evidence for mechanisms. Positive HTEs can point to an active mechanism, but only after you add exclusion restrictions that keep other channels from producing the same pattern. A null result on HTEs stays uninformative about whether the mechanism is operating or not. That distinction is the new piece, and it follows directly from the potential-outcomes setup without fitting anything to data.

The paper does a clean job laying out why the usual shortcut fails and why the two cases are not symmetric. It gives practical guidance on design and interpretation that matches what applied people actually do. The argument is proportionate: it flags the implicit assumptions rather than claiming the whole literature is broken.

One soft spot is that the work stays at the level of interpretive clarification. It does not supply new estimators, bounds, or ways to make the exclusion restrictions testable, so readers still have to judge those restrictions case by case. The logic itself looks solid and does not rely on circular fitting or invented quantities.

This is for applied researchers who already use HTEs to talk about mechanisms in their papers. Anyone doing that will get a clearer sense of what their results can and cannot say. It deserves a serious referee because the clarification touches a common design choice across many empirical papers and the reasoning holds up on its own terms. I would send it to review.

Referee Report

2 major / 2 minor

Summary. The paper claims that the dominant approach of using heterogeneous treatment effects (HTEs) with respect to pre-treatment covariates to evaluate causal mechanisms cannot support inferences about mechanism activation without additional, generally implicit exclusion assumptions. Even when those assumptions hold, the presence of HTEs supports mechanism activation but their absence is generally uninformative. The paper develops a potential-outcomes framework to formalize these limits and offers guidance for interpretation and research design.

Significance. If the central logical argument holds, the result is significant for empirical work in economics and related fields that relies on HTE detection to probe mechanisms. It clarifies the gap between standard ATE identification designs and mechanism-specific identification, highlighting the need for explicit exclusion restrictions. The paper's strength lies in its direct derivation from the potential-outcomes framework rather than data-dependent claims.

major comments (2)

[§3] Proposition 1: the formal statement that HTEs imply mechanism activation only under an exclusion restriction (that the covariate affects the outcome only through the mechanism) should be accompanied by an explicit statement of when this restriction is automatically satisfied by standard ATE designs versus when it requires separate justification.
[§4.2] The argument that absence of HTEs remains uninformative even under the maintained exclusion restriction: the proof sketch relies on the possibility of offsetting effects across subgroups; a concrete numerical counterexample or additional assumption under which absence would be informative would strengthen the claim.

minor comments (2)

The abstract states the main results clearly but does not reference the specific propositions or theorems that establish them; adding such pointers would improve readability.
[§2] Notation for the exclusion restriction (e.g., the definition of the mechanism-specific potential outcome) is introduced in §2 but used without re-statement in later sections; a brief reminder or table of notation would aid readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the insightful comments. We address each major comment below.

Referee: §3, Proposition 1: the formal statement that HTEs imply mechanism activation only under an exclusion restriction (that the covariate affects the outcome only through the mechanism) should be accompanied by an explicit statement of when this restriction is automatically satisfied by standard ATE designs versus when it requires separate justification.

Authors: We agree that adding an explicit statement distinguishing cases where the exclusion restriction holds automatically under standard ATE designs from those requiring separate justification will clarify the result. We will revise the text following Proposition 1 to include this discussion. Revision: yes.

Referee: §4.2, the argument that absence of HTEs remains uninformative even under the maintained exclusion restriction: the proof sketch relies on the possibility of offsetting effects across subgroups; a concrete numerical counterexample or additional assumption under which absence would be informative would strengthen the claim.

Authors: We agree that a concrete numerical counterexample would strengthen the exposition of this point. We will add such an example in §4.2 to illustrate offsetting effects across subgroups while preserving the existing proof sketch. Revision: yes.
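
The offsetting pattern discussed in this exchange can be illustrated with a small calculation. This is not the example the authors refer to. It is a hypothetical sketch with invented numbers, and it adopts one possible reading of "offsetting": the covariate weakens the effect of treatment on the mediator while strengthening the effect of the mediator on the outcome, so the indirect effect is equal in both subgroups.

```python
# Hypothetical numbers for two covariate subgroups (not from the paper).
# t_to_m: effect of treatment on the mediator
# m_to_y: effect of the mediator on the outcome
subgroups = {
    "X=0": {"t_to_m": 2.0, "m_to_y": 1.0},
    "X=1": {"t_to_m": 1.0, "m_to_y": 2.0},
}

# Direct effect held equal across subgroups, mimicking an exclusion restriction.
DIRECT_EFFECT = 0.5


def indirect_effect(t_to_m, m_to_y):
    """Indirect effect through the mediator in a linear toy model."""
    return t_to_m * m_to_y


# CATE in each subgroup = direct effect + indirect effect.
cates = {
    name: DIRECT_EFFECT + indirect_effect(**params)
    for name, params in subgroups.items()
}

# Observed heterogeneity: difference in CATEs across the covariate.
hte = cates["X=1"] - cates["X=0"]

# The mechanism is active wherever the indirect effect is nonzero.
mechanism_active = all(
    indirect_effect(**params) != 0 for params in subgroups.values()
)

print(cates, hte, mechanism_active)
```

In this toy case the mechanism is active in both subgroups, yet the two CATEs coincide and the measured heterogeneity is zero. A null HTE therefore cannot be read as evidence that the mechanism is inactive, even with the direct effect held fixed.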

Circularity Check

0 steps flagged

No significant circularity

The paper presents a logical argument within the potential-outcomes framework showing that HTE detection for mechanisms requires additional exclusion restrictions not implied by standard ATE identification. This is a direct statement about identification gaps rather than any derivation that reduces to fitted parameters, self-citations, or renamed inputs. No equations or steps in the provided abstract or reader's summary exhibit self-definitional, fitted-prediction, or load-bearing self-citation patterns. The central claim is self-contained as a clarification of existing identification limits.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper builds on the standard identification assumptions of the credibility revolution in causal inference and introduces the need for additional exclusion assumptions that are not part of those standard assumptions.

axioms (1)

Domain assumption: standard causal identification assumptions for treatment effects under research designs that permit identification.
The paper positions its contribution against the credibility revolution designs that already identify average effects.
