Generalizing conditional average treatment effects from nested randomized trials to all trial-eligible individuals
Pith reviewed 2026-06-30 20:12 UTC · model grok-4.3
The pith
Conditional average treatment effects in a target population can be estimated from nested trials by constructing pseudo-outcomes from conditional influence functions followed by local linear regression.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In a nested trial design, the conditional average treatment effect in the full target population of trial-eligible individuals is identified as a function of prespecified effect modifiers. It is estimated in three steps: fitting nuisance functions with flexible, data-adaptive methods, building pseudo-outcomes from the conditional influence functions, and applying local linear kernel regression to those pseudo-outcomes. Sample splitting and cross-fitting are used to support asymptotically valid inference.
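To make the pseudo-outcome idea concrete, one commonly used augmented inverse-probability-weighted form is sketched below. The symbols are assumptions introduced for illustration and are not taken from the paper: X denotes baseline covariates, V the prespecified effect modifiers (a subset of X), S trial participation, A randomized treatment, Y the outcome, and O the observed data. The paper's own conditional influence function may differ in detail.

```latex
\begin{align*}
g_a(X) &= \mathbb{E}[Y \mid X, S = 1, A = a], \qquad
p(X) = \Pr(S = 1 \mid X), \qquad
e_a(X) = \Pr(A = a \mid X, S = 1), \\
\varphi(O) &= g_1(X) - g_0(X)
  + \frac{S}{p(X)} \left[ \frac{A\,\{Y - g_1(X)\}}{e_1(X)}
  - \frac{(1 - A)\,\{Y - g_0(X)\}}{e_0(X)} \right], \\
\tau(v) &= \mathbb{E}[\varphi(O) \mid V = v].
\end{align*}
```

With the true nuisance functions, the weighted residual term has conditional mean zero given X, so regressing the pseudo-outcome on V recovers the conditional mean of g_1(X) - g_0(X). That quantity equals the target-population CATE only when the identification conditions hold.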
What carries the argument
Pseudo-outcomes constructed from conditional influence functions, which are then fed into local linear kernel regression to recover the target-population CATE function.
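The regression step can be illustrated with a minimal sketch of local linear kernel regression for a single scalar effect modifier. The function names, the Gaussian kernel, and the toy data are assumptions chosen for illustration; the paper's kernel and software are not stated in this summary.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel weight."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_linear_fit(v, phi, v0, bandwidth):
    """Local linear estimate of E[phi | V = v0] for a scalar modifier V.

    Solves the kernel-weighted least-squares problem in closed form and
    returns the fitted intercept at v0.
    """
    if bandwidth <= 0.0:
        raise ValueError("bandwidth must be positive")
    s0 = s1 = s2 = t0 = t1 = 0.0
    for vi, yi in zip(v, phi):
        d = vi - v0
        w = gaussian_kernel(d / bandwidth)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    denom = s0 * s2 - s1 * s1
    if denom <= 0.0:
        raise ValueError("too few distinct points near v0 for this bandwidth")
    return (s2 * t0 - s1 * t1) / denom


# Toy pseudo-outcomes that are exactly linear in the modifier (hypothetical data).
v = [0.0, 0.5, 1.0, 1.5, 2.0]
phi = [2.0 + 3.0 * x for x in v]
estimate = local_linear_fit(v, phi, v0=0.8, bandwidth=0.5)
print(estimate)
```

Because a local linear fit reproduces any exactly linear relationship, the estimate at 0.8 equals 2 + 3 * 0.8 = 4.4 up to rounding. This is a convenient sanity check before applying the smoother to noisy pseudo-outcomes.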
If this is right
- The CATE function can be estimated consistently for the full target population under the nested trial identification conditions.
- Cross-fitting produces asymptotically valid inference for the estimated CATE surface.
- The estimator exhibits reliable finite-sample behavior in simulations.
- Application to the CASS study yields estimates of treatment effect heterogeneity in the broader eligible population.
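The cross-fitting mentioned in the list above can be sketched as follows, assuming NumPy is available. Everything here is a simplified stand-in rather than the paper's procedure: the learners are plain least-squares fits instead of data-adaptive methods, the pseudo-outcome is a generic augmented inverse-probability-weighted form, and the simulated data, function names, and clipping constant are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with intercept; returns a prediction function."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ beta


def cross_fit_pseudo_outcomes(X, S, A, Y, n_folds=2, seed=0, clip=0.01):
    """Pseudo-outcomes whose nuisance fits never use the row being scored."""
    n = len(S)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    phi = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        trial = train & (S == 1)
        # Stand-in learners (assumptions): linear probability model for
        # participation, sample proportion for the randomization probability,
        # and arm-specific linear outcome regressions fit on trial rows only.
        p_hat = np.clip(fit_linear(X[train], S[train])(X[test]), clip, 1.0)
        e1_hat = A[trial].mean()
        g1 = fit_linear(X[trial & (A == 1)], Y[trial & (A == 1)])(X[test])
        g0 = fit_linear(X[trial & (A == 0)], Y[trial & (A == 0)])(X[test])
        s, a, y = S[test], A[test], Y[test]
        resid = a * (y - g1) / e1_hat - (1 - a) * (y - g0) / (1 - e1_hat)
        # Outcomes of non-participants are never used (they may be missing).
        phi[test] = g1 - g0 + np.where(s == 1, resid / p_hat, 0.0)
    return phi


# Hypothetical simulated nested trial: true effect is 2 + X, so the
# target-population average effect is 2.5 when X is uniform on [0, 1].
rng = np.random.default_rng(1)
n = 4000
X = rng.uniform(0.0, 1.0, n)
S = rng.binomial(1, 0.3 + 0.4 * X)
A = np.where(S == 1, rng.binomial(1, 0.5, n), 0)
Y = np.where(S == 1, 1.0 + X + A * (2.0 + X) + rng.normal(0.0, 1.0, n), np.nan)
phi = cross_fit_pseudo_outcomes(X, S, A, Y)
print(phi.shape, float(phi.mean()))
```

The resulting `phi` would then be smoothed against the effect modifiers. Splitting the sample this way keeps each pseudo-outcome independent of the nuisance fits used to build it, which is what the asymptotic argument relies on.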
Where Pith is reading between the lines
- The same pseudo-outcome construction could be tested in linked observational cohorts that resemble nested trials.
- Results depend on the effect modifiers being chosen before seeing the data.
Load-bearing premise
The nested trial design and prespecified effect modifiers suffice to identify the target-population CATE via the conditional influence functions.
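For orientation, the conditions usually invoked for this kind of identification in the nested-trial generalizability literature are sketched below. The notation is introduced here for illustration (X baseline covariates, V the effect modifiers, S trial participation, A treatment, Y^a the potential outcome under treatment a), and whether the paper adopts exactly this list is an assumption.

```latex
\begin{align*}
\text{(A1) Consistency:}\quad
  & Y = Y^{a} \text{ if } A = a, \\
\text{(A2) Exchangeability over treatment in the trial:}\quad
  & \mathbb{E}[Y^{a} \mid X, S = 1, A = a] = \mathbb{E}[Y^{a} \mid X, S = 1], \\
\text{(A3) Positivity of treatment in the trial:}\quad
  & \Pr(A = a \mid X = x, S = 1) > 0, \\
\text{(A4) Exchangeability over participation:}\quad
  & \mathbb{E}[Y^{a} \mid X, S = 1] = \mathbb{E}[Y^{a} \mid X], \\
\text{(A5) Positivity of participation:}\quad
  & \Pr(S = 1 \mid X = x) > 0. \\[4pt]
\text{Under (A1) to (A5):}\quad
  & \mathbb{E}[Y^{1} - Y^{0} \mid V = v]
    = \mathbb{E}\big[\, \mathbb{E}[Y \mid X, S = 1, A = 1]
      - \mathbb{E}[Y \mid X, S = 1, A = 0] \,\big|\, V = v \big].
\end{align*}
```

Randomization typically delivers (A2) and (A3) by design, so the premise rests mainly on (A4) and (A5), which concern who enters the trial.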
What would settle it
A nested trial dataset in which the pseudo-outcomes constructed from conditional influence functions produce CATE estimates that deviate from the true target-population values when positivity or consistency fails would falsify the identification step.
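One practical way to see whether positivity of trial participation is under strain is to inspect the estimated participation probabilities before building pseudo-outcomes. The sketch below is a hypothetical diagnostic, not part of the paper; the function name and the 0.05 threshold are assumptions chosen only for illustration.

```python
def positivity_summary(p_hat, threshold=0.05):
    """Summarize estimated participation probabilities P(S = 1 | X).

    Small probabilities translate into large inverse-probability weights,
    which is where practical positivity problems tend to show up.
    """
    if not p_hat:
        raise ValueError("p_hat must be non-empty")
    if any(p <= 0.0 or p > 1.0 for p in p_hat):
        raise ValueError("probabilities must lie in (0, 1]")
    n_below = sum(1 for p in p_hat if p < threshold)
    return {
        "min_p": min(p_hat),
        "share_below": n_below / len(p_hat),
        "max_weight": 1.0 / min(p_hat),
    }


# Hypothetical estimated probabilities for four individuals.
summary = positivity_summary([0.5, 0.25, 0.02, 0.8], threshold=0.05)
print(summary)
```

A large maximum weight or a sizeable share of values below the threshold suggests that estimates in the corresponding covariate regions lean heavily on extrapolation.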
Original abstract
Randomized controlled trials often enroll participants whose characteristics differ from those of a target population, which can limit the generalizability of the estimated treatment effects when effect modifiers differ across populations. While existing generalizability methods primarily focus on estimating the average treatment effect (ATE) in the target population, such summaries may obscure important heterogeneity that is relevant for clinical and policy decision-making. In this work, we illustrate an approach for estimating the conditional average treatment effect (CATE) in a target population of trial-eligible individuals as a function of prespecified effect modifiers within a nested trial setting. Our approach combines semiparametric theory with flexible estimation: we first estimate nuisance functions using data-adaptive methods and construct pseudo-outcomes from conditional influence functions, then estimate the CATE function via local linear (kernel) regression. Sample splitting and cross-fitting are used to reduce overfitting bias and ensure asymptotically valid inference. Finite-sample performance is assessed via simulations and illustrated in the Coronary Artery Surgery Study (CASS).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a semiparametric procedure to estimate the conditional average treatment effect (CATE) as a function of prespecified effect modifiers in the full population of trial-eligible individuals, using data from a nested randomized trial. Nuisance functions are estimated with data-adaptive methods, pseudo-outcomes are formed from conditional influence functions, and the CATE is then fit by local linear kernel regression with cross-fitting for inference. Finite-sample behavior is examined in simulations and illustrated on the CASS data.
Significance. If the identification and estimation steps hold, the work supplies a practical route from nested-trial data to heterogeneous-effect estimates in the broader target population, moving beyond ATE-only generalizability results. The combination of influence-function pseudo-outcomes with flexible regression and the inclusion of a real-data example are constructive features.
major comments (2)
- [Abstract and Methods] The central claim that conditional influence functions yield unbiased pseudo-outcomes for the target-population CATE rests on positivity of trial participation given the effect modifiers and covariates, yet the manuscript provides no explicit statement or derivation of this condition (or of consistency under nested sampling). Without these, the subsequent local-linear step inherits bias even under correct nuisance estimation.
- [Identification/Estimation procedure] No section derives the precise identifying assumptions or shows that the constructed pseudo-outcomes are unbiased for the target CATE under the nested design; the description invokes standard semiparametric theory without verifying the transport step for the specific setting.
minor comments (1)
- [Methods] Notation for the conditional influence functions and the kernel regression bandwidth choice could be clarified with explicit definitions or a small worked example.
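As an illustration of the kind of small worked example the referee asks for, the sketch below selects a bandwidth by leave-one-out cross-validation over a candidate grid. This is one common selector, not necessarily the one used in the paper; the function names, the Gaussian kernel, the grid, and the toy data are all assumptions.

```python
import math


def local_linear(v, y, v0, h):
    """Local linear fit at v0 with a Gaussian kernel and bandwidth h."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for vi, yi in zip(v, y):
        d = vi - v0
        w = math.exp(-0.5 * (d / h) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


def loo_cv_score(v, y, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    total = 0.0
    for i in range(len(v)):
        v_rest = v[:i] + v[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        total += (y[i] - local_linear(v_rest, y_rest, v[i], h)) ** 2
    return total / len(v)


def select_bandwidth(v, y, grid):
    """Return the candidate bandwidth with the smallest leave-one-out error."""
    return min(grid, key=lambda h: loo_cv_score(v, y, h))


# Hypothetical pseudo-outcomes: a smooth curve plus a small alternating perturbation.
v = [i / 20 for i in range(21)]
y = [math.sin(3.0 * vi) + 0.1 * (-1) ** i for i, vi in enumerate(v)]
grid = [0.1, 0.2, 0.4]
h_star = select_bandwidth(v, y, grid)
print(h_star, loo_cv_score(v, y, h_star))
```

A bandwidth chosen to minimize prediction error is generally not the one best suited to confidence-interval coverage, so inference often pairs such a selector with undersmoothing or a bias correction.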
Simulated Author's Rebuttal
We thank the referee for the constructive comments. The two major points both concern the need for explicit derivation of identifying assumptions (positivity and consistency under nested sampling) and verification that the conditional influence function pseudo-outcomes are unbiased for the target-population CATE. We agree these elements should be stated more clearly and will add them in revision.
Point-by-point responses
- Referee: [Abstract and Methods] The central claim that conditional influence functions yield unbiased pseudo-outcomes for the target-population CATE rests on positivity of trial participation given the effect modifiers and covariates, yet the manuscript provides no explicit statement or derivation of this condition (or of consistency under nested sampling). Without these, the subsequent local-linear step inherits bias even under correct nuisance estimation.
  Authors: We agree the positivity condition (trial participation given effect modifiers and covariates) and the consistency result under nested sampling are not stated explicitly. In the revised manuscript we will insert a new subsection (likely 3.2) that lists all identifying assumptions, derives the unbiasedness of the conditional influence function pseudo-outcomes for the target CATE, and sketches the consistency argument under the nested design. This will also be referenced in the abstract.
  Revision: yes
- Referee: [Identification/Estimation procedure] No section derives the precise identifying assumptions or shows that the constructed pseudo-outcomes are unbiased for the target CATE under the nested design; the description invokes standard semiparametric theory without verifying the transport step for the specific setting.
  Authors: The manuscript relies on standard semiparametric results without spelling out the transport step for the nested-trial setting. We will add an explicit derivation (new subsection 3.2) that starts from the nested design, states the required positivity and consistency assumptions, and shows that the pseudo-outcomes are unbiased for the target-population CATE. Cross-fitting and local-linear regression steps will then be justified under these assumptions.
  Revision: yes
Circularity Check
No circularity: derivation relies on external semiparametric theory
The paper's approach constructs pseudo-outcomes from conditional influence functions (standard semiparametric results) and then applies local linear regression; these steps invoke established identification results whose validity does not reduce to the paper's own fitted quantities or self-citations. No equations are shown that define the target CATE in terms of itself or rename a fit as a prediction. The method is self-contained against external benchmarks in semiparametric statistics.