Robust Estimation and Inference with Selective Borrowing in Hybrid Controlled Trials: A Tutorial with SelectiveIntegrative and intFRT
Pith reviewed 2026-07-02 08:15 UTC · model grok-4.3
The pith
Hybrid controlled trials can borrow strength from external controls through selective strategies after eligibility alignment and matching while preserving valid inference.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper presents a structured workflow that uses eligibility alignment, matching, and selective borrowing to incorporate external controls into hybrid controlled trials in a manner that supports both efficient point estimation and valid inference via asymptotic or randomization-based procedures, as implemented in the accompanying R packages.
What carries the argument
A selective borrowing strategy, applied after eligibility alignment and matching, that decides whether and how much to incorporate external controls so as to reduce bias from covariate shift or outcome drift.
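To make the idea of a borrowing rule concrete, here is a minimal sketch of one way such a rule could look. It is an assumption for illustration only, not the algorithm or API of SelectiveIntegrative or intFRT: each external control (EC) gets a rank-based p-value measuring how typical its outcome is among the RCT controls, and only ECs with a p-value above a threshold are borrowed. The function names, the threshold `gamma`, the absolute-deviation score, and the simulated data are all hypothetical, and covariates are ignored for brevity.

```python
import random
import statistics


def conformal_p_value(ec_outcome, rct_controls):
    """Rank-based p-value: how typical is one EC outcome among RCT controls?"""
    center = statistics.mean(rct_controls)
    scores = [abs(y - center) for y in rct_controls]  # nonconformity scores
    ec_score = abs(ec_outcome - center)
    n_at_least = sum(s >= ec_score for s in scores)
    return (n_at_least + 1) / (len(scores) + 1)


def select_external_controls(ec_outcomes, rct_controls, gamma=0.2):
    """Return indices of ECs whose p-value exceeds the (hypothetical) threshold gamma."""
    return [i for i, y in enumerate(ec_outcomes)
            if conformal_p_value(y, rct_controls) > gamma]


# Simulated toy data (assumption): half of the ECs are comparable, half drifted.
rng = random.Random(2024)
rct_controls = [rng.gauss(0.0, 1.0) for _ in range(50)]
comparable_ecs = [rng.gauss(0.0, 1.0) for _ in range(100)]  # no outcome drift
drifted_ecs = [rng.gauss(3.0, 1.0) for _ in range(100)]     # outcome drift of 3
ec_outcomes = comparable_ecs + drifted_ecs

kept = select_external_controls(ec_outcomes, rct_controls)
kept_comparable = sum(i < 100 for i in kept)    # indices 0..99 are comparable ECs
kept_drifted = sum(i >= 100 for i in kept)      # indices 100..199 are drifted ECs
print("kept comparable:", kept_comparable, "kept drifted:", kept_drifted)
```

A larger `gamma` borrows fewer ECs and leans toward the RCT-only analysis, while a smaller one borrows more and relies more heavily on comparability.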
If this is right
- Analyses gain statistical efficiency compared with randomized data alone when selective borrowing is applied appropriately.
- Randomization tests supply valid inference even after borrowing decisions are made.
- The provided R packages support step-by-step, reproducible execution of alignment, matching, and borrowing.
- Reporting the alignment and borrowing choices increases transparency of the final estimates.
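The point about randomization tests can be sketched as follows. This is a simplified illustration under stated assumptions, not the intFRT implementation: it assumes complete randomization within the RCT, tests the sharp null of no treatment effect for any RCT participant, and uses a hypothetical borrowing rule (keep ECs within `k` standard deviations of the RCT control mean). The key design choice is that the borrowing decision is recomputed inside every re-randomization, so the reference distribution reflects the whole procedure, including the selection step.

```python
import random
import statistics


def borrowing_statistic(y_rct, z_rct, y_ec, k=2.0):
    """Difference in means using RCT controls plus ECs selected by a simple rule."""
    treated = [y for y, z in zip(y_rct, z_rct) if z == 1]
    controls = [y for y, z in zip(y_rct, z_rct) if z == 0]
    center = statistics.mean(controls)
    spread = statistics.stdev(controls)
    # Hypothetical rule: borrow ECs within k standard deviations of the control mean.
    borrowed = [y for y in y_ec if abs(y - center) <= k * spread]
    return statistics.mean(treated) - statistics.mean(controls + borrowed)


def randomization_p_value(y_rct, z_rct, y_ec, n_draws=2000, seed=0):
    """Two-sided Monte Carlo randomization p-value under the sharp null."""
    rng = random.Random(seed)
    observed = abs(borrowing_statistic(y_rct, z_rct, y_ec))
    z_perm = list(z_rct)
    n_extreme = 0
    for _ in range(n_draws):
        rng.shuffle(z_perm)  # re-randomize RCT labels only; ECs always stay controls
        if abs(borrowing_statistic(y_rct, z_perm, y_ec)) >= observed:
            n_extreme += 1
    return (n_extreme + 1) / (n_draws + 1)


# Simulated toy data (assumption): treatment effect of 1.5, no drift in the ECs.
rng = random.Random(7)
n_arm = 40
y_rct = ([rng.gauss(1.5, 1.0) for _ in range(n_arm)]
         + [rng.gauss(0.0, 1.0) for _ in range(n_arm)])
z_rct = [1] * n_arm + [0] * n_arm
y_ec = [rng.gauss(0.0, 1.0) for _ in range(60)]

p_value = randomization_p_value(y_rct, z_rct, y_ec)
print("randomization p-value:", p_value)
```

Because only the RCT assignment is re-drawn, the test's validity rests on the randomization itself rather than on the external controls being comparable; the ECs influence power, not the null reference distribution's validity.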
Where Pith is reading between the lines
- The same alignment-plus-selective-borrowing steps could be adapted to other settings with external data, such as rare-disease trials.
- Pre-specifying the borrowing rule in a protocol would strengthen the case for regulatory use of the resulting estimates.
- Performance under varying degrees of data quality in the external source could be checked in future applications beyond the lung-cancer illustration.
Load-bearing premise
Identification assumptions continue to hold after eligibility alignment and matching, with no undetected outcome drift remaining in the borrowed external controls.
What would settle it
A simulation or real-data analysis in which selective borrowing still produces coverage rates below the nominal level or biased point estimates when outcome drift is present would show the workflow fails to deliver valid inference.
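A check of this kind can be sketched as a small Monte Carlo harness. The sketch below is an illustration under assumptions, not a result from the paper: outcomes are simulated as normal, intervals are Wald intervals with a 1.96 multiplier, and the two estimators shown (RCT-only and full borrowing) are simple stand-ins. A selective borrowing estimator could be passed to the same hypothetical `coverage` function to see whether its coverage stays near nominal when drift is present.

```python
import math
import random
import statistics


def rct_only(treated, controls, ecs):
    """Difference in means that ignores external controls."""
    est = statistics.mean(treated) - statistics.mean(controls)
    se = math.sqrt(statistics.variance(treated) / len(treated)
                   + statistics.variance(controls) / len(controls))
    return est, se


def full_borrowing(treated, controls, ecs):
    """Difference in means that pools all external controls with RCT controls."""
    pooled = controls + ecs
    est = statistics.mean(treated) - statistics.mean(pooled)
    se = math.sqrt(statistics.variance(treated) / len(treated)
                   + statistics.variance(pooled) / len(pooled))
    return est, se


def coverage(estimator, drift, true_effect=1.0, n_arm=50, n_ec=200,
             n_sims=500, seed=1):
    """Share of simulated 95% Wald intervals that contain the true effect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_arm)]
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_arm)]
        ecs = [rng.gauss(drift, 1.0) for _ in range(n_ec)]  # drift shifts EC outcomes
        est, se = estimator(treated, controls, ecs)
        if abs(est - true_effect) <= 1.96 * se:
            hits += 1
    return hits / n_sims


cov_rct_drift = coverage(rct_only, drift=1.0)
cov_full_drift = coverage(full_borrowing, drift=1.0)
cov_full_nodrift = coverage(full_borrowing, drift=0.0)
print("RCT-only, drift=1:", cov_rct_drift)
print("full borrowing, drift=1:", cov_full_drift)
print("full borrowing, drift=0:", cov_full_nodrift)
```

In this setup, pooling drifted controls biases the control mean, so the comparison of interest is whether a given borrowing rule behaves more like the RCT-only analysis or more like full borrowing as drift grows.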
Original abstract
Hybrid controlled trials (HCTs) augment randomized controlled trials (RCTs) with external controls (ECs) to improve statistical efficiency when RCTs face limited sample sizes, slow accrual, or ethical constraints. However, valid use of ECs requires careful adjustment for covariate shift and outcome drift, as inappropriate borrowing may introduce bias and compromise inference. This tutorial provides a practical workflow for estimation and inference in HCTs. We first present a statistical analysis roadmap covering estimands, identification assumptions, eligibility alignment, matching, full and selective borrowing strategies, and both asymptotic inference and randomization tests. We then demonstrate step-by-step implementation using the SelectiveIntegrative and intFRT packages. The workflow is illustrated using a synthetic lung cancer dataset included in the intFRT package that mimics the CALGB 9633 trial and ECs from the National Cancer Database. The tutorial aims to help applied statisticians conduct transparent, interpretable, and reproducible HCT analyses that improve efficiency while maintaining valid inference.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a tutorial on robust estimation and inference for hybrid controlled trials (HCTs) that augment RCTs with external controls. It presents a statistical analysis roadmap covering estimands, identification assumptions, eligibility alignment, matching, full and selective borrowing strategies, asymptotic inference, and randomization tests, then demonstrates step-by-step implementation via the SelectiveIntegrative and intFRT R packages on a synthetic lung cancer dataset mimicking the CALGB 9633 trial and National Cancer Database controls.
Significance. If the workflow correctly implements the stated assumptions and packages, the tutorial provides applied statisticians with a transparent, reproducible framework for HCT analyses that can improve efficiency in small-sample or ethically constrained settings while preserving valid inference.
major comments (1)
- [Identification assumptions and eligibility alignment] The section on identification assumptions and eligibility alignment: the claim that selective borrowing maintains valid inference after matching rests on the assumption that outcome drift is fully addressed by alignment; without explicit sensitivity analyses or simulation results quantifying residual bias under plausible violations, the robustness of the workflow for the central estimands is not fully demonstrated.
minor comments (2)
- The description of the synthetic dataset could specify the exact covariate distributions and outcome drift parameters used to mimic the real trial, to allow readers to evaluate how closely the demonstration matches realistic HCT scenarios.
- Package installation and dependency instructions are not detailed in the provided roadmap; adding a brief reproducibility checklist would improve usability for applied readers.
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and constructive feedback on our tutorial manuscript. We address the major comment in detail below.
Referee: [Identification assumptions and eligibility alignment] The section on identification assumptions and eligibility alignment: the claim that selective borrowing maintains valid inference after matching rests on the assumption that outcome drift is fully addressed by alignment; without explicit sensitivity analyses or simulation results quantifying residual bias under plausible violations, the robustness of the workflow for the central estimands is not fully demonstrated.
Authors: The referee correctly identifies a limitation in the current manuscript. While the tutorial clearly states the identification assumptions and explains how eligibility alignment and selective borrowing are intended to address outcome drift, it does not include dedicated sensitivity analyses or simulation studies to quantify potential residual bias when these assumptions are violated to varying degrees. We agree that such analyses would better demonstrate the robustness of the proposed workflow. Accordingly, we will make a partial revision by adding a brief discussion and example of sensitivity analysis in the revised manuscript, using the synthetic lung cancer data to illustrate how users might explore the impact of residual outcome drift. This addition will not alter the core roadmap or package demonstrations but will enhance the tutorial's guidance on practical application.
Revision: partial
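One simple form such a sensitivity analysis could take is a drift grid: assume the borrowed external controls carry a residual outcome drift of size delta, remove it, and recompute the estimate across a range of delta values. The sketch below is a hypothetical illustration, not the analysis the authors describe; the function name, the grid, and the toy numbers are made up for the example and are not drawn from the synthetic lung cancer dataset.

```python
import statistics


def borrowed_estimate(treated, rct_controls, borrowed_ecs, delta=0.0):
    """Difference in means after removing a hypothesized drift delta from the ECs."""
    adjusted_ecs = [y - delta for y in borrowed_ecs]
    return statistics.mean(treated) - statistics.mean(rct_controls + adjusted_ecs)


# Made-up toy outcomes (assumption), chosen only to show the mechanics.
treated = [1.2, 0.8, 1.9, 1.4, 0.7, 1.6]
rct_controls = [0.1, -0.3, 0.5, 0.2]
borrowed_ecs = [0.4, 0.0, 0.6, -0.2, 0.3, 0.5, 0.1, 0.2]

deltas = [-0.5, -0.25, 0.0, 0.25, 0.5]
estimates = [borrowed_estimate(treated, rct_controls, borrowed_ecs, d)
             for d in deltas]
for d, est in zip(deltas, estimates):
    print(f"delta={d:+.2f}  estimate={est:.3f}")
```

For a pooled difference in means, the estimate moves by delta times the share of borrowed ECs among all controls, so a larger borrowed fraction makes the conclusion more sensitive to unaddressed drift.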
Circularity Check
No significant circularity; tutorial on existing workflow
The paper is a methods tutorial demonstrating implementation of HCT analysis using the SelectiveIntegrative and intFRT packages on synthetic data. It outlines a roadmap of estimands, assumptions, alignment, borrowing strategies, and inference without new derivations, fitted parameters presented as predictions, or load-bearing self-citations. No equations or self-referential reductions appear; the content is guidance under stated assumptions and is self-contained.