Improving Treatment Effect Estimation in Trials through Adaptive Borrowing of External Controls
Pith reviewed 2026-05-10 12:54 UTC · model grok-4.3
The pith
An adaptive method borrows only the most compatible external controls to minimize mean squared error in average treatment effect estimates from small randomized trials.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The proposed adaptive influence-based sample borrowing framework quantifies the comparability of each external control sample via influence functions computed from the RCT data and selects the subset that minimizes the mean squared error of the average treatment effect estimator. The approach is assumption-lean with respect to the external control distribution, remains robust to outliers, and is strengthened by an outcome calibration procedure that improves data utilization efficiency.
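The selection target described in the core claim can be written compactly. The notation here is assumed for illustration only: $S$ is a candidate subset of external controls, $\hat\tau_S$ is the ATE estimator that borrows $S$, $\tau$ is the true ATE, and $\widehat{\mathrm{MSE}}$ is whatever estimate of the mean squared error the procedure uses.

```latex
\mathrm{MSE}(\hat\tau_S)
  = \bigl(\mathbb{E}[\hat\tau_S] - \tau\bigr)^2 + \mathrm{Var}(\hat\tau_S),
\qquad
\hat S = \arg\min_{S} \widehat{\mathrm{MSE}}(\hat\tau_S).
```

Borrowing more external controls typically shrinks the variance term while risking a larger squared-bias term, which is why an intermediate subset can in principle beat both no borrowing and full borrowing.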
What carries the argument
Adaptive influence-based sample borrowing, which computes influence functions on RCT data to rank external control samples by their effect on ATE mean squared error and retains the optimal subset.
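As a rough illustration of the rank-then-select idea, the Python sketch below scores each external control and keeps the prefix of the ranking with the smallest estimated MSE. It is not the paper's algorithm. It assumes a plain difference-in-means estimator with no covariates, uses the distance of each external outcome from the RCT control mean as a stand-in for an influence score (the influence function of a sample mean at `y` is `y - mu`), and uses a simple plug-in estimate of squared bias plus variance. The function name `select_external_controls` and all variable names are hypothetical.

```python
import numpy as np

def select_external_controls(y_treated, y_control, y_external):
    """Toy influence-style borrowing for a difference-in-means ATE.

    Returns (ate_estimate, chosen_indices, estimated_mse).
    """
    y_treated = np.asarray(y_treated, dtype=float)
    y_control = np.asarray(y_control, dtype=float)
    y_external = np.asarray(y_external, dtype=float)

    mu_c = y_control.mean()
    var_t = y_treated.var(ddof=1) / len(y_treated)  # variance of treated mean
    sigma2_c = y_control.var(ddof=1)                # RCT control outcome variance

    # Stand-in influence score: how strongly EC j would pull the control mean.
    scores = np.abs(y_external - mu_c)
    order = np.argsort(scores)  # most compatible external controls first

    # k = 0 is the no-borrowing baseline (zero estimated bias by construction).
    best_k, best_mse = 0, var_t + sigma2_c / len(y_control)
    for k in range(1, len(y_external) + 1):
        pooled = np.concatenate([y_control, y_external[order[:k]]])
        bias = pooled.mean() - mu_c  # plug-in bias of the pooled control mean
        mse = bias**2 + var_t + sigma2_c / len(pooled)
        if mse < best_mse:
            best_k, best_mse = k, mse

    chosen = order[:best_k]
    pooled = np.concatenate([y_control, y_external[chosen]])
    ate = y_treated.mean() - pooled.mean()
    return ate, chosen, best_mse
```

Calling `select_external_controls(y_t, y_c, y_e)` returns the ATE estimate, the indices of the retained external controls, and the estimated MSE at the chosen subset size. Restricting the search to prefixes of the ranking keeps the number of candidate subsets linear in the number of external controls rather than exponential.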
Load-bearing premise
Influence functions derived from the randomized trial data can reliably flag which external controls are comparable enough to lower overall mean squared error without adding bias.
What would settle it
In repeated simulations or real datasets with known treatment effects, the selected external subset produces higher mean squared error for the ATE estimator than either using no external controls or using all of them.
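The test described above maps onto a small Monte Carlo experiment. The sketch below is a minimal version under stated assumptions: normal outcomes with unit variance, a known true effect `tau`, external controls whose mean is shifted by `shift`, and a toy selection rule (`adaptive_control_mean`, a hypothetical stand-in, not the paper's procedure). It reports the empirical MSE of no borrowing, full borrowing, and the toy adaptive rule so the three can be compared directly.

```python
import numpy as np

def adaptive_control_mean(y_c, y_e):
    """Toy rule: add ECs nearest the RCT control mean, choosing the number
    added by minimizing a plug-in estimate of bias^2 + variance."""
    mu_c, s2 = y_c.mean(), y_c.var(ddof=1)
    ordered = y_e[np.argsort(np.abs(y_e - mu_c))]
    k = np.arange(len(y_e) + 1)  # candidate numbers of borrowed ECs
    cum = np.concatenate([[0.0], np.cumsum(ordered)])
    pooled_mean = (y_c.sum() + cum) / (len(y_c) + k)
    est_mse = (pooled_mean - mu_c) ** 2 + s2 / (len(y_c) + k)
    return pooled_mean[np.argmin(est_mse)]

def simulate(shift, n_rep=2000, n_t=30, n_c=30, n_e=200, tau=1.0, seed=0):
    """Empirical MSE of three ATE estimators under a mean shift in the ECs."""
    rng = np.random.default_rng(seed)
    sq_err = {"none": 0.0, "all": 0.0, "adaptive": 0.0}
    for _ in range(n_rep):
        y_t = rng.normal(tau, 1.0, n_t)    # treated arm
        y_c = rng.normal(0.0, 1.0, n_c)    # RCT controls
        y_e = rng.normal(shift, 1.0, n_e)  # external controls
        estimates = {
            "none": y_t.mean() - y_c.mean(),
            "all": y_t.mean() - np.concatenate([y_c, y_e]).mean(),
            "adaptive": y_t.mean() - adaptive_control_mean(y_c, y_e),
        }
        for name, value in estimates.items():
            sq_err[name] += (value - tau) ** 2
    return {name: total / n_rep for name, total in sq_err.items()}

if __name__ == "__main__":
    for shift in (0.0, 0.5, 2.0):
        print(shift, simulate(shift))
```

A setting in which the `"adaptive"` entry exceeds both `"none"` and `"all"` would be the kind of evidence the paragraph above describes, at least for the rule being tested.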
Original abstract
Randomized controlled trials (RCTs) often suffer from limited inferential efficiency in estimating treatment effects due to their small sample sizes. In recent years, incorporating external controls (ECs) has gained increasing attention as an effective way to augment small RCTs and thereby enhance estimation efficiency. However, ECs are not always comparable to RCTs, and direct borrowing without careful evaluation can introduce substantial bias and, paradoxically, undermine the accuracy of treatment effect estimation. In this paper, we propose a novel adaptive influence-based sample borrowing framework to improve average treatment effect (ATE) estimation in RCTs. The framework quantifies the "comparability" of each sample in ECs using influence functions and identifies the optimal subset of ECs that minimizes the mean squared error of the ATE estimator. The proposed framework is assumption-lean regarding the distribution of ECs and is robust to outliers, making it broadly applicable across diverse settings. Moreover, we develop an outcome calibration method to improve the data utilization efficiency of ECs, further strengthening the adaptive influence-based sample-borrowing framework. We demonstrate the effectiveness of the proposed method using both simulated and real-world datasets.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an adaptive influence-based sample borrowing framework to improve average treatment effect (ATE) estimation in small-sample randomized controlled trials (RCTs). Influence functions computed from the RCT data are used to quantify the comparability of each external control (EC) observation; an optimal subset is then selected via minimization of the finite-sample mean squared error (MSE) of the ATE estimator. An outcome calibration step is added to increase data utilization. The framework is presented as assumption-lean with respect to the EC distribution and robust to outliers, with supporting evidence from simulation studies and a real-world dataset.
Significance. If the finite-sample MSE minimization step can be shown to reliably select bias-reducing EC subsets, the method would offer a practical, data-driven way to augment RCTs without the bias risks of naive borrowing. The combination of influence-function scoring and outcome calibration could advance the literature on external-data integration in causal inference. Credit is due for the explicit focus on finite-sample MSE rather than asymptotic efficiency and for the robustness claim; however, the absence of any derivation, explicit performance metrics, or error analysis prevents a full assessment of whether these strengths materialize.
major comments (3)
- [Method description (influence-function scoring and subset selection)] The core claim that RCT-derived influence functions can be used to score and select an EC subset whose inclusion provably lowers finite-sample MSE (rather than fitting noise) is load-bearing for the entire framework. No derivation or bound is supplied showing that the argmin over subsets yields an estimator whose realized MSE is smaller than both the no-borrowing and oracle-borrowing benchmarks when the EC distribution differs from the RCT even modestly.
- [Simulation and real-data sections] The abstract states effectiveness on simulated and real data but supplies neither numerical results (e.g., MSE reductions, bias, coverage) nor an error analysis. Without these quantities it is impossible to evaluate whether the adaptive procedure outperforms standard borrowing methods or merely selects on RCT-sample noise.
- [Outcome calibration subsection] The outcome calibration method is introduced to improve data utilization, yet no statement is given on how it interacts with the influence-based selection or whether it preserves the MSE-minimization guarantee. This interaction is central to the strengthened framework.
minor comments (2)
- [Abstract] The abstract uses the phrase 'assumption-lean' without defining which assumptions are avoided relative to existing borrowing methods; a brief comparison table would clarify the novelty.
- [Method] Notation for the influence function and the MSE objective is introduced without an explicit equation reference, making the subsequent subset-selection step difficult to follow.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments, which highlight important areas for strengthening the manuscript. We address each major comment point by point below, indicating the revisions we will make.
- Referee: [Method description (influence-function scoring and subset selection)] The core claim that RCT-derived influence functions can be used to score and select an EC subset whose inclusion provably lowers finite-sample MSE (rather than fitting noise) is load-bearing for the entire framework. No derivation or bound is supplied showing that the argmin over subsets yields an estimator whose realized MSE is smaller than both the no-borrowing and oracle-borrowing benchmarks when the EC distribution differs from the RCT even modestly.
Authors: We agree that a formal derivation would strengthen the presentation. The influence-function scoring is motivated by the first-order expansion of the ATE estimator, and the subset selection minimizes an empirical finite-sample MSE criterion constructed from those scores. The manuscript does not claim a universal provable bound that holds for arbitrary EC distributions; instead, the procedure is presented as a practical, assumption-lean heuristic. In the revision we will add an explicit derivation of the MSE estimator used for selection, together with a discussion of the conditions under which the selected subset is expected to improve upon the no-borrowing estimator. We will also report additional simulation results that compare the adaptive procedure against both no-borrowing and oracle-borrowing benchmarks under modest distribution shifts. revision: yes
- Referee: [Simulation and real-data sections] The abstract states effectiveness on simulated and real data but supplies neither numerical results (e.g., MSE reductions, bias, coverage) nor an error analysis. Without these quantities it is impossible to evaluate whether the adaptive procedure outperforms standard borrowing methods or merely selects on RCT-sample noise.
Authors: The full manuscript contains simulation tables and real-data results with explicit MSE, bias, and coverage values, as well as an error analysis based on repeated sampling. We will revise the abstract to include the key quantitative findings (e.g., average MSE reductions and coverage rates across simulation scenarios) and will add a concise summary of the error analysis to the abstract for immediate accessibility. revision: yes
- Referee: [Outcome calibration subsection] The outcome calibration method is introduced to improve data utilization, yet no statement is given on how it interacts with the influence-based selection or whether it preserves the MSE-minimization guarantee. This interaction is central to the strengthened framework.
Authors: We will add a dedicated paragraph clarifying the sequential relationship: influence-function scoring and subset selection are performed first on the raw EC data; outcome calibration is then applied only to the selected subset to align conditional outcome means. Because calibration is a post-selection adjustment that targets mean differences without changing the selection criterion itself, the MSE-minimization property of the selection step is preserved. We will include a short analytic argument and supporting simulation results demonstrating that the combined procedure continues to target finite-sample MSE reduction. revision: yes
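The ordering the authors describe in their last response (select first, calibrate afterwards) can be sketched as follows. This is a minimal illustration under assumptions not stated in the source: a single covariate, linear outcome models fitted by least squares, and a selection that is simply passed in as an index array. The names `fit_line` and `calibrate_selected` are hypothetical. The sketch shows only the order of operations; it does not establish that the MSE-minimization property survives calibration.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares slope and intercept for a single covariate."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

def calibrate_selected(x_c, y_c, x_e, y_e, selected):
    """Calibrate outcomes of already-selected external controls.

    Assumed scheme: subtract the conditional mean fitted on the selected
    ECs and add back the conditional mean fitted on the RCT controls, so
    the two groups share the same fitted outcome-covariate line.
    The selection itself is taken as given and is not revisited here.
    """
    x_c, y_c = np.asarray(x_c, float), np.asarray(y_c, float)
    x_s = np.asarray(x_e, float)[selected]
    y_s = np.asarray(y_e, float)[selected]

    b_c, a_c = fit_line(x_c, y_c)  # model fitted on RCT controls
    b_s, a_s = fit_line(x_s, y_s)  # model fitted on selected ECs

    # Keep each EC's residual, swap in the RCT-control fitted mean.
    return y_s - (a_s + b_s * x_s) + (a_c + b_c * x_s)
```

Because `selected` is computed before this function is called, the calibration step cannot change which external controls are retained; it only changes the outcome values that enter the pooled estimator.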
Circularity Check
No significant circularity; derivation is self-contained
The paper introduces a new adaptive borrowing procedure that uses influence functions computed on RCT data to score and select EC subsets, with the selection criterion defined as the subset minimizing an estimated finite-sample MSE of the ATE estimator. This is a constructive algorithmic proposal rather than a derivation that reduces a claimed result back to its own fitted inputs or prior self-citations. No equations or steps in the abstract or described framework exhibit self-definition (e.g., defining comparability via the very MSE quantity being minimized in a closed loop), fitted parameters renamed as predictions, or load-bearing self-citations. Validation occurs via separate simulations and real-data experiments, which are external to the method definition itself. The approach therefore remains assumption-lean and non-circular by the stated criteria.
Forward citations
Cited by 2 Pith papers
- Calibeating Prediction-Powered Inference
  Post-hoc calibration of miscalibrated black-box predictions on a labeled sample improves efficiency of prediction-powered inference for semisupervised mean estimation.
- Adaptive Influence-Based Borrowing Framework for Improving Treatment Effect Estimation in RCTs Using External Controls
  The adaptive influence-based borrowing framework selects subsets of external controls by influence scores and chooses the subset minimizing MSE of the treatment effect estimator.
Reference graph
Works this paper leans on
- [1] David Cheng and Tianxi Cai. Adaptive combination of randomized and observational data. arXiv preprint arXiv:2111.15012.
- [2] Issa J Dahabreh, Sarah E Robertson, Eric J Tchetgen, Elizabeth A Stuart, and Miguel A Hernán. Generalizing causal inferences from individuals in randomized trials to all trial-eligible individuals. Biometrics, 75(2):685–694, 2019a. Issa J. Dahabreh, Sarah E. Robertson, Eric J. Tchetgen, Elizabeth A. Stuart, and Miguel A. Hernán. Generalizing causal i...
- [3] URL https://www.fda.gov/regulatory-information/search-fda-guidance-documents. Chenyin Gao, Shu Yang, Mingyang Shan, Wenyu Wendy Ye, Ilya Lipkovich, and Douglas Faries. Doubly protected estimation for survival outcomes utilizing external controls for randomized clinical trials. arXiv preprint arXiv:2410.18409.
- [4] Sky Qiu, Jens Tarp, Andrew Mertens, and Mark van der Laan. An estimator-robust design for augmenting randomized controlled trial with external real-world data. arXiv preprint arXiv:2501.17835.
- [5] Peng Wu and Xiaojie Mao. The promises of multiple experiments: Identifying joint distribution of potential outcomes. arXiv preprint arXiv:2504.20470.
- [6] Peng Wu, Shasha Han, Xingwei Tong, and Runze Li. Propensity score regression for causal inference with treatment heterogeneity. Statistica Sinica, 34:747–769, 2024a. Peng Wu, Ziyu Shen, Feng Xie, Zhongyao Wang, Chunchen Liu, and Yan Zeng. Policy learning for balancing short-term and long-term rewards. In Proceedings of the 41st International Conference on M...
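- [7] By a standard result of Hahn (1998); Chernozhukov et al. (2018), under the conditions in Lemma 2,
$$
\begin{aligned}
\hat\tau_{\mathrm{aipw}}
&= \frac{1}{N_R}\sum_{i\in R}\left[\frac{A_i\,(Y_i-\mu_1(X_i))}{e_1(X_i)} - \frac{(1-A_i)\,(Y_i-\mu_0(X_i))}{1-e_1(X_i)}\right] + \frac{1}{N_R}\sum_{i\in R}\{\mu_1(X_i)-\mu_0(X_i)\} + o_P(n^{-1/2}) \\
&= \frac{1}{n}\sum_{i\in R\cup S}\frac{R_i}{q}\left[\frac{A_i\,(Y_i-\mu_1(X_i))}{e_1(X_i)} - \frac{(1-A_i)\,(Y_i-\mu_0(X_i))}{1-e_1(X_i)}\right] + \frac{1}{n}\sum_{i\in R\cup S}\frac{R_i}{q}\{\mu_1(X_i)-\mu_0(X_i)\} + o_P(n^{-1/2}).
\end{aligned}
$$
Thus, the asymptoti...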
- [8] S1.6. Proof of Theorem 3. To show the conclusion $\lim_{N_R\to\infty} P(\hat S \in S^*) = 1$, it suffices to show that
$$
\mathrm{MSE}(\hat\tau_{\hat S}) - \mathrm{MSE}(\hat\tau_{S^*}) \le \eta. \tag{S4}
$$
This is because, if (S4) holds and $\hat S \notin S^*$, it contradicts the condition "$\mathrm{MSE}(\hat\tau_{S^*}) < \mathrm{MSE}(\hat\tau_{S_k}) - \eta$ for all $S_k \notin S^*$", and thus we must have $\hat S \in S^*$. Next, we prove (S4). We consider a decomposition of $\mathrm{MSE}(\hat\tau_{\hat S})$...