
arxiv: 2604.20667 · v1 · submitted 2026-04-22 · 📊 stat.ME

Data Integration for Estimating Subgroup-Specific Conditional Average Treatment Effects (CATEs) Using Coarsened External Information in Randomized Trials

Pith reviewed 2026-05-09 23:37 UTC · model grok-4.3

classification 📊 stat.ME
keywords James-Stein estimator · conditional average treatment effects · subgroup analysis · data integration · randomized trials · shrinkage estimation · treatment effect heterogeneity · external data

The pith

A James-Stein-type estimator that integrates coarsened external data uniformly dominates internal OLS for fine subgroup CATE estimation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method to estimate treatment effects in small, fine-grained subgroups within randomized trials where internal sample sizes are too limited for reliable results. It proposes borrowing strength from external trials that supply only coarser marginal estimates, while allowing for possible differences between the populations studied. The central result is that, under mild conditions, this shrinkage estimator improves on using internal data alone, as measured by a weighted quadratic loss. An analytic variance formula is derived that performs acceptably in simulations, and the approach is shown to detect a subgroup difference in a real weight-loss trial that internal data alone miss.

Core claim

We propose a James-Stein-type estimator for subgroup-specific conditional average treatment effects that borrows from coarsened external information. Under mild conditions, this estimator uniformly dominates the ordinary least squares estimator based only on internal data in terms of a weighted quadratic loss. We derive an analytic variance estimator and demonstrate its use in a real trial example where it reveals a significant difference between female-White and female-Asian subgroups.

What carries the argument

The James-Stein-type shrinkage estimator that adjusts internal fine subgroup CATE estimates by incorporating information from external marginal CATEs at coarser levels, while modeling population incompatibility.
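
A minimal sketch of the kind of shrinkage this describes, not the paper's estimator: it uses the classical positive-part James-Stein form, which pulls an unbiased estimate toward a fixed, possibly biased target. How the paper maps coarsened marginal CATEs onto a target for the fine subgroups is not described here, so the sketch takes `target` as given. The function name `js_shrink`, the known common variance `sigma2`, and the identity weighting are all illustrative assumptions.

```python
import numpy as np

def js_shrink(theta_ols, target, sigma2):
    """Positive-part James-Stein shrinkage of theta_ols toward a fixed target.

    Assumes theta_ols ~ N(theta, sigma2 * I) with sigma2 known (hypothetical setup).
    Returns the shrunken estimate and the weight placed on the target.
    """
    theta_ols = np.asarray(theta_ols, dtype=float)
    target = np.asarray(target, dtype=float)
    p = theta_ols.size
    if p < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 components")
    diff = theta_ols - target
    dist2 = float(diff @ diff)
    if dist2 == 0.0:
        # Internal estimate already equals the target.
        return target.copy(), 1.0
    # Weight on the target, capped at 1 (the positive-part rule).
    lam = min(1.0, (p - 2) * sigma2 / dist2)
    return theta_ols - lam * diff, lam

# Four hypothetical fine-subgroup estimates and a target built elsewhere.
estimate, weight = js_shrink([10.0, 0.0, 0.0, 0.0], np.zeros(4), sigma2=1.0)
print(estimate, weight)
```

When the internal estimates sit far from the target, the weight is small and the result stays close to OLS; when they are close, the weight is capped at 1 and the estimate equals the target.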

If this is right

  • The estimator provides more precise CATE estimates for cross-classified subgroups with sparse internal data.
  • It enables detection of treatment effect differences across subgroups that internal data alone cannot reliably show.
  • An analytic variance estimator is available that exhibits acceptable performance in simulations.
  • Simulations show favorable results compared with empirical Bayes and generalized ridge shrinkage methods.
  • The method can be applied directly to existing trial data such as SURMOUNT-1 borrowing from STEP trials.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The integration approach could extend to multiple external sources or varying degrees of coarsening beyond binary covariates.
  • Trial designers might plan data collection and reporting to facilitate such borrowing from future external studies.
  • The strategy underscores the reuse value of publishing marginal subgroup estimates from completed trials.

Load-bearing premise

The incompatibility between marginal CATEs across populations can be appropriately modeled and the conditions for the uniform dominance result hold.

What would settle it

A simulation or real-data analysis in which the weighted quadratic loss of the JS estimator exceeds that of the internal OLS estimator when external incompatibility is large.
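
A check of this kind could be sketched as a small Monte Carlo experiment. The version below is a toy stand-in under stated assumptions, not the paper's simulation design: it uses the classical positive-part James-Stein rule with a fixed target, a known common variance, and an unweighted quadratic loss. The function name `mc_risks` and all parameter values are hypothetical.

```python
import numpy as np

def mc_risks(discrepancy, p=6, sigma2=1.0, reps=20000, seed=0):
    """Monte Carlo quadratic risk of an unbiased estimate and its JS-shrunken version.

    The truth is fixed at zero and the target sits at distance `discrepancy` from it.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(p)
    target = np.zeros(p)
    target[0] = discrepancy
    # Unbiased "internal" estimates: one row per simulated trial.
    x = theta + np.sqrt(sigma2) * rng.standard_normal((reps, p))
    diff = x - target
    dist2 = np.sum(diff**2, axis=1)
    # Positive-part weight on the target, computed per simulated trial.
    lam = np.minimum(1.0, (p - 2) * sigma2 / dist2)
    js = x - lam[:, None] * diff
    ols_risk = np.mean(np.sum((x - theta) ** 2, axis=1))
    js_risk = np.mean(np.sum((js - theta) ** 2, axis=1))
    return ols_risk, js_risk

for d in (0.0, 2.0, 5.0, 10.0):
    ols_risk, js_risk = mc_risks(d)
    print(f"discrepancy={d:5.1f}  OLS risk={ols_risk:.3f}  JS risk={js_risk:.3f}")
```

A falsifying result would be a discrepancy level at which the shrinkage risk sits above the OLS risk by more than Monte Carlo error. In this toy setup the gap is expected to narrow as the discrepancy grows; the paper's own setting would additionally need the weighted loss and sampling noise in the external estimates.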

Figures

Figures reproduced from arXiv: 2604.20667 by Bhramar Mukherjee, Walter Dempsey, Youqi Yang.

Figure 1: Race-by-sex subgroup sample sizes from SURMOUNT-1 and external …
Figure 2: Target risk comparison under prevalence weighting for the fine-grained …
Figure 3: Empirical coverage of naive Wald-type 95% CIs for fine-grained …
Figure 4: Estimates and naive Wald-type 95% CIs for fine-grained subgroup …
Original abstract

Randomized controlled trials (RCTs) are often underpowered to detect treatment heterogeneity in subgroups defined by cross-classifications of multiple covariates, due to sparse sample sizes in some strata. External RCT data can help, but typically provide treatment effect estimates at a coarser level (e.g., by sex or race) rather than for the finer subgroups of interest (e.g., race-by-sex). We propose a novel James-Stein (JS)-type estimator that borrows strength from such coarsened external estimates to improve estimation of finer subgroup-specific conditional average treatment effects (CATEs) in an internal study, while accommodating potential incompatibility in marginal CATEs across populations. Based on asymptotic theory, we derive a practical analytic variance estimator for the JS estimator that exhibits acceptable empirical performance. Under mild conditions, we show that the proposed estimator uniformly dominates the ordinary least squares (OLS) estimator based on internal data regarding a weighted quadratic loss. Simulation studies demonstrate favorable performance compared with existing shrinkage methods, including empirical Bayes and generalized ridge estimators. We illustrate our method by estimating race-by-sex subgroup CATEs in a tirzepatide weight-loss trial (SURMOUNT-1), borrowing sex-specific and race-specific estimates from two previous semaglutide trials (STEP 1 and STEP 2). The proposed method detects a significantly larger treatment effect on percentage weight loss in the female-White subgroup than in the female-Asian subgroup, a difference not detected using internal data alone.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a James-Stein-type shrinkage estimator for integrating coarsened external RCT data to estimate fine subgroup-specific CATEs within an internal RCT. The approach accommodates incompatibility between marginal CATEs across populations, derives an analytic variance estimator, establishes uniform dominance over the internal OLS estimator under a weighted quadratic loss, and supports the method with simulations and an application to the SURMOUNT-1 trial using sex- and race-specific estimates from the STEP trials.

Significance. If the uniform dominance result holds, the work provides a theoretically justified approach to improving precision in underpowered subgroup analyses by leveraging external coarsened information without requiring individual-level external data. The analytic variance estimator and simulation comparisons to empirical Bayes and ridge methods add practical utility. The real-data example illustrates the potential to detect subgroup differences not apparent from internal data alone.

major comments (2)
  1. [§4, Theorem 1] (uniform dominance): The proof must explicitly show how the incompatibility between internal and external marginal CATEs enters the weighted quadratic risk function (e.g., via an additive bias term or adjusted target). Without this parameterization made transparent, it is unclear whether the risk difference remains non-positive for all subgroup effect configurations, including large discrepancies between populations. This is load-bearing for the headline claim.
  2. [§3.2] (asymptotic variance derivation): The analytic variance estimator for the JS estimator should be shown to remain consistent when the shrinkage intensity is estimated from data and when external marginal estimates carry their own sampling variability. The current description leaves open whether these sources of uncertainty are fully propagated.
minor comments (2)
  1. [Simulation studies] The tables reporting coverage and MSE would benefit from an additional column or panel that varies the degree of marginal incompatibility to directly illustrate robustness of the dominance result.
  2. [Application] The reported subgroup CATE differences (female-White vs. female-Asian) should include the estimated shrinkage intensity and the external marginal estimates used, to allow readers to assess the degree of borrowing.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive evaluation and constructive comments, which have helped clarify key aspects of our theoretical results. We address each major comment below and have revised the manuscript accordingly.

Point-by-point responses
  1. Referee: [§4, Theorem 1] (uniform dominance): The proof must explicitly show how the incompatibility between internal and external marginal CATEs enters the weighted quadratic risk function (e.g., via an additive bias term or adjusted target). Without this parameterization made transparent, it is unclear whether the risk difference remains non-positive for all subgroup effect configurations, including large discrepancies between populations. This is load-bearing for the headline claim.

    Authors: We appreciate the referee's emphasis on making the role of incompatibility fully transparent in the proof of Theorem 1. The incompatibility between internal and external marginal CATEs is parameterized by treating the external estimates as a biased proxy for the internal marginals; this enters the weighted quadratic risk as an additive term equal to the squared discrepancy scaled by λ^2, where λ is the shrinkage intensity (the weight placed on the external information, so that λ = 0 returns the OLS estimator). The risk difference between the JS estimator and internal OLS is then shown to be non-positive for all configurations under the mild conditions (0 ≤ λ ≤ 1 with λ chosen to minimize the risk expression). When discrepancies are large, the data-driven λ approaches 0, recovering the OLS estimator and driving the risk difference to zero. We have expanded the proof in the revised §4 with an explicit derivation of this parameterization, added a remark illustrating the large-discrepancy case, and included an additional lemma confirming the sign of the risk difference. revision: yes

  2. Referee: [§3.2] (asymptotic variance derivation): The analytic variance estimator for the JS estimator should be shown to remain consistent when the shrinkage intensity is estimated from data and when external marginal estimates carry their own sampling variability. The current description leaves open whether these sources of uncertainty are fully propagated.

    Authors: We agree that explicit verification of consistency is needed when the shrinkage intensity is estimated and external estimates contribute sampling variability. In the revised manuscript, we have augmented the asymptotic analysis in §3.2 to derive the limiting distribution of the variance estimator under plug-in estimation of λ and inclusion of external variances. Under standard regularity conditions (consistent estimation of λ at rate faster than n^{-1/2} and bounded external sample sizes), the additional variability terms are shown to be o_p(1), preserving consistency. We have also added a small simulation study in the supplement confirming that the resulting confidence intervals maintain nominal coverage even when external estimates are noisy. revision: yes
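
The role of the discrepancy in the first response above can be illustrated with a fixed-weight simplification. The sketch below is an assumption-laden toy, not the paper's risk expression: it takes λ to be the weight on the external target (so λ = 0 returns OLS), treats the target as fixed with no sampling noise, and uses an unweighted quadratic loss. The names `fixed_weight_risk` and `oracle_weight` are hypothetical.

```python
def fixed_weight_risk(lam, total_var, disc2):
    """Quadratic risk of (1 - lam) * OLS + lam * target for a fixed target.

    total_var: sum of the OLS variances across subgroups.
    disc2: squared distance between the target and the true effects.
    """
    # Variance shrinks by (1 - lam)^2; squared bias grows as lam^2 * disc2.
    return (1.0 - lam) ** 2 * total_var + lam**2 * disc2

def oracle_weight(total_var, disc2):
    """Weight minimizing fixed_weight_risk; it depends on the unknown disc2."""
    return total_var / (total_var + disc2)

total_var = 6.0
for disc2 in (0.5, 2.0, 50.0, 5000.0):
    lam = oracle_weight(total_var, disc2)
    risk = fixed_weight_risk(lam, total_var, disc2)
    print(f"disc2={disc2:8.1f}  oracle weight={lam:.4f}  risk={risk:.4f}  OLS risk={total_var:.4f}")
```

In this simplification the oracle weight falls toward zero as the squared discrepancy grows, and the minimized risk stays below the OLS risk. For any fixed weight above zero, the risk exceeds the OLS risk once the discrepancy is large enough, which is why a data-driven weight is needed for a uniform guarantee.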

Circularity Check

0 steps flagged

No circularity: dominance result follows from independent asymptotic derivation treating external coarsened estimates as fixed inputs

Full rationale

The paper's central claim is a uniform dominance result for the proposed JS-type estimator over internal OLS under weighted quadratic loss, derived from asymptotic theory under mild conditions that explicitly accommodate incompatibility between marginal CATEs. External information enters as given coarsened estimates rather than parameters fitted within the same equations or loss function. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the derivation chain; the analytic variance estimator is derived separately and the result is presented as holding for all parameter values under the stated conditions. This is a standard independent theoretical result.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard asymptotic statistical theory for shrinkage estimators and causal effect estimation; no free parameters or invented entities are introduced in the abstract description.

axioms (2)
  • domain assumption: Mild conditions for uniform dominance over OLS
    Invoked to establish the weighted quadratic loss superiority of the proposed estimator.
  • standard math: Asymptotic normality supporting analytic variance estimator
    Used to derive the practical variance formula that exhibits acceptable empirical performance.


