pith. sign in

arxiv: 2410.02941 · v2 · submitted 2024-10-03 · 📊 stat.ME

Efficient collaborative learning of the average treatment effect

Pith reviewed 2026-05-23 20:00 UTC · model grok-4.3

classification 📊 stat.ME
keywords average treatment effectfederated learningcollaborative learningcausal inferencesemiparametric efficiencymulti-site studieselectronic health recordsdistributional shift
0
0 comments X

The pith

ECO-ATE builds a federated estimator for average treatment effect that reaches the semiparametric efficiency bound while allowing shifts in outcomes, treatments, and covariates across sites.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ECO-ATE as a method for estimating the average treatment effect in multi-site studies that respects data-sharing limits. It combines full individual records from one target population with summary statistics from other source populations to produce an estimator that attains the efficiency bound under suitable conditions. The approach requires no iterative communication between sites and explicitly accommodates differences in the distributions of outcomes, treatments, and baseline covariates. This setup is motivated by the practical constraints of research consortia that want to pool information for real-world evidence without building full data-sharing systems. Simulation results illustrate efficiency improvements and stability under varying shift magnitudes and model complexity.

Core claim

ECO-ATE operates in a federated manner, using individual-level data from a user-defined target population and summary statistics from other source populations, to construct an efficient estimator for the average treatment effect on the target population of interest. The method achieves the semiparametric efficiency bound under appropriate conditions while allowing distributional shifts in outcomes, treatments, and baseline covariates distributions, without requiring iterative communications between sites.

What carries the argument

The ECO-ATE estimator, which fuses target-site individual records with source-site summary statistics to correct for distributional shifts and attain semiparametric efficiency without iterative site-to-site communication.

If this is right

  • Incorporating source summaries yields measurable efficiency gains over target-only estimation.
  • The estimator remains consistent and efficient under a range of distributional shifts and degrees of overparameterization.
  • The single-pass, non-iterative design fits consortia that lack infrastructure for repeated data exchange.
  • The same framework supports real-world evidence studies using electronic health record data from programs such as All of Us.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same summary-statistic correction strategy could be tested on other causal parameters such as conditional average treatment effects if analogous identifying summaries exist.
  • Adoption would lower the barrier to multi-site analyses by removing the need for centralized individual-level repositories.
  • Sequential addition of new source sites could be examined by updating the summary-based correction term without retraining on prior data.
  • Variance reduction relative to target-only estimators can be quantified directly in any simulation that supplies both full target data and partial source summaries.

Load-bearing premise

The summary statistics received from source populations are sufficient to identify and correct for the relevant distributional shifts between sites.

What would settle it

Empirical demonstration that the estimator loses consistency or efficiency when the supplied summaries omit information needed to characterize the outcome, treatment, or covariate shifts between the target and source populations.

Figures

Figures reproduced from arXiv: 2410.02941 by Rui Duan, Sijia Li.

Figure 1
Figure 1. Figure 1: Bias squared, variance and coverage of various estimators. Detailed numbers are provided in [PITH_FULL_IMAGE:figures/full_fig_p014_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Flow chart of inclusion and exclusion criteria of the study cohort [PITH_FULL_IMAGE:figures/full_fig_p016_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Estimated odds ratio of heart failure and 95% confidence interval comparing non-insulin to [PITH_FULL_IMAGE:figures/full_fig_p017_3.png] view at source ↗
read the original abstract

In response to the growing need for generating real-world evidence from multi-site collaborative studies, we introduce an efficient collaborative learning approach to evaluate average treatment effect (ECO-ATE) in a multi-site setting under data sharing constraints. Specifically, ECO-ATE operates in a federated manner, using individual-level data from a user-defined target population and summary statistics from other source populations, to construct efficient estimator for the average treatment effect on the target population of interest. Our federated approach does not require iterative communications between sites, making it particularly suitable for research consortia with limited resources for developing automated data-sharing infrastructures. Compared to existing work data integration methods in causal inference, ECO-ATE allows distributional shifts in outcomes, treatments and baseline covariates distributions, and achieves semiparametric efficiency bound under appropriate conditions. We conduct simulation studies to demonstrate the extent of efficiency gains achieved by incorporating additional data sources, as well as the robustness of our approach against varying levels of distributional shifts and overparameterization, compared to existing benchmarks. We apply ECO-ATE to a case study examining the effect of insulin vs. non-insulin treatments on heart failure for patients with type II diabetes using electronic health record data collected from the All of Us program.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces ECO-ATE, a non-iterative federated estimator for the average treatment effect (ATE) on a target population that uses individual-level data from the target site together with summary statistics received from source sites. It claims to accommodate arbitrary distributional shifts in the outcome, treatment, and covariate distributions while attaining the semiparametric efficiency bound under appropriate conditions, with supporting evidence from simulations and an All of Us EHR application on insulin versus non-insulin treatment effects for heart failure.

Significance. If the efficiency bound is attained with only summary statistics under the stated shifts, the method would constitute a practical advance for multi-site causal inference under data-sharing and privacy constraints, offering efficiency gains without requiring iterative communication or full data pooling.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (method construction): the central claim that finite summary statistics suffice to identify and debias nonparametric shifts in Y, T, and X simultaneously, while still attaining the semiparametric efficiency bound, is load-bearing. Low-dimensional summaries (means, variances, or low-order moments) cannot in general recover the full adjustment functionals needed for arbitrary shifts; the manuscript must explicitly state the summaries employed and provide the identification argument showing they are sufficient for the efficiency result.
  2. [§4] §4 (simulation design): the reported efficiency gains and robustness to distributional shifts and overparameterization are presented without direct numerical comparison to the semiparametric efficiency bound (e.g., variance ratios or asymptotic relative efficiency with standard errors). Without these quantities it is not possible to verify that the bound is achieved rather than merely improved relative to benchmarks.
minor comments (2)
  1. [Abstract] The phrase 'under appropriate conditions' in the abstract should be accompanied by a one-sentence pointer to the precise regularity conditions or theorem number in the main text.
  2. Notation for the target-population ATE and the shift parameters should be introduced once and used consistently; several places appear to reuse symbols for both population quantities and their estimators.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each major comment below and agree to revisions that clarify the method and strengthen the simulation evidence.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (method construction): the central claim that finite summary statistics suffice to identify and debias nonparametric shifts in Y, T, and X simultaneously, while still attaining the semiparametric efficiency bound, is load-bearing. Low-dimensional summaries (means, variances, or low-order moments) cannot in general recover the full adjustment functionals needed for arbitrary shifts; the manuscript must explicitly state the summaries employed and provide the identification argument showing they are sufficient for the efficiency result.

    Authors: We appreciate this observation on the load-bearing claim. The ECO-ATE construction in §3 employs specific finite-dimensional summary statistics from source sites, namely the empirical means of the site-specific components of the efficient influence function (including conditional outcome regressions and propensity scores evaluated at target-site covariates). These are sufficient under the paper's semiparametric model for the allowed shifts because the identification of the target ATE relies only on these moments for debiasing, not on recovering the full nonparametric distributions. We will revise the abstract and §3 to state the summaries explicitly and expand the identification argument with the relevant lemmas showing sufficiency for attaining the efficiency bound. revision: yes

  2. Referee: [§4] §4 (simulation design): the reported efficiency gains and robustness to distributional shifts and overparameterization are presented without direct numerical comparison to the semiparametric efficiency bound (e.g., variance ratios or asymptotic relative efficiency with standard errors). Without these quantities it is not possible to verify that the bound is achieved rather than merely improved relative to benchmarks.

    Authors: We agree that explicit numerical comparison to the semiparametric efficiency bound would allow readers to verify attainment rather than relative improvement. In the revised §4 we will add tables reporting the ratio of empirical variance of ECO-ATE to the estimated efficiency bound (with Monte Carlo standard errors) across all simulation settings, as well as asymptotic relative efficiency where closed-form bounds are available. revision: yes

Circularity Check

0 steps flagged

No circularity: estimator construction relies on external target data and stated assumptions

full rationale

The paper introduces ECO-ATE as a federated estimator using individual-level target data plus source summary statistics to target the ATE under distributional shifts. No equations, fitting steps, or self-citations are shown that reduce the claimed efficiency bound or the estimator itself to a quantity defined by its own fitted parameters. The semiparametric efficiency claim is conditioned on the summaries being sufficient for shift correction—an external modeling assumption, not a self-referential definition or fitted-input prediction. The derivation chain is therefore self-contained against the stated inputs and does not exhibit any of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, invented entities, or paper-specific axioms are stated. Standard causal identification assumptions are implicitly required but not enumerated.

axioms (1)
  • domain assumption Standard causal assumptions (consistency, no unmeasured confounding, positivity) required for ATE identification in observational data
    Any ATE estimator in observational multi-site data rests on these; the abstract does not list alternatives.

pith-pipeline@v0.9.0 · 5737 in / 1272 out tokens · 33165 ms · 2026-05-23T20:00:39.622255+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

  1. [1]

    S., Alsuhaibani, H

    Alkhezi, O. S., Alsuhaibani, H. A., Alhadyab, A. A., Alfaifi, M. E., Alomrani, B., Aldossary, A., and Alfayez, O. M. (2021). Heart failure outcomes and glucagon-like peptide-1 receptor agonists: A systematic review of observational studies. Primary Care Diabetes , 15(5):761--771

  2. [2]

    W., and Kang, H

    Athey, S., Chetty, R., Imbens, G. W., and Kang, H. (2019). The surrogate index: Combining short-term proxies to estimate long-term treatment effects more rapidly and precisely. Technical report, National Bureau of Economic Research

  3. [3]

    and Pearl, J

    Bareinboim, E. and Pearl, J. (2014). Transportability from multiple environments with limited experiments: Completeness results. Advances in neural information processing systems , 27:280--288

  4. [4]

    J., Klaassen, C

    Bickel, P. J., Klaassen, C. A., Bickel, P. J., Ritov, Y., Klaassen, J., Wellner, J. A., and Ritov, Y. (1993). Efficient and adaptive estimation for semiparametric models , volume 4. Springer

  5. [5]

    L., Chang, T.-H., Nguyen, T

    Brantner, C. L., Chang, T.-H., Nguyen, T. Q., Hong, H., Di Stefano, L., and Stuart, E. A. (2023). Methods for integrating trials and non-experimental data to examine treatment effect heterogeneity. arXiv preprint arXiv:2302.13428

  6. [6]

    Chen, S., Zhang, B., and Ye, T. (2021). Minimax rates and adaptivity in combining experimental and observational data. arXiv preprint arXiv:2109.10522

  7. [7]

    Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters

  8. [8]

    Dahabreh, I. J. and Hern \'a n, M. A. (2019). Extending inferences from a randomized trial to a target population. Eur. J. Epidemiol. , 34(8):719--722

  9. [9]

    J., Petito, L

    Dahabreh, I. J., Petito, L. C., Robertson, S. E., Hern \'a n, M. A., and Steingrimsson, J. A. (2019). Towards causally interpretable meta-analysis: transporting inferences from multiple studies to a target population. arXiv preprint arXiv:1903.11455

  10. [10]

    L., Johnstone, I

    Donoho, D. L., Johnstone, I. M., Kerkyacharian, G., and Picard, D. (1996). Density estimation by wavelet thresholding. The Annals of statistics , pages 508--539

  11. [11]

    B., Xu, H., DeVore, A

    Echouffo-Tcheugui, J. B., Xu, H., DeVore, A. D., Schulte, P. J., Butler, J., Yancy, C. W., Bhatt, D. L., Hernandez, A. F., Heidenreich, P. A., and Fonarow, G. C. (2016). Temporal trends and factors associated with diabetes mellitus among patients hospitalized with heart failure: Findings from get with the guidelines--heart failure registry. American heart...

  12. [12]

    Efron, B. (1978). The geometry of exponential families. The Annals of Statistics , pages 362--376

  13. [13]

    Gilbert, P. B. (2004). Goodness-of-fit tests for semiparametric biased sampling models. Journal of statistical planning and inference , 118(1-2):51--81

  14. [14]

    B., Bosch, R

    Gilbert, P. B., Bosch, R. J., and Hudgens, M. G. (2003). Sensitivity analysis for the assessment of causal vaccine effects on viral load in hiv vaccine trials. Biometrics , 59(3):531--541

  15. [15]

    Grenander, U. (1981). Abstract inference. (No Title)

  16. [16]

    L., Ding, P., Wang, Y., and Jordan, M

    Guo, W., Wang, S. L., Ding, P., Wang, Y., and Jordan, M. (2022). Multi-source causal inference using control variates under outcome selection bias. Transactions on Machine Learning Research

  17. [17]

    Guo, Z., Li, X., Han, L., and Cai, T. (2023). Robust inference for federated meta-learning. arXiv preprint arXiv:2301.00718

  18. [18]

    A., Chute, C

    Haendel, M. A., Chute, C. G., Bennett, T. D., Eichmann, D. A., Guinney, J., Kibbe, W. A., Payne, P. R., Pfaff, E. R., Robinson, P. N., Saltz, J. H., et al. (2021). The national covid cohort collaborative (n3c): rationale, design, infrastructure, and deployment. Journal of the American Medical Informatics Association , 28(3):427--443

  19. [19]

    Han, L., Hou, J., Cho, K., Duan, R., and Cai, T. (2021). Federated adaptive causal estimation (face) of target treatment effects. arXiv preprint arXiv:2112.09313

  20. [20]

    Hastie, T. J. (2017). Generalized additive models. In Statistical models in S , pages 249--307. Routledge

  21. [21]

    and Racine, J

    Hayfield, T. and Racine, J. S. (2008). Nonparametric econometrics: The np package. Journal of statistical software , 27:1--32

  22. [22]

    E., O'Keefe, J

    Herman, M. E., O'Keefe, J. H., Bell, D. S., and Schwartz, S. S. (2017). Insulin therapy increases cardiovascular risk in type 2 diabetes. Progress in cardiovascular diseases , 60(3):422--434

  23. [23]

    and Coupland, C

    Hippisley-Cox, J. and Coupland, C. (2016). Diabetes treatments and risk of heart failure, cardiovascular disease, and all cause mortality: cohort study in primary care. bmj , 354

  24. [24]

    D., Shah, N

    Hripcsak, G., Duke, J. D., Shah, N. H., Reich, C. G., Huser, V., Schuemie, M. J., Suchard, M. A., Park, R. W., Wong, I. C. K., Rijnbeek, P. R., et al. (2015). Observational health data sciences and informatics (ohdsi): opportunities for observational researchers. In MEDINFO 2015: eHealth-enabled Health , pages 574--578. IOS Press

  25. [25]

    E., and Gilbert, P

    Jemiai, Y., Rotnitzky, A., Shepherd, B. E., and Gilbert, P. B. (2007). Semiparametric estimation of treatment effects given base-line covariates on an outcome measured after a post-randomization event occurs. Journal of the Royal Statistical Society: Series B (Statistical Methodology) , 69(5):879--901

  26. [26]

    I., Lee, J

    Jordan, M. I., Lee, J. D., and Yang, Y. (2018). Communication-efficient distributed statistical inference. Journal of the American Statistical Association

  27. [27]

    Kallus, N., Saito, Y., and Uehara, M. (2020). Optimal off-policy evaluation from multiple logging policies. arXiv preprint arXiv:2010.11002

  28. [28]

    Kannel, W. B. and McGee, D. L. (1979). Diabetes and cardiovascular disease: the framingham study. Jama , 241(19):2035--2038

  29. [29]

    Kenny, H. C. and Abel, E. D. (2019). Heart failure in type 2 diabetes mellitus: impact of glucose-lowering agents, heart failure therapies, and novel therapeutic strategies. Circulation research , 124(1):121--141

  30. [30]

    Lee, D., Yang, S., Dong, L., Wang, X., Zeng, D., and Cai, J. (2023). Improving trial generalizability using observational studies. Biometrics , 79(2):1213--1225

  31. [31]

    and Marx, N

    Lehrke, M. and Marx, N. (2017). Diabetes mellitus and heart failure. The American journal of cardiology , 120(1):S37--S47

  32. [32]

    Li, S., Cai, T., and Duan, R. (2023a). Targeting underrepresented populations in precision medicine: A federated transfer learning approach. The Annals of Applied Statistics , 17(4):2970--2992

  33. [33]

    T., and Li, H

    Li, S., Cai, T. T., and Li, H. (2022). Transfer learning for high-dimensional linear regression: Prediction, estimation and minimax optimality. Journal of the Royal Statistical Society Series B: Statistical Methodology , 84(1):149--173

  34. [34]

    B., and Luedtke, A

    Li, S., Gilbert, P. B., and Luedtke, A. (2023b). Data fusion using weakly aligned sources. arXiv preprint arXiv:2308.14836

  35. [35]

    and Luedtke, A

    Li, S. and Luedtke, A. (2023). Efficient estimation under data fusion. Biometrika , 110(4):1041--1054

  36. [36]

    Liu, Q., Xu, J., Jiang, R., and Wong, W. H. (2021). Density estimation using deep generative neural networks. Proceedings of the National Academy of Sciences , 118(15):e2101344118

  37. [37]

    Nadaraya, E. A. (1964). On estimating regression. Theory of Probability & Its Applications , 9(1):141--142

  38. [38]

    K., Klein, K., Maggs, D., and Best, J

    Paul, S. K., Klein, K., Maggs, D., and Best, J. H. (2015). The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: a retrospective observational study. Cardiovascular Diabetology , 14:1--9

  39. [39]

    Polley, E. C. and Van Der Laan, M. J. (2010). Super learner in prediction

  40. [40]

    Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika , 70(1):41--55

  41. [41]

    Rubin, D. B. (1980). Randomization analysis of experimental data: The fisher randomization test comment. Journal of the American statistical association , 75(371):591--593

  42. [42]

    Rudolph, K. E. and van der Laan, M. J. (2017). Robust estimation of encouragement-design intervention effects transported across sites. J. R. Stat. Soc. , 79(5):1509

  43. [43]

    A., Bradshaw, C

    Stuart, E. A., Bradshaw, C. P., and Leaf, P. J. (2015). Assessing the generalizability of randomized trial results to target populations. Prevention Science , 16(3):475--485

  44. [44]

    Van der Vaart, A. W. (2000). Asymptotic statistics , volume 3. Cambridge university press

  45. [45]

    V., Lee, Y., Hoang, T

    Vo, T. V., Lee, Y., Hoang, T. N., and Leong, T.-Y. (2022). Bayesian federated estimation of causal effects from observational data. In Uncertainty in Artificial Intelligence , pages 2024--2034. PMLR

  46. [46]

    Wang, X., Plantinga, A., Xiong, X., Cromer, S., Bonzel, C.-L., Ayakulangara Panickan, V., Duan, R., Hou, J., and Cai, T. (2024). Comparing insulin vs glp-1, dpp-4, sglt-2 on 5-year incident heart failure for patients with type 2 diabetes mellitus: a real-world evidence study using insurance claims (preprint)

  47. [47]

    M., and Wang, D

    Weiss, K., Khoshgoftaar, T. M., and Wang, D. (2016). A survey of transfer learning. Journal of Big data , 3:1--40

  48. [48]

    T., and Athey, S

    Xiong, R., Koenecke, A., Powell, M., Shen, Z., Vogelstein, J. T., and Athey, S. (2023). Federated causal inference in heterogeneous observational data. Statistics in Medicine , 42(24):4418--4439

  49. [49]

    Yang, S., Gao, C., Zeng, D., and Wang, X. (2023). Elastic integrative analysis of randomised trial and real-world data for treatment heterogeneity estimation. Journal of the Royal Statistical Society Series B: Statistical Methodology , 85(3):575--596

  50. [50]

    Yang, S., Zeng, D., and Wang, X. (2020). Improved inference for heterogeneous treatment effects using real-world data subject to hidden confounding. arXiv preprint arXiv:2007.12922