
arxiv: 2605.20125 · v1 · pith:KF5H3S3Ynew · submitted 2026-05-19 · 📊 stat.ME · math.ST · stat.TH

Federated Learning with Incomplete Data: When to Use Complete Cases and When to Weight

Pith reviewed 2026-05-20 03:26 UTC · model grok-4.3

classification 📊 stat.ME · math.ST · stat.TH
keywords federated learning · missing data · complete case analysis · inverse probability weighting · multi-site studies · calibrated estimation · sandwich variance

The pith

In federated learning with missing data, complete-case analysis is preferred over inverse-probability weighting when site-level conditions hold, and a calibrated method combines weights across sites to stay consistent if at least one model is correctly specified.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework for analyzing incomplete data across multiple sites without sharing individual records. It identifies concrete conditions under which the simple complete-case estimator outperforms inverse-probability weighting, and supplies a new calibrated weighting procedure that pools candidate models from different sites while remaining consistent provided any one of those models is correctly specified. Consistency requirements are imposed only locally at each site, so the overall federated estimator inherits validity from the participating locations. A sandwich variance estimator is derived that incorporates uncertainty from the weight calibration step. The approach is demonstrated on risk-factor analysis for mortality in patients with pleural infections.

Core claim

The central claim has two parts. Under stated site-level consistency conditions, the complete-case estimator is preferred to the inverse-probability-weighted estimator in federated settings. When complete-case analysis is invalid, a calibrated estimator that aggregates candidate weighting models across sites remains consistent whenever at least one candidate model is correctly specified; its validity is inherited from the local properties, and a sandwich variance accounts for weight-estimation uncertainty.
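
The contrast between the two estimators can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a single site, a linear outcome model, and a missingness probability that depends only on the covariate x. In that textbook setting complete-case regression is known to remain consistent, and IPW with the true probabilities is also consistent. The function names `wls` and `simulate` and all numeric settings are hypothetical.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: solve (X' W X) beta = X' W y."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

def simulate(n=20000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)      # true slope is 2
    # Assumed mechanism: probability of a complete row depends on x only.
    pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))
    r = rng.random(n) < pi                      # True = complete case
    X = np.column_stack([np.ones(n), x])
    beta_cc = wls(X[r], y[r], np.ones(r.sum()))  # complete-case fit
    beta_ipw = wls(X[r], y[r], 1.0 / pi[r])      # IPW fit with true weights
    return beta_cc, beta_ipw

if __name__ == "__main__":
    beta_cc, beta_ipw = simulate()
    print("CC  (intercept, slope):", beta_cc)
    print("IPW (intercept, slope):", beta_ipw)
```

Both fits should land near the true slope in this setting. If the missingness also depended on the outcome, the complete-case fit would generally no longer be valid, which is the situation where weighting becomes relevant.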

What carries the argument

Calibrated weight estimation that combines candidate weighting models across sites while remaining consistent if at least one is correctly specified.

If this is right

  • The federated estimator is consistent whenever the local complete-case or local weighting estimators satisfy the stated site-level conditions.
  • A sandwich variance formula correctly accounts for the extra variability introduced by estimating the weights.
  • The method can be applied directly to multi-site medical studies that must respect privacy constraints while handling missing covariates or outcomes.
  • When complete-case analysis is biased, the calibrated weighting procedure recovers consistent estimates without requiring a single correctly specified model at every site.
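
One way to picture analysis without sharing individual records is aggregation of site-level summary statistics. The sketch below is an illustration only and is not the paper's aggregation procedure: it assumes a weighted linear model, for which summing each site's weighted cross-product matrices reproduces the pooled fit exactly. The names `site_summaries`, `federated_wls`, and `demo`, and the placeholder weights, are hypothetical.

```python
import numpy as np

def site_summaries(X, y, w):
    """Computed locally; only these aggregates would leave the site."""
    Xw = X * w[:, None]
    return Xw.T @ X, Xw.T @ y

def federated_wls(summaries):
    """Coordinator step: sum the site aggregates and solve once."""
    A = sum(s[0] for s in summaries)
    b = sum(s[1] for s in summaries)
    return np.linalg.solve(A, b)

def demo(seed=1):
    rng = np.random.default_rng(seed)
    sites = []
    for n in (300, 500, 800):                   # three hypothetical sites
        x = rng.normal(size=n)
        X = np.column_stack([np.ones(n), x])
        y = 1.0 + 2.0 * x + rng.normal(size=n)
        w = rng.uniform(1.0, 3.0, size=n)       # placeholder weights
        sites.append((X, y, w))
    fed = federated_wls([site_summaries(*s) for s in sites])
    # Pooled fit on stacked data, for comparison only.
    X = np.vstack([s[0] for s in sites])
    y = np.concatenate([s[1] for s in sites])
    w = np.concatenate([s[2] for s in sites])
    Xw = X * w[:, None]
    pooled = np.linalg.solve(Xw.T @ X, Xw.T @ y)
    return fed, pooled

if __name__ == "__main__":
    fed, pooled = demo()
    print(fed, pooled)
```

The exact match holds for linear estimating equations. Nonlinear models would typically need iterative or approximate schemes, which this sketch does not cover.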

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same site-level inheritance logic could be tested in other privacy-preserving settings such as differential privacy or secure multi-party computation.
  • Extending the framework to time-to-event or longitudinal outcomes would require only replacing the local estimating equations while preserving the aggregation and calibration steps.
  • Empirical checks could compare the calibrated estimator against oracle pooled analysis on de-identified benchmark datasets to quantify efficiency loss from federation.

Load-bearing premise

The federated estimator inherits validity from site-level consistency conditions, so that local properties determine the overall result when data are aggregated without sharing.

What would settle it

A simulation or real multi-site dataset in which at least one candidate weighting model is correctly specified yet the federated calibrated estimator fails to converge to the true parameter would falsify the consistency claim.
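
A harness for this kind of check could look like the following. It is a sketch under stated assumptions and does not implement the paper's calibrated estimator: it estimates a mean with a normalized (Hajek) IPW estimator, once with the true observation probabilities and once with a deliberately misspecified constant model, so that the two regimes can be compared. The names `hajek_mean` and `run` and the data-generating model are hypothetical.

```python
import numpy as np

def hajek_mean(y, r, pi_hat):
    """Normalized IPW mean of y using rows with r == True."""
    w = r / pi_hat                               # zero weight if unobserved
    return np.sum(w * np.where(r, y, 0.0)) / np.sum(w)

def run(n=50000, seed=2):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = x + rng.normal(size=n)                   # true mean of y is 0
    pi_true = 1.0 / (1.0 + np.exp(-x))           # P(observed | x)
    r = rng.random(n) < pi_true
    pi_wrong = np.full(n, r.mean())              # misspecified: ignores x
    est_true = hajek_mean(y, r, pi_true)
    est_wrong = hajek_mean(y, r, pi_wrong)       # equals the complete-case mean
    return est_true, est_wrong

if __name__ == "__main__":
    est_true, est_wrong = run()
    print("correct weights:", est_true)
    print("misspecified weights:", est_wrong)
```

Swapping a candidate estimator into this harness and varying which weight models are correct across sites would give the kind of evidence described above.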

Figures

Figures reproduced from arXiv: 2605.20125 by Chad Hochberg, Elizabeth A. Stuart, Jason Akulian, Jesus E. Vazquez, Jiayi Tong, Theodore J. Iwashyna, Yicheng Shen.

Figure 1: Oracle denotes the case of no missingness. IPW (oracle) denotes the case when the true probability of a complete observation is used (no estimation). IPW (pooled) denotes weight estimates obtained using pooled data. IPW (site-specific) denotes weights obtained using only site-specific data. IPW (calibrated) denotes weights calibrated using external weighting models. estimator exhibited bias regardles…
Original abstract

Privacy constraints have driven the rise of federated learning (FL), which enables multi-site analyses without sharing individual participant data. We develop a framework for FL with missing data, identifying conditions under which the complete case (CC) estimator is preferred over the inverse probability weighting (IPW) estimator. For settings where the CC estimator fails, we introduce a calibrated weight estimation approach that combines candidate weighting models across sites and remains consistent if at least one is correctly specified. Consistency conditions are stated at the site level, ensuring that the federated estimator inherits validity from local properties. We derive a sandwich variance estimator that accounts for uncertainty in weight estimation, and illustrate the framework by evaluating risk factors for 90-day mortality among patients with pleural infections treated with intrapleural enzyme therapy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript develops a framework for federated learning with missing data. It identifies conditions under which the complete-case (CC) estimator is preferred over inverse-probability weighting (IPW). For settings where CC fails, it introduces a calibrated weight estimation procedure that pools candidate propensity models across sites and claims consistency provided at least one model is correctly specified. Consistency conditions are asserted at the site level so that the federated estimator inherits local validity. A sandwich variance estimator that accounts for weight-estimation uncertainty is derived, and the approach is illustrated on a multi-site analysis of 90-day mortality risk factors among patients with pleural infections treated by intrapleural enzyme therapy.

Significance. If the central consistency and variance results hold, the work would supply a practical, privacy-preserving method for handling missing data in distributed medical studies. The site-level consistency framing and the sandwich estimator are potentially useful contributions, and the empirical illustration demonstrates applicability. The calibrated aggregation step, however, is the load-bearing component whose validity must be verified before the framework can be recommended for general use.

major comments (2)
  1. [§3] §3 (calibrated weight estimation): The manuscript states that consistency conditions are given at the site level and that the federated estimator therefore inherits validity from local properties. The calibrated procedure combines candidate weighting models across sites while claiming consistency if at least one is correctly specified. No explicit aggregation formula or proof sketch is supplied showing that the union-model consistency property is preserved when only a subset of sites contain a correct propensity model. This step is load-bearing for the central claim; a counter-example or detailed proof of the transfer is required.
  2. [§4] §4 (sandwich variance): The sandwich variance is asserted to account for uncertainty in weight estimation. It is not shown how the estimator incorporates the additional variability induced by the cross-site calibration of the weights. Without this accounting, the reported standard errors may be invalid and the coverage properties of the resulting confidence intervals cannot be guaranteed.
minor comments (2)
  1. [Abstract] Abstract: The conditions under which the CC estimator is preferred to IPW are mentioned but not stated explicitly; a one-sentence summary of those conditions would improve readability.
  2. [Empirical illustration] The empirical illustration would benefit from a brief description of the number of participating sites, the observed missingness rate, and the candidate propensity models that were combined.
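
For context on major comment 2, the standard way to propagate weight-estimation uncertainty is to stack the weight-model score with the weighted estimating function and apply the sandwich formula to the joint system. The block below is a generic sketch of that construction, not the paper's formula. All symbols are assumptions introduced here: $\alpha$ is the weight-model parameter with score $S_i(\alpha)$, $\pi_i(\alpha)$ is the fitted probability of a complete observation, $R_i$ is the completeness indicator, and $U_i(\beta)$ is the outcome-model estimating function.

```latex
\[
\psi_i(\alpha,\beta) =
\begin{pmatrix}
  S_i(\alpha) \\[4pt]
  \dfrac{R_i}{\pi_i(\alpha)}\, U_i(\beta)
\end{pmatrix},
\qquad
\widehat{A} = \frac{1}{n}\sum_{i=1}^{n}
  \left.\frac{\partial \psi_i}{\partial(\alpha,\beta)^{\top}}\right|_{(\hat\alpha,\hat\beta)},
\qquad
\widehat{B} = \frac{1}{n}\sum_{i=1}^{n}
  \left.\psi_i\,\psi_i^{\top}\right|_{(\hat\alpha,\hat\beta)},
\]
\[
\widehat{\operatorname{Var}}\!\left(\hat\alpha,\hat\beta\right)
  = \frac{1}{n}\,\widehat{A}^{-1}\,\widehat{B}\,\widehat{A}^{-\top}.
\]
```

The off-diagonal block of $\widehat{A}$, the derivative of the weighted estimating function with respect to $\alpha$, is where weight estimation enters the variance of $\hat\beta$. A cross-site calibration step would add its own parameters and rows to $\psi_i$, which is the accounting the referee asks to see.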

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major comment below and will revise the manuscript to supply the requested details on the aggregation formula, consistency proof, and variance derivation.

Point-by-point responses
  1. Referee: [§3] §3 (calibrated weight estimation): The manuscript states that consistency conditions are given at the site level and that the federated estimator therefore inherits validity from local properties. The calibrated procedure combines candidate weighting models across sites while claiming consistency if at least one is correctly specified. No explicit aggregation formula or proof sketch is supplied showing that the union-model consistency property is preserved when only a subset of sites contain a correct propensity model. This step is load-bearing for the central claim; a counter-example or detailed proof of the transfer is required.

    Authors: We thank the referee for highlighting this gap. The current manuscript asserts site-level consistency but does not supply an explicit aggregation formula or proof sketch for the federated case when only a subset of sites contain a correct propensity model. In the revision we will add the aggregation formula for the calibrated weights and a detailed proof that the union-model consistency property transfers to the federated estimator under the stated conditions. We will also include a brief illustrative example clarifying the role of the subset of correct sites. revision: yes

  2. Referee: [§4] §4 (sandwich variance): The sandwich variance is asserted to account for uncertainty in weight estimation. It is not shown how the estimator incorporates the additional variability induced by the cross-site calibration of the weights. Without this accounting, the reported standard errors may be invalid and the coverage properties of the resulting confidence intervals cannot be guaranteed.

    Authors: We agree that the current derivation does not explicitly show how the sandwich variance accounts for variability induced by the cross-site calibration step. In the revised manuscript we will expand the variance section to derive the additional terms arising from the calibration procedure and update the sandwich formula accordingly, ensuring all sources of weight-estimation uncertainty are incorporated. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on standard estimators with independent consistency claims

Full rationale

The paper introduces a framework combining complete-case and inverse-probability-weighted estimators under federated constraints, then proposes a calibrated multi-model weighting procedure whose consistency is asserted to hold if at least one candidate model is correct. These claims rest on site-level consistency assumptions that are stated as external conditions rather than derived from the federated aggregation itself. No equation or step reduces a target quantity to a fitted parameter or self-citation by construction; the sandwich variance and inheritance statements are presented as derived consequences of the local properties rather than tautological re-labelings. The analysis therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard missing-data and federated-learning background assumptions plus the key domain assumption that site-level consistency transfers to the federated estimator; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • domain assumption: Consistency conditions stated at the site level ensure the federated estimator inherits validity from local properties.
    Directly invoked in the abstract to justify the overall framework validity.

pith-pipeline@v0.9.0 · 5684 in / 1442 out tokens · 53945 ms · 2026-05-20T03:26:35.959443+00:00 · methodology

