Robust inference for risk heterogeneity under group imbalance
Pith reviewed 2026-06-28 18:16 UTC · model grok-4.3
The pith
Neyman orthogonality produces consistent estimators for risk heterogeneity that tolerate errors in nuisance models under group imbalance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By constructing a Neyman-orthogonal estimator for the risk heterogeneity parameter between two groups, the authors obtain an estimator that remains consistent and asymptotically normal even when nuisance functions are estimated with error; finite-sample simulations confirm reduced bias relative to likelihood-based estimators, and the method detects clinically relevant ethnicity-specific effects on mortality in the eICU data that conventional approaches fail to identify.
What carries the argument
Neyman orthogonality correction applied to the estimating equation for the heterogeneity parameter, which removes first-order sensitivity to nuisance estimation error.
If this is right
- The estimator is consistent for the heterogeneity parameter under group imbalance.
- Asymptotic normality holds after the orthogonality correction.
- Finite-sample bias is substantially lower than in likelihood-based alternatives.
- Inferential stability improves in the reported simulations.
- Ethnicity-specific heterogeneity in admission diagnoses for ICU mortality is detected in the eICU application.
Where Pith is reading between the lines
- The same correction could be applied to heterogeneity parameters defined on other imbalanced observational datasets.
- It points to a general strategy for making subgroup analyses more reliable when baseline models are imperfect.
- High-dimensional or machine-learning nuisance estimators could be substituted while preserving the robustness property.
Load-bearing premise
The baseline risk models can be estimated at rates fast enough for the orthogonality correction to cancel the leading bias term.
What would settle it
A simulation in which the proposed estimator shows bias comparable to standard likelihood estimators when nuisance functions converge at rates slower than required would contradict the claimed first-order insensitivity.
Figures
read the original abstract
Population-level heterogeneity is ubiquitous in biomedical data, where differences across demographic or clinical subgroups can substantially alter risk patterns. For example, in intensive care unit (ICU) studies, the mortality risk associated with specific admission diagnoses can vary across ethnic groups. Existing approaches for detecting risk heterogeneity are often sensitive to baseline model misspecification and regularization bias, both of which commonly arise in practice. In this paper, we propose a robust framework for inferring risk heterogeneity between two populations using Neyman orthogonality, which yields estimators that are locally insensitive to nuisance parameter estimation error. The proposed estimator is consistent and asymptotically normal, and simulation studies demonstrate that in finite samples our method substantially reduces bias and improves inferential stability compared with standard likelihood-based approaches. In an application to the eICU Collaborative Research Database, our method reveals clinically meaningful ethnicity-specific heterogeneity in admission diagnoses for in-hospital mortality that standard likelihood-based methods fail to detect.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a Neyman-orthogonal estimator for inferring risk heterogeneity between two populations that is locally insensitive to nuisance estimation error from baseline risk models. It claims the estimator is consistent and asymptotically normal, with finite-sample simulations showing reduced bias and improved inferential stability relative to standard likelihood-based approaches, and applies the method to the eICU Collaborative Research Database to detect ethnicity-specific heterogeneity in admission diagnoses for in-hospital mortality that standard methods miss.
Significance. If the consistency and asymptotic normality results hold with explicit rate conditions that accommodate group imbalance, the framework would provide a practical advance for subgroup analysis in imbalanced biomedical settings where baseline misspecification is common, enabling more reliable detection of heterogeneity without requiring perfect nuisance estimation.
major comments (2)
- [Abstract] Abstract (paragraph on Neyman orthogonality): the central claim of consistency and asymptotic normality rests on the nuisance estimators (baseline risk models) satisfying rates sufficient for first-order bias cancellation. Under the group imbalance highlighted in the title, the minority population may have effective sample size too small to achieve the o_p(n^{-1/4}) rate, and no imbalance-adjusted bounds or explicit regularity conditions are stated to guarantee the product of nuisance errors vanishes faster than n^{-1/2}.
- [Abstract] Abstract (simulation and application paragraphs): the reported superior finite-sample performance and eICU findings presuppose that the orthogonality correction remains valid in the smaller group; without theoretical verification of the rate condition under imbalance, these empirical claims cannot be taken as confirmation of the method's robustness.
minor comments (1)
- [Abstract] The abstract would benefit from a one-sentence description of the precise form of the proposed estimator (e.g., the orthogonal score or influence function) to clarify how Neyman orthogonality is implemented.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the rate conditions required for our Neyman-orthogonal estimator under group imbalance. We address the two major comments below and will revise the manuscript to incorporate explicit imbalance-adjusted regularity conditions.
read point-by-point responses
-
Referee: [Abstract] Abstract (paragraph on Neyman orthogonality): the central claim of consistency and asymptotic normality rests on the nuisance estimators (baseline risk models) satisfying rates sufficient for first-order bias cancellation. Under the group imbalance highlighted in the title, the minority population may have effective sample size too small to achieve the o_p(n^{-1/4}) rate, and no imbalance-adjusted bounds or explicit regularity conditions are stated to guarantee the product of nuisance errors vanishes faster than n^{-1/2}.
Authors: We agree that the current presentation would benefit from explicit imbalance-adjusted regularity conditions. The asymptotic results in Section 3 are derived under the standard requirement that each group's nuisance estimators converge at o_p(n_g^{-1/4}) where n_g denotes the group-specific sample size; the product of the two nuisance errors is then o_p(n^{-1/2}). In the revision we will add a dedicated remark (or subsection) that states the precise condition n_min^{-1/4} n^{-1/4} = o(n^{-1/2}) (with n_min the minority-group size) together with a brief discussion of how this can be verified in practice when baseline models are estimated separately per group. This addition directly addresses the referee's concern without altering the core claims. revision: yes
-
Referee: [Abstract] Abstract (simulation and application paragraphs): the reported superior finite-sample performance and eICU findings presuppose that the orthogonality correction remains valid in the smaller group; without theoretical verification of the rate condition under imbalance, these empirical claims cannot be taken as confirmation of the method's robustness.
Authors: The simulations already include imbalance ratios up to 10:1 and the eICU analysis uses the observed ethnic-group sizes; both therefore operate under the same finite-sample regime the referee highlights. Nevertheless, we accept that the link between these experiments and the rate condition is not stated explicitly. In the revision we will (i) add a sentence in the simulation section referencing the new imbalance-adjusted bound and (ii) include one additional simulation panel that reports nuisance estimation rates alongside the heterogeneity estimator's performance for the minority group. These changes will make the empirical results a direct verification of the updated theory rather than an implicit one. revision: yes
Circularity Check
No circularity; derivation relies on external Neyman orthogonality
full rationale
The paper's central claims of consistency and asymptotic normality for the proposed estimator rest on Neyman orthogonality as an external technique that cancels first-order bias from nuisance estimation. The abstract and description invoke standard conditions on nuisance rates without defining the target parameter in terms of fitted quantities or reducing the result to a self-citation chain. No self-definitional steps, fitted-input predictions, or load-bearing self-citations are present. The derivation is self-contained against external benchmarks for orthogonal estimation.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Standard regularity conditions hold that allow the Neyman-orthogonal estimator to be consistent and asymptotically normal when nuisance estimators converge at appropriate rates.
Reference graph
Works this paper leans on
-
[1]
and Johnson, Alistair E
Pollard, Tom J. and Johnson, Alistair E. W. and Raffa, Justin D. , title =. Scientific Data , volume =. 2018 , doi =
2018
-
[2]
Current Opinion in Anesthesiology , volume=
Telemedicine in the ICU: clinical outcomes, economic aspects, and trainee education , author=. Current Opinion in Anesthesiology , volume=. 2019 , publisher=
2019
-
[3]
Critical care medicine , volume=
Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today’s critically ill patients , author=. Critical care medicine , volume=. 2006 , publisher=
2006
-
[4]
NPJ digital medicine , volume=
Developing well-calibrated illness severity scores for decision support in the critically ill , author=. NPJ digital medicine , volume=. 2019 , publisher=
2019
-
[5]
International Journal of General Medicine , pages=
Improving mortality risk prediction with routine clinical data: a practical machine learning model based on eICU patients , author=. International Journal of General Medicine , pages=. 2023 , publisher=
2023
-
[6]
Journal of Clinical Gastroenterology , volume=
Prediction of in-hospital mortality of intensive care unit patients with acute pancreatitis based on an explainable machine learning algorithm , author=. Journal of Clinical Gastroenterology , volume=. 2024 , publisher=
2024
-
[7]
BMJ Open , volume=
Early prediction of in-hospital mortality in patients with congestive heart failure in intensive care unit: a retrospective observational cohort study , author=. BMJ Open , volume=. 2022 , publisher=
2022
-
[8]
BMC Cardiovascular Disorders , volume=
Comparison of machine learning and nomogram to predict 30-day in-hospital mortality in patients with acute myocardial infarction combined with cardiogenic shock: A retrospective study based on the eICU-CRD and MIMIC-IV databases , author=. BMC Cardiovascular Disorders , volume=. 2025 , publisher=
2025
-
[9]
Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=
Transfer learning for high-dimensional linear regression: Prediction, estimation and minimax optimality , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2022 , publisher=
2022
-
[10]
Scientific Reports , volume=
Identifying early-measured variables associated with APACHE IVa providing incorrect in-hospital mortality predictions for critical care patients , author=. Scientific Reports , volume=. 2021 , publisher=
2021
-
[11]
Biomedical informatics insights , volume=
Using transfer learning for improved mortality prediction in a data-scarce hospital setting , author=. Biomedical informatics insights , volume=. 2017 , publisher=
2017
-
[12]
Enhancing Inference for Small Cohorts via Transfer Learning and Weighted Integration of Multiple Datasets , author=. arXiv preprint arXiv:2505.07153 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[13]
arXiv preprint arXiv:2501.02128 , year=
Transfer Learning for Individualized Treatment Rules: Application to Sepsis Patients Data from eICU-CRD and MIMIC-III Databases , author=. arXiv preprint arXiv:2501.02128 , year=
-
[14]
Biometrika , volume=
A linear adjustment-based approach to posterior drift in transfer learning , author=. Biometrika , volume=. 2024 , publisher=
2024
-
[15]
Scientific data , volume=
MIMIC-III, a freely accessible critical care database , author=. Scientific data , volume=. 2016 , publisher=
2016
-
[16]
Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining , pages=
Xgboost: A scalable tree boosting system , author=. Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining , pages=
-
[17]
PLoS medicine , volume=
UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age , author=. PLoS medicine , volume=. 2015 , publisher=
2015
-
[18]
Advances in large margin classifiers , volume=
Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods , author=. Advances in large margin classifiers , volume=. 1999 , publisher=
1999
-
[19]
Statistics in medicine , volume=
The Integrated Calibration Index (ICI) and related metrics for quantifying the calibration of logistic regression models , author=. Statistics in medicine , volume=. 2019 , publisher=
2019
-
[20]
, title =
Steyerberg, Ewout W. , title =. 2019 , publisher =
2019
-
[21]
Common functional principal components
Common functional principal components , author=. arXiv preprint arXiv:0901.4252 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[22]
The Econometrics Journal , volume=
Double/debiased machine learning for treatment and structural parameters , author=. The Econometrics Journal , volume=. 2018 , publisher=
2018
-
[23]
Critical care medicine , volume=
Racial disparities in ICU outcomes: a systematic review , author=. Critical care medicine , volume=. 2022 , publisher=
2022
-
[24]
Pediatric Critical Care Medicine , volume=
Racial/ethnic minority children with cancer experience higher mortality on admission to the ICU in the United States , author=. Pediatric Critical Care Medicine , volume=. 2020 , publisher=
2020
-
[25]
Intensive Care Medicine , volume=
Equity in patient care in the intensive care unit , author=. Intensive Care Medicine , volume=. 2024 , publisher=
2024
-
[26]
JAMA psychiatry , volume=
Evaluation of increases in drug overdose mortality rates in the US by race and ethnicity before and during the COVID-19 pandemic , author=. JAMA psychiatry , volume=. 2022 , publisher=
2022
-
[27]
Journal of the American Heart Association , volume=
Survival differences in Asian and Hispanic patients with in-hospital cardiac arrest , author=. Journal of the American Heart Association , volume=
-
[28]
Heart , volume=
Ethnic differences in sudden cardiac arrest resuscitation , author=. Heart , volume=. 2016 , publisher=
2016
-
[29]
Journal of The American Heart Association , volume=
Comparison of out-of-hospital cardiac arrest outcomes between Asian and White individuals in the United States , author=. Journal of The American Heart Association , volume=
-
[30]
Metabolism , volume=
Admissions for diabetic ketoacidosis in ethnic minority groups in a city hospital , author=. Metabolism , volume=. 2007 , publisher=
2007
-
[31]
Journal of general internal medicine , volume=
Language barriers, physician-patient language concordance, and glycemic control among insured Latinos with diabetes: the Diabetes Study of Northern California (DISTANCE) , author=. Journal of general internal medicine , volume=. 2011 , publisher=
2011
-
[32]
JAMA network open , volume=
Association of neighborhood resources and race and ethnicity with readmissions for diabetic ketoacidosis at US children’s hospitals , author=. JAMA network open , volume=. 2022 , publisher=
2022
-
[33]
BMJ open diabetes research & care , volume=
Association of socioeconomic status and DKA readmission in adults with type 1 diabetes: analysis of the US National Readmission Database , author=. BMJ open diabetes research & care , volume=. 2019 , publisher=
2019
-
[34]
American Economic Review , volume=
Double/debiased/neyman machine learning of treatment effects , author=. American Economic Review , volume=. 2017 , publisher=
2017
-
[35]
International Conference on Machine Learning , volume=
Orthogonal random forest for causal inference , author=. International Conference on Machine Learning , volume=. 2019 , organization=
2019
-
[36]
IEEE Transactions on Knowledge and Data Engineering , volume=
A Survey on Transfer Learning , author=. IEEE Transactions on Knowledge and Data Engineering , volume=
-
[37]
Proceedings of the IEEE , volume=
A comprehensive survey on transfer learning , author=. Proceedings of the IEEE , volume=. 2020 , publisher=
2020
-
[38]
Journal of Big data , volume=
A survey of transfer learning , author=. Journal of Big data , volume=. 2016 , publisher=
2016
-
[39]
Transfer learning in large-scale
Li, Sai and Cai, T Tony and Li, Hongzhe , journal=. Transfer learning in large-scale. 2023 , publisher=
2023
-
[40]
The Annals of Statistics , volume=
Transfer Learning for Nonparametric Classification , author=. The Annals of Statistics , volume=. 2021 , publisher=
2021
-
[41]
Journal of the American Statistical Association , volume=
Transfer learning under high-dimensional generalized linear models , author=. Journal of the American Statistical Association , volume=. 2023 , publisher=
2023
-
[42]
Accommodating time-varying heterogeneity in risk estimation under the
Li, Ziyi and Shen, Yu and Ning, Jing , journal=. Accommodating time-varying heterogeneity in risk estimation under the. 2023 , publisher=
2023
-
[43]
Journal of the American Statistical Association , year =
Chen, Xi , title =. Journal of the American Statistical Association , year =. doi:10.1080/01621459.2023.2210336 , url =
-
[44]
Li, Shulei and Cai, T. Tony and Li, Hongzhe , title =. Journal of the American Statistical Association , year =. doi:10.1080/01621459.2022.2044333 , url =
-
[45]
The Annals of Applied Statistics , volume=
Targeting underrepresented populations in precision medicine: A federated transfer learning approach , author=. The Annals of Applied Statistics , volume=. 2023 , publisher=
2023
-
[46]
Journal of the American Statistical Association , volume=
Constrained maximum likelihood estimation for model calibration using summary-level information from external big data sources , author=. Journal of the American Statistical Association , volume=. 2016 , publisher=
2016
-
[47]
European Journal of Epidemiology , volume=
Extending Inferences From a Randomized Trial to a Target Population , author=. European Journal of Epidemiology , volume=
-
[48]
2019 , school=
Neural transfer learning for natural language processing , author=. 2019 , school=
2019
-
[49]
Advances in Neural Information Processing Systems , volume=
How transferable are features in deep neural networks? , author=. Advances in Neural Information Processing Systems , volume=
-
[50]
Clinical cardiology , volume=
Regional variation across the United States in management and outcomes of ST-elevation myocardial infarction: analysis of the 2003 to 2010 nationwide inpatient sample database , author=. Clinical cardiology , volume=. 2014 , publisher=
2003
-
[51]
PLOS digital health , volume=
Generalizability challenges of mortality risk prediction models: A retrospective analysis on a multi-center database , author=. PLOS digital health , volume=. 2022 , publisher=
2022
-
[52]
Scientific reports , volume=
Evaluation of domain generalization and adaptation on improving model robustness to temporal dataset shift in clinical medicine , author=. Scientific reports , volume=. 2022 , publisher=
2022
-
[53]
arXiv preprint arXiv:2507.21783 , year=
Domain Generalization and Adaptation in Intensive Care with Anchor Regression , author=. arXiv preprint arXiv:2507.21783 , year=
-
[54]
Annals of internal medicine , volume=
Assessing the generalizability of prognostic information , author=. Annals of internal medicine , volume=. 1999 , publisher=
1999
-
[55]
Jama , volume=
Geographic distribution of US cohorts used to train deep learning algorithms , author=. Jama , volume=. 2020 , publisher=
2020
-
[56]
Applied clinical informatics , volume=
Systematic review of approaches to preserve machine learning performance in the presence of temporal dataset shift in clinical medicine , author=. Applied clinical informatics , volume=. 2021 , publisher=
2021
-
[57]
NPJ digital medicine , volume=
Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening , author=. NPJ digital medicine , volume=. 2022 , publisher=
2022
-
[58]
Algorithmic Learning Theory , pages=
A generalized neyman-pearson criterion for optimal domain adaptation , author=. Algorithmic Learning Theory , pages=. 2019 , organization=
2019
-
[59]
arXiv preprint arXiv:1903.09734 , year=
Regularized learning for domain adaptation under label shifts , author=. arXiv preprint arXiv:1903.09734 , year=
-
[60]
The Annals of Statistics , volume=
Marginal singularity and the benefits of labels in covariate-shift , author=. The Annals of Statistics , volume=. 2021 , publisher=
2021
-
[61]
Advances in neural information processing systems , volume=
Correcting sample selection bias by unlabeled data , author=. Advances in neural information processing systems , volume=
-
[62]
Advances in neural information processing systems , volume=
Co-regularization based semi-supervised domain adaptation , author=. Advances in neural information processing systems , volume=
-
[63]
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
Spottune: transfer learning through adaptive fine-tuning , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
-
[64]
Seminars in Respiratory and Critical Care Medicine , volume=
Protocol-based care versus individualized management of patients in the intensive care unit , author=. Seminars in Respiratory and Critical Care Medicine , volume=. 2015 , organization=
2015
-
[65]
Econometrica: Journal of the Econometric Society , volume=
The asymptotic variance of semiparametric estimators , author=. Econometrica: Journal of the Econometric Society , volume=. 1994 , publisher=
1994
-
[66]
The Annals of Statistics , volume=
On asymptotically optimal confidence regions and tests for high-dimensional models , author=. The Annals of Statistics , volume=
-
[67]
External validation, model updating, and impact assessment , author=
Risk prediction models: II. External validation, model updating, and impact assessment , author=. Heart , volume=. 2012 , publisher=
2012
-
[68]
Scientific data , volume=
MIMIC-IV, a freely accessible electronic health record dataset , author=. Scientific data , volume=. 2023 , publisher=
2023
-
[69]
Critical care medicine , volume=
A simplified acute physiology score for ICU patients , author=. Critical care medicine , volume=. 1984 , publisher=
1984
-
[70]
Biochemia medica , volume=
Comparing groups for statistical differences: how to choose the right statistical test? , author=. Biochemia medica , volume=. 2010 , publisher=
2010
-
[71]
Frontiers in psychology , volume=
A cautionary note on the use of the Analysis of Covariance (ANCOVA) in classification designs with and without within-subject factors , author=. Frontiers in psychology , volume=. 2015 , publisher=
2015
-
[72]
, author=
Categorical data analysis. , author=. 2013 , journal=
2013
-
[73]
Nursing research , volume=
Multinomial logistic regression , author=. Nursing research , volume=. 2002 , publisher=
2002
-
[74]
Biometrika , volume=
On the existence of maximum likelihood estimates in logistic regression models , author=. Biometrika , volume=. 1984 , publisher=
1984
-
[75]
BMC Medical Research Methodology , volume=
Predictive approaches to heterogeneous treatment effects: a scoping review , author=. BMC Medical Research Methodology , volume=. 2020 , publisher=
2020
-
[76]
Journal of the American Statistical Association , volume=
Semiparametric efficiency in multivariate regression models with missing data , author=. Journal of the American Statistical Association , volume=. 1995 , publisher=
1995
-
[77]
The Annals of Statistics , volume=
Simultaneous analysis of Lasso and Dantzig selector , author=. The Annals of Statistics , volume=
-
[78]
The Annals of Statistics , year=
On asymptotically optimal confidence regions and tests for high-dimensional models , author=. The Annals of Statistics , year=
-
[79]
The Annals of Mathematical Statistics , volume=
On stochastic limit and order relationships , author=. The Annals of Mathematical Statistics , volume=. 1943 , publisher=
1943
-
[80]
The Annals of Statistics , volume=
On the strong universal consistency of nearest neighbor regression function estimates , author=. The Annals of Statistics , volume=. 1994 , publisher=
1994
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.