Quantifying Individual Risk for Binary Outcomes
Pith reviewed 2026-05-24 03:18 UTC · model grok-4.3
The pith
Mild restrictions on the Pearson correlation between potential outcomes produce tighter bounds on the fraction of individuals harmed by treatment.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Restricting the Pearson correlation coefficient between the potential outcomes to a mild, plausible range yields bounds on the fraction negatively affected (FNA) that are tighter than the Fréchet-Hoeffding bounds. Even when the conditional average treatment effect (CATE) is positive, the lower bound on FNA can be positive. The paper also establishes a nonparametric sensitivity analysis framework with the Pearson correlation as the sensitivity parameter, and proposes consistent, asymptotically normal estimators for the refined bounds.
What carries the argument
Refined bounds on the fraction negatively affected that use a restricted but plausible range for the Pearson correlation coefficient between the two potential outcomes as a sensitivity parameter.
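As a worked illustration of why the correlation can serve as a sensitivity parameter, the following marginal-level identity for binary potential outcomes may help. It is a sketch under stated assumptions: Y = 1 is taken to be the favorable outcome, conditioning on covariates is suppressed, and the symbols p_1, p_0, \sigma, \rho_L and \rho_U are introduced here for illustration. The paper's own bound expressions may be stated differently.

```latex
\begin{align*}
p_1 &= P(Y(1)=1), \qquad p_0 = P(Y(0)=1), \qquad
\sigma = \sqrt{p_1(1-p_1)\,p_0(1-p_0)}, \\
\rho &= \frac{P(Y(1)=1,\,Y(0)=1) - p_1 p_0}{\sigma}, \\
\mathrm{FNA} &= P(Y(1)=0,\,Y(0)=1) = p_0 - P(Y(1)=1,\,Y(0)=1)
             = p_0(1-p_1) - \rho\,\sigma, \\
\max(0,\; p_0 - p_1) &\le \mathrm{FNA} \le \min(p_0,\; 1-p_1)
  \qquad \text{(Fr\'echet--Hoeffding)}, \\
\rho \in [\rho_L, \rho_U] &\;\Longrightarrow\;
\max\{0,\; p_0 - p_1,\; p_0(1-p_1) - \rho_U\sigma\}
\;\le\; \mathrm{FNA} \;\le\;
\min\{p_0,\; 1-p_1,\; p_0(1-p_1) - \rho_L\sigma\}.
\end{align*}
```

Because FNA is linear and decreasing in \rho for fixed marginals, the upper end of the correlation range drives the lower bound on FNA, and the lower end drives the upper bound.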
If this is right
- Even when the conditional average treatment effect is positive, the lower bound on the fraction negatively affected can remain strictly positive.
- The Pearson correlation coefficient between potential outcomes functions as a tunable sensitivity parameter for bounding individual-level risk.
- Nonparametric estimators of the refined bounds are consistent and asymptotically normal.
- The framework applies directly to observational data under the strong ignorability assumption.
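The first two points can be illustrated with a minimal Python sketch at the marginal (population) level. It assumes Y = 1 is the favorable outcome, so that FNA = P(Y(1) = 0, Y(0) = 1), and it relies on the binary-outcome identity FNA = p0(1 - p1) - rho * sqrt(p1(1 - p1) p0(1 - p0)). The function name `fna_bounds` and its arguments are hypothetical, and this is not the paper's covariate-conditional estimator.

```python
import math


def fna_bounds(p1, p0, rho_lo=-1.0, rho_hi=1.0):
    """Bounds on FNA = P(Y(1)=0, Y(0)=1) for binary potential outcomes.

    p1, p0: marginal probabilities P(Y(1)=1) and P(Y(0)=1) (assumed known here).
    rho_lo, rho_hi: assumed range for corr(Y(1), Y(0)), the sensitivity parameter.
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p0 < 1.0):
        raise ValueError("p1 and p0 must lie strictly between 0 and 1")
    if rho_lo > rho_hi:
        raise ValueError("rho_lo must not exceed rho_hi")

    # Frechet-Hoeffding bounds, which use the marginals only.
    fh_lower = max(0.0, p0 - p1)
    fh_upper = min(p0, 1.0 - p1)

    # FNA is decreasing in rho, so rho_hi gives the lower bound.
    sigma = math.sqrt(p1 * (1.0 - p1) * p0 * (1.0 - p0))
    centre = p0 * (1.0 - p1)
    lower = max(fh_lower, centre - rho_hi * sigma)
    upper = min(fh_upper, centre - rho_lo * sigma)

    if lower > upper + 1e-12:
        raise ValueError("correlation range is incompatible with the marginals")
    return lower, upper


if __name__ == "__main__":
    # Positive average effect (p1 > p0), sweeping the assumed upper limit on rho.
    for rho_hi in (1.0, 0.75, 0.5, 0.25):
        lo, hi = fna_bounds(0.6, 0.5, rho_lo=0.0, rho_hi=rho_hi)
        print(f"rho in [0, {rho_hi}]: FNA in [{lo:.4f}, {hi:.4f}]")
```

With p1 = 0.6 and p0 = 0.5 the average effect is positive and the Fréchet-Hoeffding lower bound is 0, yet capping the correlation at 0.5 pushes the lower bound on FNA above zero. Leaving the range at its default of [-1, 1] returns the Fréchet-Hoeffding bounds unchanged.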
Where Pith is reading between the lines
- Policies that select individuals for treatment solely on the basis of positive CATE may still expose a non-zero fraction of those individuals to harm.
- Domain experts could supply the plausible correlation interval from prior studies or mechanistic knowledge rather than data.
- The same correlation-based tightening could be explored for continuous outcomes or for other individual-level causal functionals.
- Direct comparison against realized individual effects in settings where both outcomes are observed would provide an empirical check on the width of the resulting intervals.
Load-bearing premise
The Pearson correlation between the potential outcomes under treatment and control is restricted to a plausible interval rather than allowed to range over the full [-1,1] interval.
What would settle it
In a randomized experiment where both potential outcomes can be observed for the same units, compute the realized FNA and check whether it lies inside the proposed bounds for every correlation value inside the assumed interval.
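A check of this kind can be rehearsed on synthetic data, where both potential outcomes are visible by construction. The sketch below is a hedged illustration, not the paper's simulation design: the marginals are treated as known, Y = 1 is assumed to be the favorable outcome, covariates are ignored, and all function names are hypothetical.

```python
import math
import random


def joint_cells(p1, p0, rho):
    """Cell probabilities of (Y(1), Y(0)) implied by the marginals and rho."""
    sigma = math.sqrt(p1 * (1 - p1) * p0 * (1 - p0))
    p11 = p1 * p0 + rho * sigma
    cells = {
        (1, 1): p11,
        (1, 0): p1 - p11,
        (0, 1): p0 - p11,          # harmed: good under control, bad under treatment
        (0, 0): 1 - p1 - p0 + p11,
    }
    if min(cells.values()) < -1e-12:
        raise ValueError("rho is incompatible with the marginals")
    return {k: max(v, 0.0) for k, v in cells.items()}


def realized_fna(p1, p0, rho, n, seed):
    """Draw n units with both potential outcomes and return the harmed share."""
    rng = random.Random(seed)
    cells = joint_cells(p1, p0, rho)
    draws = rng.choices(list(cells), weights=list(cells.values()), k=n)
    return sum(1 for pair in draws if pair == (0, 1)) / n


def bounds(p1, p0, rho_lo, rho_hi):
    """Marginal-level FNA bounds for corr(Y(1), Y(0)) in [rho_lo, rho_hi]."""
    sigma = math.sqrt(p1 * (1 - p1) * p0 * (1 - p0))
    centre = p0 * (1 - p1)
    lower = max(0.0, p0 - p1, centre - rho_hi * sigma)
    upper = min(p0, 1 - p1, centre - rho_lo * sigma)
    return lower, upper


def check(p1, p0, rho_lo, rho_hi, rho_grid, n=100_000):
    """True if the realized FNA falls inside the bounds for every rho in the grid."""
    lower, upper = bounds(p1, p0, rho_lo, rho_hi)
    return all(
        lower <= realized_fna(p1, p0, rho, n, seed=i) <= upper
        for i, rho in enumerate(rho_grid)
    )


if __name__ == "__main__":
    print(check(0.6, 0.5, rho_lo=0.0, rho_hi=0.5, rho_grid=[0.1, 0.25, 0.4]))
```

The grid is kept strictly inside the assumed interval because a correlation sitting exactly on the boundary puts the true FNA on the bound itself, where sampling noise alone can push the realized value outside.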
Original abstract
Understanding treatment effect heterogeneity is crucial for reliable decision-making in treatment evaluation and selection. The conditional average treatment effect (CATE) is widely used to capture treatment effect heterogeneity induced by observed covariates and to design individualized treatment policies. However, it is an average metric within subpopulations, which prevents it from revealing individual risk, potentially leading to misleading results. This article fills this gap by examining individual risk for binary outcomes, specifically focusing on the fraction negatively affected (FNA), a metric that quantifies the percentage of individuals experiencing worse outcomes under treatment compared with control. Even under the strong ignorability assumption, FNA is still unidentifiable, and the existing Fréchet-Hoeffding bounds are often too wide and attainable only under extreme data-generating processes. By invoking mild conditions on the value range of the Pearson correlation coefficient between potential outcomes, we obtain improved bounds compared with the Fréchet-Hoeffding bounds. We show that paradoxically, even with a positive CATE, the lower bound on FNA can be positive, i.e., in the best-case scenario, many individuals will be harmed if they receive treatment. Additionally, we establish a nonparametric sensitivity analysis framework for FNA using the Pearson correlation coefficient as the sensitivity parameter. Furthermore, we propose nonparametric estimators for the refined FNA bounds and prove their consistency and asymptotic normality. We use simulation to evaluate the performance of the proposed estimators and apply the method to a canonical observational study.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that even under strong ignorability, the fraction negatively affected (FNA) remains unidentifiable and that the Fréchet-Hoeffding bounds are often too wide. By restricting the Pearson correlation ρ between the potential outcomes Y(1) and Y(0) to a 'mild' range treated as a sensitivity parameter, the authors derive narrower bounds on FNA, establish that the lower bound on FNA can remain positive even when CATE > 0, develop a nonparametric sensitivity analysis framework with ρ as the parameter, and propose consistent, asymptotically normal nonparametric estimators for the refined bounds, which are evaluated in simulations and applied to an observational study.
Significance. If the posited range restrictions on ρ can be defended, the work supplies a concrete sensitivity-analysis tool for quantifying individual-level harm in binary-outcome settings that goes beyond CATE and highlights a practically relevant paradox. The nonparametric estimators together with the consistency and asymptotic-normality results constitute a clear methodological contribution.
major comments (2)
- [Abstract / sensitivity framework] The central claims of improved bounds and a positive lower bound on FNA despite positive CATE rest on restricting ρ = corr(Y(1),Y(0)) to an unspecified 'mild' interval. No data-driven procedure, empirical calibration, or argument for plausibility of the interval in observational settings is supplied; when ρ lies outside the interval (while still satisfying the marginal-probability constraints), the bounds revert to the Fréchet-Hoeffding width and the paradox disappears.
- [Estimator section] The stated consistency and asymptotic normality of the nonparametric estimators for the refined bounds presuppose that the chosen correlation range is fixed and known; the paper supplies neither the explicit influence-function derivation nor verification that the range restriction does not invalidate the regularity conditions used in the proofs.
minor comments (2)
- Notation for the correlation range and the resulting bound expressions should be introduced with explicit equations rather than descriptive text only.
- The simulation design should report the exact correlation values used to generate data and confirm they lie inside the posited 'mild' interval.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive report. We address each major comment below, indicating where we agree and plan revisions.
- Referee: [Abstract / sensitivity framework] The central claims of improved bounds and a positive lower bound on FNA despite positive CATE rest on restricting ρ = corr(Y(1),Y(0)) to an unspecified 'mild' interval. No data-driven procedure, empirical calibration, or argument for plausibility of the interval in observational settings is supplied; when ρ lies outside the interval (while still satisfying the marginal-probability constraints), the bounds revert to the Fréchet-Hoeffding width and the paradox disappears.
  Authors: We agree that the interval for ρ is a user-specified sensitivity parameter rather than data-driven. The framework's purpose is to let analysts assess how FNA bounds vary with different plausible ranges for the correlation between potential outcomes; a data-driven estimator for the interval would change the nature of the analysis. We will add a dedicated subsection providing practical guidance on selecting the range, drawing on domain knowledge and citing empirical literature on correlations between potential outcomes in observational studies. The fact that bounds revert outside the interval is expected and underscores why sensitivity analysis is needed.
  Revision: partial
- Referee: [Estimator section] The stated consistency and asymptotic normality of the nonparametric estimators for the refined bounds presuppose that the chosen correlation range is fixed and known; the paper supplies neither the explicit influence-function derivation nor verification that the range restriction does not invalidate the regularity conditions used in the proofs.
  Authors: The estimators target the bound functionals for a fixed ρ, after which the bounds are optimized over the interval; the range restriction is a fixed, deterministic constraint that preserves the regularity conditions. We acknowledge that the main text does not display the explicit influence functions. We will add these derivations to the appendix together with a verification that the maintained assumptions on the propensity score and conditional outcome regressions suffice for the asymptotic results to hold.
  Revision: yes
Circularity Check
No significant circularity; sensitivity analysis uses explicit parameter without reduction to tautology
The paper frames its contribution as a nonparametric sensitivity analysis for the fraction negatively affected (FNA), with the Pearson correlation coefficient between potential outcomes Y(1) and Y(0) introduced explicitly as the sensitivity parameter. Improved bounds relative to Fréchet-Hoeffding are obtained by restricting the feasible range of this parameter under 'mild conditions,' which is a direct mathematical consequence of the imposed restrictions rather than any self-definitional or fitted-input reduction. No load-bearing self-citations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled via citation are described. Estimators are proposed and their consistency/asymptotic normality proven separately. The central claims (refined bounds and the positive lower bound on FNA despite positive CATE) are therefore conditional on the sensitivity parameter and do not collapse to the paper's own inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- Pearson correlation range between potential outcomes
axioms (1)
- Domain assumption: strong ignorability (no unmeasured confounding)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "By invoking mild conditions on the value range of the Pearson correlation coefficient between potential outcomes, we obtain improved bounds compared with the Fréchet–Hoeffding bounds."
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: absolute_floor_iff_bare_distinguishability (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "We establish a nonparametric sensitivity analysis framework for FNA using the Pearson correlation coefficient as the sensitivity parameter."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- Trust Me, I'm a Doctor?
  Sharp bounds are derived on the proportion of physicians whose personal strategies perform at least as well as the trial's better average treatment, using nested randomized and observational data from the same population.
- Trust Me, I'm a Doctor?
  Using nested randomized and observational data, the paper derives sharp bounds on the proportion of physicians whose personal strategies perform at least as well as the trial's better-performing treatment.