Probabilistic Win Ratio Method For Hierarchical Composite Endpoints With Coarsened Outcomes

Jing Lei; Lei Li; Yuexiao Dong

arxiv: 2606.07762 · v2 · pith:RPPOZPBWnew · submitted 2026-06-05 · 📊 stat.ME · stat.OT


Pith reviewed 2026-06-27 20:46 UTC · model grok-4.3


keywords: win ratio · composite endpoints · coarsened data · censoring · missing outcomes · clinical trials · probabilistic methods


The pith

The probabilistic win ratio estimates the win ratio for hierarchical endpoints under coarsened observation by using conditional probabilities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the probabilistic win ratio (PWR) to estimate the classical win ratio when some outcomes in hierarchical composite endpoints are coarsened by censoring or missingness. Standard methods treat unresolved comparisons as ties, which can lose efficiency or bias results, especially for lower-priority endpoints. The PWR computes the probability of win, loss, or tie given the observed data for each pair, so that incomplete comparisons contribute only fractionally and with reduced weight according to uncertainty. Fully observed pairs contribute exactly as in the classical estimator, and the method preserves the priority structure. Simulations across various censoring scenarios show the estimator has low bias and mean squared error, with case studies confirming its behavior in real trial data.

Core claim

The PWR framework estimates the win ratio under coarsened observation by replacing deterministic decisions with conditional probabilities of win, loss, or tie given the observed data. Partially observed comparisons contribute fractionally while being penalized by their uncertainty, with greater coarsening leading to smaller effective weight. When all outcomes are fully observed, the PWR reduces exactly to the standard win ratio estimator. This approach maintains low bias and mean squared error in simulations under censoring and missingness, and performs well in clinical trial examples with near-complete and heavily censored data.
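For reference, the classical estimator that the PWR is said to reduce to can be sketched as follows. This is a minimal illustration, not the paper's code: the tuple-per-patient layout, the convention that larger values are better, the use of `None` for an unobserved tier, and the names `compare_pair` and `classical_win_ratio` are all assumptions made here. An unobserved tier is skipped, and a pair that no tier resolves is counted as a tie, which is the workaround the paper aims to improve on.

```python
from typing import Optional, Sequence

# One value per tier, highest priority first; None marks an unobserved tier.
# Assumption for this sketch: larger values are clinically better.
Outcome = Sequence[Optional[float]]


def compare_pair(treated: Outcome, control: Outcome) -> int:
    """Return +1 (treated wins), -1 (treated loses), or 0 (tie or unresolved)."""
    for t, c in zip(treated, control):
        if t is None or c is None:
            continue  # tier cannot be resolved; fall through to the next tier
        if t > c:
            return 1
        if t < c:
            return -1
        # equal on this tier: move on to the next, lower-priority tier
    return 0


def classical_win_ratio(treated_arm: Sequence[Outcome],
                        control_arm: Sequence[Outcome]) -> float:
    """Total wins divided by total losses over all treated-control pairs."""
    wins = losses = 0
    for t in treated_arm:
        for c in control_arm:
            result = compare_pair(t, c)
            wins += result == 1
            losses += result == -1
    if losses == 0:
        raise ValueError("no losses observed; the win ratio is undefined")
    return wins / losses


treated_arm = [(5.0, 1.0), (3.0, None)]
control_arm = [(4.0, 2.0), (3.0, 0.0)]
print(classical_win_ratio(treated_arm, control_arm))  # 2 wins, 1 loss, 1 tie
```

In the toy data, the pair `(3.0, None)` versus `(3.0, 0.0)` ties on the first tier and cannot be resolved on the second, so it contributes nothing to either count.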

What carries the argument

The probabilistic win ratio (PWR), which computes conditional probabilities of win, loss, or tie from coarsened data to weight pairwise comparisons fractionally.
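The fractional weighting can be illustrated with a short sketch. It assumes that each treated-control pair has already been assigned a triple of conditional probabilities (win, loss, tie); how the paper computes those triples, and the exact form of its uncertainty penalty, are not given in this summary, so the penalty is omitted here and the name `fractional_win_ratio` is hypothetical. The point of the sketch is only the aggregation step and the reduction to the classical ratio when every triple is 0/1.

```python
import math
from typing import Iterable, Tuple

# (p_win, p_loss, p_tie) for one treated-control pair; assumed to be supplied.
PairProbs = Tuple[float, float, float]


def fractional_win_ratio(pair_probs: Iterable[PairProbs]) -> float:
    """Sum fractional wins and losses over pairs and return their ratio."""
    total_win = 0.0
    total_loss = 0.0
    for p_win, p_loss, p_tie in pair_probs:
        if not math.isclose(p_win + p_loss + p_tie, 1.0, abs_tol=1e-9):
            raise ValueError("pair probabilities must sum to 1")
        total_win += p_win
        total_loss += p_loss
    if total_loss == 0:
        raise ValueError("no loss mass; the ratio is undefined")
    return total_win / total_loss


# Fully observed pairs carry 0/1 probabilities, so the sums are plain counts.
fully_observed = [(1, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
# Same pairs, but the last one is only partly observed and contributes fractionally.
partly_observed = [(1, 0, 0), (1, 0, 0), (0, 1, 0), (0.75, 0.0, 0.25)]

print(fractional_win_ratio(fully_observed))   # 2 wins / 1 loss
print(fractional_win_ratio(partly_observed))  # 2.75 wins / 1 loss
```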

Load-bearing premise

That the conditional probabilities of win, loss, or tie given the observed coarsened data can be calculated in a manner that does not introduce systematic bias.
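Why this premise matters can be seen from a one-line argument. The notation is introduced here for illustration and is not taken from the paper: $W_{ij}$ is the full-data indicator that treated patient $i$ wins against control patient $j$, $O_{ij}$ is the coarsened data observed for that pair, and $p^{\mathrm{win}}_{ij}$ is the conditional win probability assigned to the pair.

```latex
% Assumed notation: W_{ij} = full-data win indicator, O_{ij} = observed (coarsened) pair data.
\[
  p^{\mathrm{win}}_{ij} = \Pr\left(W_{ij} = 1 \mid O_{ij}\right)
  \quad\Longrightarrow\quad
  \mathbb{E}\left[p^{\mathrm{win}}_{ij}\right]
  = \mathbb{E}\left[\mathbb{E}\left[W_{ij} \mid O_{ij}\right]\right]
  = \mathbb{E}\left[W_{ij}\right]
  = \Pr\left(W_{ij} = 1\right).
\]
```

By the law of total expectation, correctly specified conditional probabilities have the same mean as the full-data win indicators, and the same holds for losses. If the assigned probabilities differ systematically from the true conditionals, this equality fails and the summed fractional wins and losses are shifted accordingly.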

What would settle it

A dataset with known true win ratio where the PWR estimate shows substantial bias under a specific pattern of right censoring or missingness in lower-priority outcomes.

Figures

Figures reproduced from arXiv: 2606.07762 by Jing Lei, Lei Li, Yuexiao Dong.

**Figure 2.** Mean squared error comparison across missingness and dropout settings.

**Figure 3.** Comparison across win ratio methods under 50% missingness and 50% dropout.

Original abstract

The win ratio is increasingly used to analyze prioritized composite endpoints in clinical trials, but standard implementations rely on deterministic pairwise comparisons and can perform poorly in the presence of censoring and endpoint-specific missingness. In such settings, unresolved comparisons are often treated as ties, leading to loss of efficiency and potentially biased inference, particularly when lower-priority outcomes are incompletely observed. We propose the probabilistic win ratio (PWR), a framework for estimating the classical win ratio under coarsened observation. The PWR replaces deterministic pairwise decisions with conditional probabilities of win, loss, or tie given the observed data, allowing partially observed comparisons to contribute fractionally while being explicitly penalized according to their uncertainty. Comparisons with greater coarsening receive smaller effective weight, whereas fully observed comparisons contribute as in the classical analysis, preserving the clinical priority structure. When outcomes are fully observed, the PWR reduces exactly to the standard win ratio estimator. Simulation studies show that the PWR maintains low bias and mean squared error across a range of censoring and missingness scenarios. Two clinical trial case studies illustrate complementary data regimes, demonstrating calibration in near-complete data and stability under substantial right censoring.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a workable probabilistic fix for win-ratio analyses when some endpoint data are coarsened, and it reduces cleanly to the classical estimator under full observation.


The core idea is to replace hard win/loss/tie calls with conditional probabilities computed from the observed coarsened data. That lets partially observed pairs contribute a fractional amount rather than being forced into ties, which is the usual workaround. The reduction property when data are complete is stated explicitly and looks like a useful sanity check.

What stands out is that the approach keeps the original clinical priority ordering intact while down-weighting comparisons that carry more uncertainty. The simulations are described as showing low bias and MSE across censoring and missingness patterns, and the two case studies are meant to cover both near-complete and heavily censored regimes. If those results hold up in the full derivations, this is a practical increment for trials that use hierarchical composites.

The soft spot is the modeling step that produces the conditional probabilities. The abstract says they are computed from the observed data, but any systematic mismatch between the assumed coarsening process and reality could still tilt the estimator. The paper will need to show that the conditionals are either nonparametric or rest on assumptions that are checkable in practice. Without seeing the exact construction, it is hard to judge how sensitive the method is to that choice.

This is aimed at statisticians working on clinical-trial endpoints who already use or review win-ratio methods. It is not a wholesale replacement of the literature but a targeted adjustment for a common data problem. The work is coherent enough on its own terms to merit referee time; an editor should send it out rather than desk-reject.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes the probabilistic win ratio (PWR) for hierarchical composite endpoints under coarsened observations (censoring, endpoint-specific missingness). Deterministic pairwise win/loss/tie decisions are replaced by conditional probabilities computed from the observed data, so that partially observed pairs contribute fractionally with explicit uncertainty penalties; fully observed pairs retain full weight. The PWR is constructed to reduce exactly to the classical win ratio estimator when all outcomes are observed. Simulation studies across censoring and missingness regimes report low bias and MSE, and two clinical-trial case studies illustrate performance under near-complete and heavily censored data.

Significance. If the conditional probabilities are free of systematic bias under standard coarsening assumptions, the method offers a principled way to retain information from incomplete hierarchical comparisons without ad-hoc tie imputation. The exact reduction property and the reported simulation calibration provide concrete anchors. This could improve efficiency in trials using prioritized composites (e.g., cardiovascular or oncology endpoints) where lower-priority components are frequently censored or missing.

major comments (1)

[§2.3] §2.3 (definition of conditional probabilities): the claim that the PWR is unbiased under coarsening relies on the conditional win/loss/tie probabilities being correctly specified from the observed data alone; the manuscript should state explicitly whether this requires a parametric model for the coarsening mechanism or holds nonparametrically, and provide the explicit functional form used in the simulations.

minor comments (2)

[Simulation studies] Table 1 and simulation section: the range of censoring rates and missingness patterns examined should be listed explicitly so readers can judge coverage of realistic trial scenarios.
[Abstract and §2] Notation: the distinction between the classical win ratio W and the PWR estimator \tilde{W} is clear in the text but should be reinforced in the abstract and in the first display equation of §2.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and the recommendation for minor revision. We address the single major comment below.


Referee: [§2.3] §2.3 (definition of conditional probabilities): the claim that the PWR is unbiased under coarsening relies on the conditional win/loss/tie probabilities being correctly specified from the observed data alone; the manuscript should state explicitly whether this requires a parametric model for the coarsening mechanism or holds nonparametrically, and provide the explicit functional form used in the simulations.

Authors: We thank the referee for highlighting the need for greater explicitness on this point. The conditional probabilities are obtained nonparametrically from the observed data under the coarsening-at-random assumption; no parametric model for the coarsening mechanism is required. The explicit functional form is the conditional probability P(win/loss/tie | observed components) defined in §2.3, which is computed from the empirical joint distribution of the observed parts of the hierarchical endpoints (with appropriate nonparametric estimators such as Kaplan–Meier for censored time-to-event components). In the simulation studies the same empirical conditional probabilities were evaluated on the simulated observed data under the independent censoring and missingness mechanisms described in §4. We will revise §2.3 to state the nonparametric character and to display the functional form used in the simulations.

Revision: yes
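To make the Kaplan–Meier step concrete, here is a minimal sketch of how a censored time-to-event comparison could be resolved fractionally. It is one plausible reading of the rebuttal, in the spirit of the censoring extension in Peron et al. (2018), and is not confirmed as the construction in §2.3. The function names, the toy data, and the restriction to a single tier with the treated patient censored before the control patient's event are all assumptions of this example.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival just after it)] for one arm."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])  # 1 if the event was observed, 0 if censored
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            steps.append((t, surv))
        n_at_risk -= leaving
    return steps


def survival_at(steps, t):
    """Evaluate the right-continuous KM step function at time t."""
    s = 1.0
    for time, value in steps:
        if time > t:
            break
        s = value
    return s


def prob_survive_past(steps, censor_time, t):
    """P(T > t | T > censor_time) = S(t) / S(censor_time), for t >= censor_time."""
    s_c = survival_at(steps, censor_time)
    if s_c == 0:
        raise ValueError("survival estimate is zero at the censoring time")
    return survival_at(steps, t) / s_c


# Toy treated arm: events at 1, 3, 4; censored at 2 and 5.
steps = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])

# Treated patient censored at 2, control patient with an observed event at 3.5:
p_win = prob_survive_past(steps, censor_time=2, t=3.5)  # treated outlives control
p_loss = 1 - p_win  # continuous times assumed, so no tie mass on this tier
print(p_win, p_loss)
```

Here the classical analysis would call the pair unresolved, whereas the conditional probability S(3.5)/S(2) splits it into a fractional win and a fractional loss.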

Circularity Check

0 steps flagged

No significant circularity


The paper defines the probabilistic win ratio (PWR) explicitly as a generalization that replaces deterministic pairwise comparisons with conditional probabilities of win/loss/tie given coarsened data. It states that when outcomes are fully observed, PWR reduces exactly to the standard win ratio estimator, providing an external anchor rather than a self-referential definition. No equations are shown that fit parameters to the target quantity or define the estimator in terms of itself. Simulations validate bias/MSE properties but do not constitute the derivation. No self-citation chains, uniqueness theorems, or ansatz smuggling are invoked in the provided claims. The derivation is therefore self-contained against the classical win ratio benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on the ability to define and compute conditional win/loss/tie probabilities from coarsened observations; no free parameters or invented entities are mentioned in the abstract.

axioms (1)

Domain assumption: Conditional probabilities of win, loss, or tie can be computed from the observed (coarsened) data without introducing systematic bias.
Central modeling assumption required for the PWR to be unbiased.


Reference graph

Works this paper leans on

8 extracted references · 8 canonical work pages

[1] Bebu, I. and Lachin, J. M. (2016). Large-sample inference for a win ratio analysis of a composite outcome based on prioritized components. Biostatistics 17, 178–187. doi:10.1093/biostatistics/kxv032

[2] Buyse, M. and Peron, J. (2022). Generalized pairwise comparisons for prioritized outcomes. In S. Piantadosi and C. L. Meinert, editors, Principles and Practice of Clinical Trials, chapter 95, 1869–1893. Springer. doi:10.1007/978-3-319-52636-2_277

[3] Dong, G., Mao, L., Huang, B., Gamalo-Siebers, M., Wang, J., Yu, G., and Hoaglin, D. C. (2020). The inverse-probability-of-censoring weighting (IPCW) adjusted win ratio statistic: an unbiased estimator in the presence of independent censoring. Journal of Biopharmaceutical Statistics 30, 882–899. doi:10.1080/10543406.2020.1757692

[4] Gasparyan, S. B., Folkvaljon, F., Bengtsson, O., Buenconsejo, J., and Koch, G. G. (2021). Adjusted win ratio with stratification: calculation methods and interpretation. Statistical Methods in Medical Research 30, 580–611. doi:10.1177/0962280220942558

[5] Lehmann, E. L. (1963). Robust estimation in analysis of variance. The Annals of Mathematical Statistics 34, 957–966. doi:10.1214/aoms/1177704018

[6] Mao, L. and Wang, T. (2021). A class of proportional win-fractions regression models for composite outcomes. Biometrics 77, 1265–1275. doi:10.1111/biom.13382

[7] Peron, J., Buyse, M., Ozenne, B., Roche, L., and Roy, P. (2018). An extension of generalized pairwise comparisons for prioritized outcomes in the presence of censoring. Statistical Methods in Medical Research 27, 1230–1239. doi:10.1177/0962280216658320

[8] Pocock, S. J., Ariti, C. A., Collier, T. J., and Wang, D. (2012). The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal 33, 176–182. doi:10.1093/eurheartj/ehr352
