On Cluster Randomized Trials with the Desirability of Outcome Ranking (DOOR) Endpoints
Pith reviewed 2026-05-08 02:12 UTC · model grok-4.3
The pith
New methods extend the Desirability of Outcome Ranking framework to cluster randomized trials using U-statistics and influence functions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a suite of new methods to extend DOOR to cluster trials based on properties of U-statistics and influence functions to estimate within-cluster and between-cluster treatment effects. These approaches can be applied in different scenarios, including mixtures of clusters with two treatment groups and clusters with only one group, and both small and large numbers of clusters.
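One way to picture the within-cluster versus between-cluster distinction is to split the treatment-control pairs by whether both members come from the same cluster. The sketch below is a hypothetical illustration only, not the paper's estimators: the function names, the input layout (a list of per-cluster `(treated, control)` outcome lists), and the coding of larger values as more desirable are all assumptions.

```python
def pair_score(t, c):
    """1 if the treated outcome is more desirable, 0.5 for a tie, 0 otherwise."""
    if t > c:
        return 1.0
    if t == c:
        return 0.5
    return 0.0


def within_between_door(clusters):
    """Split pairwise comparisons into same-cluster and different-cluster pairs.

    clusters: list of (treated_outcomes, control_outcomes); either list may be
    empty, which represents a cluster that contains only one treatment group.
    Returns (within, between); an entry is None when no such pairs exist.
    """
    within_sum, within_n = 0.0, 0
    between_sum, between_n = 0.0, 0
    for i, (treated, _) in enumerate(clusters):
        for j, (_, controls) in enumerate(clusters):
            for t in treated:
                for c in controls:
                    if i == j:
                        within_sum += pair_score(t, c)
                        within_n += 1
                    else:
                        between_sum += pair_score(t, c)
                        between_n += 1
    within = within_sum / within_n if within_n else None
    between = between_sum / between_n if between_n else None
    return within, between


# Toy data (made up): one two-arm cluster and two single-arm clusters.
toy = [([3, 2], [1]), ([4], []), ([], [2, 3])]
print(within_between_door(toy))
```

Single-arm clusters contribute only to the between-cluster comparison here, which is one reason a mixture of cluster types needs separate handling.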
What carries the argument
U-statistics and influence functions adapted for clustered data to compute DOOR-based treatment comparisons.
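For orientation, the independent-observations baseline that DOOR builds on can be written as a two-sample U-statistic. The notation below ($X_i$ for treatment outcomes, $Y_j$ for control outcomes, sample sizes $m$ and $n$, kernel $h$) is assumed for illustration and is not taken from the paper; the clustered estimators will differ from this baseline.

```latex
\theta = P(X > Y) + \tfrac{1}{2} P(X = Y), \qquad
h(x, y) = \mathbf{1}\{x > y\} + \tfrac{1}{2}\,\mathbf{1}\{x = y\},

\hat{\theta} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} h(X_i, Y_j),

\hat{\theta} - \theta \approx
  \frac{1}{m} \sum_{i=1}^{m} \bigl[ h_1(X_i) - \theta \bigr]
  + \frac{1}{n} \sum_{j=1}^{n} \bigl[ h_2(Y_j) - \theta \bigr],
\qquad h_1(x) = E\,h(x, Y), \quad h_2(y) = E\,h(X, y).
```

The last line is the usual first-order projection, whose summands play the role of influence functions. It treats the summands as independent, which is exactly what intra-cluster correlation breaks.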
If this is right
- The methods enable DOOR analysis in cluster trials where individual randomization is not feasible.
- Simulations assess performance across different numbers of clusters and cluster sizes.
- The methods apply to real data, as illustrated by a cluster randomized crossover trial comparing delayed cord clamping and umbilical cord milking for newborns.
Where Pith is reading between the lines
- These estimators could support patient-centric analyses in public health trials that must use cluster designs.
- Extensions might examine performance under varying strengths of intra-cluster correlation.
Load-bearing premise
That the properties of U-statistics and influence functions extend directly to clustered data without bias from intra-cluster correlation or unbalanced cluster sizes.
What would settle it
A simulation or real dataset in which the new estimators show substantial bias under high intra-cluster correlation or highly variable cluster sizes would call the reliability of the methods into question.
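A check of this kind could start from a small simulation such as the sketch below. It does not implement the paper's estimators. It only shows, under assumed settings (a latent-normal model with a shared cluster effect, four ordinal categories, 8 clusters of 10 per arm, arbitrary cut points), how much the sampling variance of a pooled DOOR probability grows when outcomes within a cluster are correlated.

```python
import random
import statistics


def door_probability(treatment, control):
    """Pooled estimate of P(T > C) + 0.5 * P(T = C); larger = more desirable."""
    score = 0.0
    for t in treatment:
        for c in control:
            if t > c:
                score += 1.0
            elif t == c:
                score += 0.5
    return score / (len(treatment) * len(control))


def simulate_arm(rng, n_clusters, cluster_size, cluster_sd):
    """Ordinal outcomes in {1, 2, 3, 4} from a latent normal plus a cluster effect."""
    outcomes = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0.0, cluster_sd)  # shared within the cluster
        for _ in range(cluster_size):
            latent = cluster_effect + rng.gauss(0.0, 1.0)
            # Assumed cut points; they only define the ordinal categories.
            outcomes.append(1 + sum(latent > cut for cut in (-0.7, 0.0, 0.7)))
    return outcomes


def estimator_variance(cluster_sd, n_reps=200, seed=1):
    """Empirical variance of the pooled estimate under no treatment effect."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        treated = simulate_arm(rng, 8, 10, cluster_sd)
        control = simulate_arm(rng, 8, 10, cluster_sd)
        estimates.append(door_probability(treated, control))
    return statistics.variance(estimates)


var_independent = estimator_variance(cluster_sd=0.0)  # no cluster effect
var_clustered = estimator_variance(cluster_sd=1.0)    # strong cluster effect
print(f"variance without clustering: {var_independent:.5f}")
print(f"variance with clustering:    {var_clustered:.5f}")
```

A variance formula that ignores clustering would target the first number even when the data behave like the second, so any proposed estimator should be judged against the clustered case.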
Original abstract
Cluster randomized trials are widely used when individual randomization is logistically infeasible or when correlations between observations cannot be ignored, especially in fields such as ophthalmology, infectious disease, vaccine research, and sociology. The desirability of outcome ranking (DOOR) framework evaluates patient-centric benefit-risk using an ordinal outcome and a Wilcoxon-Mann-Whitney statistic-based approach to compare outcome distributions between interventions. We propose a suite of new methods to extend DOOR to cluster trials based on properties of U-statistics and influence functions to estimate within-cluster and between-cluster treatment effects. These approaches can be applied in different scenarios, including mixtures of clusters with two treatment groups and clusters with only one group, and both small and large numbers of clusters. Simulations demonstrate that the proposed methods perform well under various scenarios regarding the number of clusters and cluster sizes. As an illustration, we apply the proposed methods to a cluster randomized crossover trial comparing delayed cord clamping and umbilical cord milking for newborns.
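The Wilcoxon-Mann-Whitney quantity behind DOOR is often summarized as the probability that a randomly chosen treated participant has a more desirable outcome than a randomly chosen control participant, with ties counted as one half. A minimal sketch for independent observations follows; the function name and the toy data are made up, and outcomes are assumed to be coded so that larger values are more desirable (reverse the coding if rank 1 is best).

```python
def door_probability(treatment, control):
    """Estimate P(T more desirable than C) + 0.5 * P(tie) over all pairs."""
    if not treatment or not control:
        raise ValueError("both arms need at least one observation")
    score = 0.0
    for t in treatment:
        for c in control:
            if t > c:
                score += 1.0   # treated participant wins the comparison
            elif t == c:
                score += 0.5   # ties split evenly
    return score / (len(treatment) * len(control))


# Toy ordinal outcomes (made up), larger = more desirable.
treatment = [4, 3, 3, 2]
control = [3, 2, 2, 1]
print(door_probability(treatment, control))  # 0.8125
```

A value of 0.5 means neither arm is favored. This pooled version ignores clustering entirely, which is the gap the paper's methods are meant to address.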
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a suite of methods to extend the Desirability of Outcome Ranking (DOOR) framework to cluster randomized trials. It develops estimators for within-cluster and between-cluster treatment effects based on U-statistics and influence functions, applicable to mixtures of two-arm and single-arm clusters as well as small and large numbers of clusters. Performance is assessed via simulations across varying numbers of clusters and cluster sizes, with an illustration using data from a cluster randomized crossover trial on delayed cord clamping versus umbilical cord milking.
Significance. If the estimators prove consistent and the variance estimates robust, this would fill a practical gap by enabling patient-centric DOOR analyses in clustered designs common to infectious disease, ophthalmology, and social research. The grounding in U-statistic theory and the provision of both simulation results and a real-data example are strengths that could support broader adoption if the clustering adjustments are rigorously derived.
major comments (2)
- [Abstract] The claim that standard U-statistic and influence-function machinery extends directly to produce valid estimators across mixed cluster types and unbalanced sizes lacks any statement of the required cluster-level regularity conditions or the explicit form of the cluster-robust projection of the kernel. Standard U-statistic theory assumes i.i.d. observations; without this projection the variance estimators are at risk of inconsistency precisely in the regimes (unbalanced sizes, non-negligible ICC, small numbers of clusters) highlighted as target scenarios.
- [Abstract] The assertion that 'simulations demonstrate that the proposed methods perform well under various scenarios' supplies no details on simulation design (range of ICC values, cluster-size distributions, performance metrics such as bias or coverage), error-bar reporting, or explicit checks for bias induced by intra-cluster correlation. This information is load-bearing for evaluating whether the central methodological claim holds.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review of our manuscript. The comments highlight opportunities to strengthen the abstract's clarity on theoretical foundations and empirical support. We address each point below and will revise the manuscript accordingly.
- Referee: [Abstract] The claim that standard U-statistic and influence-function machinery extends directly to produce valid estimators across mixed cluster types and unbalanced sizes lacks any statement of the required cluster-level regularity conditions or the explicit form of the cluster-robust projection of the kernel. Standard U-statistic theory assumes i.i.d. observations; without this projection the variance estimators are at risk of inconsistency precisely in the regimes (unbalanced sizes, non-negligible ICC, small numbers of clusters) highlighted as target scenarios.
  Authors: We appreciate the referee drawing attention to this. The full manuscript defines the estimators via cluster-level U-statistics whose kernels are projected onto the cluster to induce the appropriate dependence structure, yielding cluster-robust influence functions that remain consistent under intra-cluster correlation, unbalanced cluster sizes, and mixtures of two-arm and single-arm clusters. Regularity conditions (finite moments of the outcome and suitable rates for the number of clusters) are stated in the theoretical development. We agree the abstract would be improved by a brief reference to the cluster-robust projection and the cluster-level regularity conditions; we will revise it accordingly.
  Revision: yes
- Referee: [Abstract] The assertion that 'simulations demonstrate that the proposed methods perform well under various scenarios' supplies no details on simulation design (range of ICC values, cluster-size distributions, performance metrics such as bias or coverage), error-bar reporting, or explicit checks for bias induced by intra-cluster correlation. This information is load-bearing for evaluating whether the central methodological claim holds.
  Authors: We agree that the abstract is concise and omits key simulation details. The manuscript contains a dedicated simulation section that systematically varies the number of clusters, cluster sizes (including unbalanced distributions), and intra-cluster correlation, evaluating bias, empirical standard errors, coverage of confidence intervals, and type-I error rates. We will revise the abstract to include a short summary of the simulation design and the performance metrics examined, thereby better supporting the reported findings.
  Revision: yes
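To make the idea of a cluster-level variance concrete, the sketch below uses a generic delete-one-cluster jackknife, stratified by arm. This is a swapped-in, well-known resampling technique, not the cluster-robust influence-function estimator described in the rebuttal; the function names and the input layout (one list of outcomes per cluster, larger values more desirable) are assumptions.

```python
def door_probability(treatment, control):
    """Pooled estimate of P(T > C) + 0.5 * P(T = C); larger = more desirable."""
    score = 0.0
    for t in treatment:
        for c in control:
            if t > c:
                score += 1.0
            elif t == c:
                score += 0.5
    return score / (len(treatment) * len(control))


def pooled(clusters):
    """Flatten a list of per-cluster outcome lists into one list."""
    return [y for cluster in clusters for y in cluster]


def jackknife_variance(treat_clusters, control_clusters):
    """Delete-one-cluster jackknife variance, computed separately within each arm."""
    total = 0.0
    for arm in ("treatment", "control"):
        own = treat_clusters if arm == "treatment" else control_clusters
        g = len(own)
        if g < 2:
            raise ValueError("each arm needs at least two clusters")
        leave_one_out = []
        for k in range(g):
            kept = own[:k] + own[k + 1:]  # drop the whole k-th cluster
            if arm == "treatment":
                est = door_probability(pooled(kept), pooled(control_clusters))
            else:
                est = door_probability(pooled(treat_clusters), pooled(kept))
            leave_one_out.append(est)
        mean = sum(leave_one_out) / g
        total += (g - 1) / g * sum((e - mean) ** 2 for e in leave_one_out)
    return total


# Toy clusters (made up): three clusters per arm.
treated = [[4, 4], [1, 1], [3, 2]]
controls = [[2, 2], [3, 1], [2, 3]]
print(jackknife_variance(treated, controls))
```

Because whole clusters are deleted, within-cluster correlation is carried into the variance estimate. With only a handful of clusters, such estimates are themselves noisy, which is the small-sample regime the referee singles out.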
Circularity Check
No significant circularity; derivation extends established U-statistic theory without self-referential reduction
The paper derives new within-cluster and between-cluster DOOR estimators for cluster randomized trials by extending standard U-statistic and influence-function properties to clustered data structures. The central claims rest on applying these established tools to handle mixtures of two-arm and single-arm clusters, with no evidence that any key quantity is defined in terms of itself, that a fitted parameter is relabeled as a prediction, or that load-bearing steps reduce to self-citations whose validity depends on the present work. Simulations and the real-data illustration supply independent checks. The derivation chain therefore remains self-contained against external statistical theory.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: properties of U-statistics and influence functions extend to within-cluster and between-cluster treatment effect estimation for DOOR endpoints.