Navigating the Landscape of Hierarchical Multi-Component Strategies: GPC, DOOR, and MOST

Frank E. Harrell Jr; Johan Verbeeck; Marc Buyse; Marc Vandemeulebroecke; Mickaël De Backer; Scott Evans; Toshimitsu Hamasaki; Vivian Lanius

arxiv: 2604.12662 · v1 · submitted 2026-04-14 · 📊 stat.ME


Mickaël De Backer, Johan Verbeeck, Vivian Lanius, Marc Vandemeulebroecke, Scott Evans, Toshimitsu Hamasaki, Marc Buyse, Frank E. Harrell Jr

Pith reviewed 2026-05-10 14:41 UTC · model grok-4.3

classification 📊 stat.ME

keywords: door, hierarchical, multi-component, patients, statistical, approaches, comprehensive, emerged


The pith

A comparative review of three hierarchical multi-component statistical methods used in drug development: GPC, DOOR, and MOST.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Drug development increasingly considers multiple outcomes that reflect patients' real experiences rather than single measures. This paper examines three approaches that handle these outcomes in a prioritized, hierarchical way. Generalized pairwise comparisons evaluate treatments by comparing pairs of patients across outcomes. Desirability of outcome ranking assigns desirability levels to possible outcome combinations. The Markov ordinal state transition model tracks how patients move between ordered health states over time. The authors explore similarities and differences in how these methods structure priorities and interpret treatment effects, aiming to guide users in selecting or combining them.
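The pairwise idea behind GPC can be sketched in a few lines. The sketch below is illustrative only and is not taken from the paper: the function names, the made-up patient data, and the threshold convention (a difference must exceed the threshold to count as a win or a loss) are all assumptions. Each patient is a tuple of outcomes listed from highest to lowest priority, with larger values assumed better.

```python
from itertools import product


def compare_pair(treated, control, thresholds):
    """Compare one treated patient with one control patient.

    Outcomes are walked in priority order. The first outcome whose
    difference exceeds its threshold decides the pair: +1 is a win for
    the treated patient, -1 a loss. If no outcome decides, the pair is
    a tie (0). Higher outcome values are assumed to be better.
    """
    for t, c, tau in zip(treated, control, thresholds):
        if t - c > tau:
            return 1
        if c - t > tau:
            return -1
    return 0


def net_benefit(treated_group, control_group, thresholds):
    """Proportion of wins minus proportion of losses over all pairs."""
    results = [
        compare_pair(t, c, thresholds)
        for t, c in product(treated_group, control_group)
    ]
    return (results.count(1) - results.count(-1)) / len(results)


# Hypothetical data: (survival in months, quality-of-life score).
treated = [(12, 60), (8, 70), (12, 50)]
control = [(12, 55), (6, 80), (10, 50)]

nb = net_benefit(treated, control, thresholds=(0, 0))
print(f"net benefit: {nb:.3f}")
```

A positive value means that treated patients win more pairs than they lose. Raising a threshold turns small differences on that outcome into ties, which hands the decision to the next outcome in the hierarchy.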

Core claim

This paper seeks to fill the gap left by the introduction of MOST without an explicit link to GPC or DOOR, by offering a comprehensive, comparative analysis of the three approaches. Through examples and an exploration of the structural and philosophical differences between the methods, the authors aim to provide guidance and encourage lines of research in the rapidly evolving landscape of hierarchical multi-component statistical methodologies.

Load-bearing premise

That the three methods are sufficiently distinct in structure and philosophy to yield actionable comparative insights without the analysis itself introducing selection bias or overlooking implementation details from the original papers.

Original abstract

There is a growing recognition of the importance of involving patients in every stage of drug development. This shift acknowledges that patients' perspectives, experiences, and preferences are essential for ensuring that treatments meet real-world needs. In this context, a new body of statistical literature has emerged, focusing not only on the simultaneous consideration of multiple outcomes that reflect patients' overall experiences, but also on their structured prioritization. We refer to this class of approaches as hierarchical multi-component statistical methods. Among these, two influential frameworks - generalized pairwise comparisons (GPC) and desirability of outcome ranking (DOOR) - have emerged in the last decade, each aiming to offer a comprehensive approach to evaluating treatment effects. A new methodology, referred to here as the Markov ordinal state transition model (MOST), has recently been introduced without focusing on an explicit link with GPC or DOOR. This paper seeks to fill this gap by offering a comprehensive and comparative analysis of the three approaches. Through examples and an exploration of the structural and philosophical differences between the methods, our aim is to provide guidance and encourage lines of research in the rapidly evolving landscape of hierarchical multi-component statistical methodologies.
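To make the "ordinal state transition" idea concrete, the following sketch propagates state occupancy probabilities through a fixed transition matrix. It is a simplified illustration, not the MOST model itself: the three states, the made-up transition matrices, and the function names are assumptions, and the matrices stand in for whatever statistical model would estimate them from trial data.

```python
def step(occupancy, transition):
    """Advance one time step: new[j] = sum_i occupancy[i] * transition[i][j]."""
    k = len(occupancy)
    return [
        sum(occupancy[i] * transition[i][j] for i in range(k))
        for j in range(k)
    ]


def state_occupancy(initial, transition, n_steps):
    """Return the occupancy distribution at times 0, 1, ..., n_steps."""
    path = [list(initial)]
    for _ in range(n_steps):
        path.append(step(path[-1], transition))
    return path


# Hypothetical ordered states, best to worst:
# 0 = at home, 1 = hospitalized, 2 = dead (absorbing).
P_CONTROL = [
    [0.90, 0.08, 0.02],
    [0.20, 0.70, 0.10],
    [0.00, 0.00, 1.00],
]
P_TREATED = [
    [0.93, 0.06, 0.01],
    [0.30, 0.64, 0.06],
    [0.00, 0.00, 1.00],
]

start = [0.0, 1.0, 0.0]  # every patient starts hospitalized
control_path = state_occupancy(start, P_CONTROL, n_steps=14)
treated_path = state_occupancy(start, P_TREATED, n_steps=14)

for day in (1, 7, 14):
    print(day, [round(p, 3) for p in control_path[day]],
          [round(p, 3) for p in treated_path[day]])
```

Comparing the two paths over time shows the kind of summary such a model supports, for example the probability of being at home on a given day in each arm.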

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript offers a comparative analysis of three hierarchical multi-component statistical methods for assessing treatment effects with prioritized multiple outcomes: generalized pairwise comparisons (GPC), desirability of outcome ranking (DOOR), and the Markov ordinal state transition model (MOST). It uses illustrative examples and discusses structural and philosophical differences among the approaches, with the goal of providing guidance to researchers and stimulating further work in patient-centered statistical methodologies.

Significance. If the comparisons prove accurate and balanced, the paper could help clarify relationships among these methods and support more informed choices in clinical trial analysis involving multi-outcome data. It usefully positions the newer MOST framework relative to the established GPC and DOOR approaches and underscores the shift toward incorporating patient priorities. The exploratory framing limits immediate impact, however, as the guidance remains qualitative rather than anchored in standardized benchmarks.

major comments (2)

Abstract and Introduction: The central claim that the analysis will 'provide guidance' rests on an exploration of structural and philosophical differences, yet no pre-specified objective criteria (such as formal equivalence mappings, standardized performance metrics on identical data-generating processes, or exhaustive edge-case enumeration) are defined to structure the comparison. This leaves the derived insights vulnerable to selection of examples and emphasis, directly affecting the actionability asserted in the abstract.
Section on examples and comparisons: Separate illustrative examples are presented for each method rather than side-by-side evaluations on shared data sets with identical hierarchical outcome structures. Without such controlled contrasts, differences in conclusions or sensitivity to prioritization choices cannot be quantified, weakening the ability to offer concrete recommendations on when one approach may be preferable.

minor comments (2)

Ensure consistent notation for outcome prioritization hierarchies across all three methods to facilitate direct comparison; current usage occasionally shifts between 'ranking' and 'transition' terminology without explicit cross-referencing.
Add a table summarizing key structural features (e.g., handling of ties, computational complexity, assumptions on outcome dependence) to improve clarity for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We have revised the manuscript to address the concerns about the basis for our guidance and the structure of the examples, while preserving the exploratory and conceptual focus of the work.

Point-by-point responses

Referee: Abstract and Introduction: The central claim that the analysis will 'provide guidance' rests on an exploration of structural and philosophical differences, yet no pre-specified objective criteria (such as formal equivalence mappings, standardized performance metrics on identical data-generating processes, or exhaustive edge-case enumeration) are defined to structure the comparison. This leaves the derived insights vulnerable to selection of examples and emphasis, directly affecting the actionability asserted in the abstract.

Authors: We agree that the manuscript does not define pre-specified quantitative criteria or conduct a formal benchmark study, as our goal is a qualitative exploration of structural and philosophical differences rather than a simulation-based comparison. This exploratory framing is stated explicitly in the abstract and introduction. To mitigate concerns about selection bias and actionability, we have revised the abstract to replace 'provide guidance' with 'offer insights into the relationships among these methods and stimulate further research,' and we have added a dedicated limitations paragraph in the Discussion that acknowledges the illustrative nature of the examples and recommends future work on standardized metrics and shared data-generating processes.

Revision: partial

Referee: Section on examples and comparisons: Separate illustrative examples are presented for each method rather than side-by-side evaluations on shared data sets with identical hierarchical outcome structures. Without such controlled contrasts, differences in conclusions or sensitivity to prioritization choices cannot be quantified, weakening the ability to offer concrete recommendations on when one approach may be preferable.

Authors: The referee is correct that the examples are presented separately. This choice was made to highlight the distinct features, assumptions, and typical use cases of each method in their original contexts without imposing an artificial common data structure that might obscure philosophical differences. However, we recognize that this limits direct quantification of differences. We have therefore added a new subsection titled 'Toward Controlled Comparisons' that outlines a framework for designing side-by-side evaluations on shared hierarchical outcome structures and includes a brief hypothetical numerical illustration demonstrating how prioritization choices can affect conclusions under each approach. This addition provides readers with a concrete starting point for such analyses while keeping the manuscript's primary focus on conceptual navigation.

Revision: partial
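How a prioritization choice can change a conclusion is easy to show on a toy data set. The sketch below is not the illustration from the manuscript; the data, the function names, and the win/loss scoring rule are assumptions chosen so that the effect is visible. The same six hypothetical patients are compared pairwise twice, once with survival ranked first and once with quality of life ranked first.

```python
from itertools import product


def pair_score(treated, control, priority):
    """Score one pair: +1 win, -1 loss, 0 tie for the treated patient.

    `priority` lists outcome indices from most to least important.
    The first outcome on which the two patients differ decides the pair.
    Higher outcome values are assumed to be better.
    """
    for k in priority:
        if treated[k] > control[k]:
            return 1
        if treated[k] < control[k]:
            return -1
    return 0


def net_benefit(treated_group, control_group, priority):
    """Mean pair score over all treated-control pairs."""
    scores = [
        pair_score(t, c, priority)
        for t, c in product(treated_group, control_group)
    ]
    return sum(scores) / len(scores)


# Hypothetical data: index 0 = survival (months), index 1 = quality of life.
treated = [(10, 80), (11, 70), (12, 55)]
control = [(12, 50), (12, 60), (9, 90)]

survival_first = net_benefit(treated, control, priority=(0, 1))
qol_first = net_benefit(treated, control, priority=(1, 0))

print(f"survival prioritized:        {survival_first:+.3f}")
print(f"quality of life prioritized: {qol_first:+.3f}")
```

On this made-up data the estimate changes sign when the hierarchy is reversed (-1/9 with survival first, +1/9 with quality of life first), which is the sensitivity a shared-data comparison would need to quantify.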

Circularity Check

0 steps flagged

No circularity in exploratory comparative analysis

Full rationale

The paper performs a qualitative comparison of three established or recently introduced hierarchical multi-component methods (GPC, DOOR, MOST) via examples and discussion of structural/philosophical differences. No mathematical derivation chain, first-principles predictions, fitted parameters, or uniqueness theorems are presented that could reduce to inputs by construction. References to prior frameworks serve as background rather than load-bearing self-citations for new results. The analysis is self-contained as a review without any of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This ledger was compiled from the abstract only. The paper is a review: it introduces no free parameters, axioms, or invented entities, and relies on prior definitions of GPC, DOOR, and MOST from the literature.

