A Milestone-Based Framework for Characterizing Time-Varying Treatment Effects in Immunotherapy Trials
Pith reviewed 2026-05-08 02:20 UTC · model grok-4.3
The pith
A milestone-based framework characterizes time-varying treatment effects by separating long-term survival from short-term ordering in immunotherapy trials.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The milestone-based framework separates long-term survival beyond a clinically meaningful time point from earlier outcomes and provides a practical way to characterize patient heterogeneity in treatment response. The framework summarizes treatment differences through milestone survival probabilities and, among patients who do not reach the milestone, characterizes short-term treatment ordering over time using a tau-based summary that helps identify hazard reversal. It is illustrated using reconstructed individual-level data from three landmark phase III trials: CheckMate 067, CheckMate 227, and CLEAR.
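As a rough illustration of the first ingredient, the milestone survival probability S(t*) = P(T > t*) can be estimated in each arm with the Kaplan-Meier product-limit estimator, and the arms compared by the difference. This is a minimal sketch, not the paper's code: the function name `km_survival_at`, the toy follow-up data, and the 24-month milestone are all assumptions made for illustration.

```python
from typing import Sequence

def km_survival_at(times: Sequence[float], events: Sequence[int], t_star: float) -> float:
    """Kaplan-Meier estimate of P(T > t_star) under right-censoring.

    times  : follow-up time per patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    # Distinct observed event times at or before the milestone.
    event_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= t_star})
    surv = 1.0
    for u in event_times:
        at_risk = sum(1 for t in times if t >= u)  # still under observation just before u
        deaths = sum(1 for t, e in zip(times, events) if t == u and e == 1)
        surv *= 1.0 - deaths / at_risk
    return surv

# Hypothetical toy data (months); not taken from any of the trials.
treat_t = [3, 5, 8, 14, 20, 26, 30, 30]
treat_e = [1, 1, 0, 1, 0, 0, 0, 0]
ctrl_t = [4, 6, 7, 9, 12, 15, 22, 30]
ctrl_e = [1, 1, 1, 1, 1, 0, 1, 0]

t_star = 24  # assumed milestone
s_treat = km_survival_at(treat_t, treat_e, t_star)
s_ctrl = km_survival_at(ctrl_t, ctrl_e, t_star)
milestone_diff = s_treat - s_ctrl
print(f"S_treat({t_star}) = {s_treat:.4f}, S_ctrl({t_star}) = {s_ctrl:.4f}, diff = {milestone_diff:.4f}")
```

A positive `milestone_diff` corresponds to a higher estimated proportion of long-term survivors in the treatment arm at the chosen milestone.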
What carries the argument
Milestone survival probabilities combined with a tau-based summary for short-term treatment ordering in the pre-milestone population.
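This page does not give the formula for the tau-based summary; the rebuttal further down describes it only as a rank-based measure, analogous to Kendall's tau, computed within the pre-milestone group. One hypothetical reading is sketched here under the strong simplifying assumption that all event times are fully observed (no censoring). Every treated-versus-control pair is scored by which patient survives longer, with times truncated at a moving horizon. The function `truncated_tau`, the horizons, and the toy data are invented for illustration and should not be read as the authors' definition.

```python
from typing import Dict, List, Sequence

def truncated_tau(trt: Sequence[float], ctl: Sequence[float], horizon: float) -> float:
    """Pairwise ordering score in [-1, 1] using event times truncated at `horizon`.

    Positive values: treated patients tend to outlast controls up to the horizon.
    Negative values: controls tend to outlast treated patients.
    Assumes uncensored event times (an illustrative simplification).
    """
    score = 0
    for a in trt:
        for b in ctl:
            x, y = min(a, horizon), min(b, horizon)
            score += (x > y) - (x < y)  # +1, -1, or 0 for a tie
    return score / (len(trt) * len(ctl))

t_star = 24.0  # assumed milestone (months)

# Hypothetical event times; patients beyond t_star are the long-term survivors.
trt_all: List[float] = [1, 2, 15, 18, 21, 23, 30, 36]
ctl_all: List[float] = [5, 6, 7, 8, 9, 10, 11, 28]

# Restrict to the pre-milestone population.
trt_pre = [t for t in trt_all if t <= t_star]
ctl_pre = [t for t in ctl_all if t <= t_star]

# Evaluate the ordering score at a few horizons up to the milestone.
tau_curve: Dict[float, float] = {
    h: truncated_tau(trt_pre, ctl_pre, h) for h in (4.0, 12.0, 24.0)
}
for h, tau in tau_curve.items():
    print(f"horizon {h:>4.0f} months: tau = {tau:+.3f}")
```

In this toy example the score is negative at the 4-month horizon and positive by 12 months, the kind of sign change that would flag an early disadvantage followed by later benefit. Handling censored observations would need an additional step that this sketch deliberately omits.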
Load-bearing premise
That a single clinically meaningful milestone time point can be chosen in advance and that the tau-based summary on the pre-milestone population adequately captures short-term ordering without further assumptions on the underlying hazard functions or censoring mechanisms.
What would settle it
A re-analysis of one of the example trials in which the framework indicates no hazard reversal even though the Kaplan-Meier curves visibly cross, or in which the conclusions change materially when the milestone time is varied.
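The second check, stability under a shifted milestone, is simple to prototype: recompute the between-arm difference in Kaplan-Meier survival over a small grid of candidate milestones and see whether the sign or magnitude moves. The sketch below assumes a hypothetical helper `km_at`, toy data, and a grid of 24 ± 3 months; none of these come from the paper.

```python
from typing import Dict, Sequence

def km_at(times: Sequence[float], events: Sequence[int], t_star: float) -> float:
    """Kaplan-Meier estimate of P(T > t_star); events: 1 = observed, 0 = censored."""
    surv = 1.0
    for u in sorted({t for t, e in zip(times, events) if e == 1 and t <= t_star}):
        at_risk = sum(1 for t in times if t >= u)
        deaths = sum(1 for t, e in zip(times, events) if e == 1 and t == u)
        surv *= 1.0 - deaths / at_risk
    return surv

# Hypothetical toy data (months), not taken from any trial.
treat_t = [3, 5, 8, 14, 20, 26, 30, 30]
treat_e = [1, 1, 0, 1, 0, 0, 0, 0]
ctrl_t = [4, 6, 7, 9, 12, 15, 22, 30]
ctrl_e = [1, 1, 1, 1, 1, 0, 1, 0]

# Sweep the milestone around an assumed nominal value of 24 months.
grid = (21, 24, 27)
diffs: Dict[int, float] = {
    t: km_at(treat_t, treat_e, t) - km_at(ctrl_t, ctrl_e, t) for t in grid
}
for t, d in diffs.items():
    print(f"t* = {t} months: milestone difference = {d:+.4f}")

# A crude stability flag: does the sign of the difference hold across the grid?
sign_stable = all(d > 0 for d in diffs.values()) or all(d < 0 for d in diffs.values())
```

Here the sign is stable but the magnitude nearly doubles between 21 and 24 months because a single late control-arm event falls inside the window, which shows how sparse late follow-up can make the milestone estimate sensitive to the choice of t*.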
Original abstract
Immune checkpoint inhibitor-based therapies often produce heterogeneous survival responses, including early risk, delayed treatment benefit, and durable long-term survival in a subset of patients. In these settings, conventional summary measures such as the hazard ratio may not adequately describe how treatment effects evolve over follow-up. We propose a milestone-based framework that separates long-term survival beyond a clinically meaningful time point from earlier outcomes and provides a practical way to characterize patient heterogeneity in treatment response. The framework summarizes treatment differences through milestone survival probabilities and, among patients who do not reach the milestone, characterizes short-term treatment ordering over time using a tau-based summary that helps identify hazard reversal. We illustrate the approach using reconstructed individual-level data from three landmark phase III trials: CheckMate 067, CheckMate 227, and CLEAR. Across these examples, the framework captures patterns that are difficult to summarize with conventional measures, including settings in which early disadvantage coexists with later durable benefit. It also helps clarify when treatment benefit begins to emerge and how short-term and long-term effects differ within the same trial. This approach provides a clinically interpretable and statistically principled way to evaluate heterogeneous and time-varying treatment effects in oncology trials with nonproportional hazards.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a milestone-based framework for characterizing time-varying treatment effects in immunotherapy trials. It separates long-term survival beyond a pre-specified clinically meaningful milestone time from short-term outcomes using milestone survival probabilities. For patients not reaching the milestone, it employs a tau-based summary to order treatments over time and identify potential hazard reversals. The framework is demonstrated on reconstructed individual-level data from three phase III trials: CheckMate 067, CheckMate 227, and CLEAR, highlighting patterns such as early risk with later durable benefit that are not captured by conventional hazard ratios.
Significance. If the technical details of the tau-based summary are rigorously justified and the method proves robust to censoring and choice of milestone, this framework could provide a valuable, interpretable alternative to standard summary measures for non-proportional hazards in oncology, aiding clinical interpretation of heterogeneous responses in immunotherapy.
major comments (4)
- [Methods] Methods section (description of tau-based summary): The definition, derivation, and statistical properties of the tau-based summary for short-term treatment ordering on the pre-milestone subpopulation are not provided; it is unclear whether this functional correctly orders treatments or identifies hazard reversal without additional assumptions on the underlying hazard processes or independent censoring.
- [Methods] Methods and Application sections (milestone time point): The framework relies on pre-specifying a single clinically meaningful milestone t* such that P(T > t*) cleanly separates long-term from short-term effects, but no justification, sensitivity analysis to modest shifts in t*, or discussion of information loss is given; this choice is load-bearing for the central separation claim.
- [Results] Results section: No simulation studies are included to evaluate the framework's performance under controlled scenarios with known time-varying hazards, varying censoring rates, or competing risks, leaving the reliability of the milestone survival probabilities and tau-summary unverified despite the use of reconstructed KM data from the three trials.
- [Application] Application section (data handling): The analysis uses reconstructed individual-level data from published Kaplan-Meier curves in CheckMate 067, 227, and CLEAR without explicit discussion of how right-censoring or potential reconstruction inaccuracies are incorporated into the tau-based calculations or milestone probabilities.
minor comments (2)
- [Abstract] Abstract: The phrase 'tau-based summary' is introduced without a brief inline definition or pointer to its exact formula in the main text.
- [Figures] Figure captions: Ensure all figures explicitly label the specific milestone times t* used in each trial example for clarity.
Simulated Authors' Rebuttal
We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our manuscript. We address each major comment below and indicate the revisions we will make.
- Referee: [Methods] Methods section (description of tau-based summary): The definition, derivation, and statistical properties of the tau-based summary for short-term treatment ordering on the pre-milestone subpopulation are not provided; it is unclear whether this functional correctly orders treatments or identifies hazard reversal without additional assumptions on the underlying hazard processes or independent censoring.
Authors: We appreciate this observation. The tau-based summary is intended as a rank-based measure (analogous to Kendall's tau) that compares the ordering of survival times between treatment arms within the pre-milestone group to detect crossings or reversals in the cumulative incidence. In the revised manuscript, we will add a formal definition, derivation from the joint survival distribution, and discussion of its properties, including the assumption of independent censoring and how it identifies hazard reversals when the integrated difference in hazards changes sign. This will clarify its use without requiring proportional hazards.
revision: yes
- Referee: [Methods] Methods and Application sections (milestone time point): The framework relies on pre-specifying a single clinically meaningful milestone t* such that P(T > t*) cleanly separates long-term from short-term effects, but no justification, sensitivity analysis to modest shifts in t*, or discussion of information loss is given; this choice is load-bearing for the central separation claim.
Authors: We agree that the choice of t* is critical. In the revision, we will justify the choices of t* for each trial based on clinical literature and trial characteristics (e.g., 12 or 24 months for immunotherapy benefit emergence). We will also include sensitivity analyses showing how the milestone survival probabilities and tau summaries change with small perturbations in t* (such as ±2 or ±3 months) and discuss the trade-off in information loss versus interpretability. These additions will be made to the Methods and Results sections.
revision: yes
- Referee: [Results] Results section: No simulation studies are included to evaluate the framework's performance under controlled scenarios with known time-varying hazards, varying censoring rates, or competing risks, leaving the reliability of the milestone survival probabilities and tau-summary unverified despite the use of reconstructed KM data from the three trials.
Authors: The primary aim of the manuscript is to introduce the framework and illustrate its application to real trial data to demonstrate clinical interpretability. However, we recognize the importance of simulation-based validation. In the revised version, we will incorporate a simulation study section that assesses the framework under scenarios with non-proportional hazards, different censoring mechanisms, and sample sizes, to verify the accuracy of the estimates and the tau-summary's ability to detect reversals.
revision: yes
- Referee: [Application] Application section (data handling): The analysis uses reconstructed individual-level data from published Kaplan-Meier curves in CheckMate 067, 227, and CLEAR without explicit discussion of how right-censoring or potential reconstruction inaccuracies are incorporated into the tau-based calculations or milestone probabilities.
Authors: We will revise the Application section to include a detailed description of the data reconstruction method (using the algorithm of Guyot et al. or similar), how the reconstructed data preserve the original censoring information from the KM curves, and the potential limitations due to reconstruction error. We will also explain that the milestone probabilities are directly estimated from the KM curves, and the tau-summary is computed on the reconstructed event times, with bootstrap for inference to account for variability.
revision: yes
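The bootstrap step mentioned in the last response can be sketched as a percentile bootstrap that resamples patients within each arm and recomputes the milestone difference. To keep the example short it assumes no censoring before t*, so the milestone probability reduces to a simple proportion; with censored data the Kaplan-Meier estimate would take its place. The function names, toy data, replicate count, and seed are all illustrative assumptions, not details from the paper.

```python
import random
from typing import List, Tuple

def milestone_diff(trt: List[float], ctl: List[float], t_star: float) -> float:
    """Difference in proportions surviving past t_star.

    Valid as a survival estimate only when nobody is censored before t_star
    (a simplifying assumption for this sketch).
    """
    p_trt = sum(t > t_star for t in trt) / len(trt)
    p_ctl = sum(t > t_star for t in ctl) / len(ctl)
    return p_trt - p_ctl

def percentile_bootstrap(
    trt: List[float], ctl: List[float], t_star: float,
    n_boot: int = 2000, alpha: float = 0.05, seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval, resampling patients within each arm."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        trt_b = rng.choices(trt, k=len(trt))  # sample with replacement
        ctl_b = rng.choices(ctl, k=len(ctl))
        draws.append(milestone_diff(trt_b, ctl_b, t_star))
    draws.sort()
    lo = draws[int((alpha / 2) * n_boot)]
    hi = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical survival times in months (fully observed), not trial data.
trt_times = [2, 4, 9, 15, 26, 30, 33, 36, 40, 41]
ctl_times = [3, 5, 6, 8, 10, 12, 14, 19, 27, 38]

t_star = 24.0  # assumed milestone
point = milestone_diff(trt_times, ctl_times, t_star)
ci_lo, ci_hi = percentile_bootstrap(trt_times, ctl_times, t_star)
print(f"milestone difference = {point:.2f}, 95% bootstrap interval = ({ci_lo:.2f}, {ci_hi:.2f})")
```

Resampling within arms preserves the randomized group sizes. This sketch captures sampling variability only; it does not propagate any error introduced when individual-level data are reconstructed from published curves.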
Circularity Check
Milestone framework is a purely descriptive summary tool with no self-referential derivation
The paper presents a milestone-based framework as a descriptive statistical tool that separates long-term survival probabilities from short-term tau-based ordering on the pre-milestone subpopulation. No equations, predictions, or first-principles results are claimed that reduce by construction to fitted parameters, self-citations, or ansatzes imported from prior work. The approach is illustrated on reconstructed trial data without any load-bearing steps that equate outputs to inputs. This is a standard non-circular descriptive method.
Axiom & Free-Parameter Ledger
free parameters (2)
- milestone time point
- tau threshold or definition
axioms (2)
- domain assumption: Standard right-censoring assumptions hold and do not distort the pre-milestone ordering.
- ad hoc to paper: A single milestone time can separate short-term from long-term effects without loss of important information.