arxiv: 2605.04999 · v2 · submitted 2026-05-06 · 📊 stat.ME

A Tutorial for Evaluating Cure Model Appropriateness

Pith reviewed 2026-05-14 21:50 UTC · model grok-4.3

keywords cure models · survival analysis · Kaplan-Meier curves · identifiability · clinical trials · acute myeloid leukemia · hematopoietic cell transplantation

The pith

Before fitting a cure model, analysts should check that it is appropriate, using clinical judgment, Kaplan-Meier inspection, and quantitative evaluation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents a tutorial for deciding when cure models are appropriate in survival analysis. Traditional survival models assume that every individual eventually experiences the event of interest, but where curative therapies exist, some individuals may never experience it. The authors describe a procedure that combines clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative evaluation to check the identifiability conditions that cure models require. They illustrate it with data from a randomized clinical trial in acute myeloid leukemia and summarize findings from hematopoietic cell transplantation datasets. The authors argue that evaluating appropriateness before fitting yields more reliable survival estimates and better-informed clinical decisions.
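For orientation, the contrast between traditional and cure models can be written down in one line. The formulation below is the standard mixture cure model; it is assumed here for illustration, since the summary does not say which cure model formulation the paper uses. The symbols \pi (cure fraction) and S_u (survival function of uncured individuals) are notation introduced for this sketch.

```latex
% Mixture cure model (standard formulation, assumed for illustration)
% \pi    : probability of being cured (never experiencing the event)
% S_u(t) : survival function of the uncured individuals
S_{\mathrm{pop}}(t) = \pi + (1 - \pi)\, S_u(t), \qquad 0 \le \pi \le 1,
\qquad
\lim_{t \to \infty} S_{\mathrm{pop}}(t) = \pi \quad \text{when } S_u(t) \to 0 .
```

A traditional model corresponds to \pi = 0. Under this formulation the long-run plateau of the population survival curve equals the cure fraction, which is why follow-up must extend far enough for a plateau to be observable.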

Core claim

The paper claims that a systematic procedure integrating clinical judgment, Kaplan-Meier curve inspection, and quantitative evaluation can determine whether cure models are appropriate for a given dataset, as shown in the worked example from a randomized clinical trial in acute myeloid leukemia and summarized findings from hematopoietic cell transplantation datasets.

What carries the argument

A systematic evaluation procedure that combines clinical judgment with visual inspection of Kaplan-Meier curves and quantitative metrics to assess cure model identifiability.
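The Kaplan-Meier inspection step can be given a rough numerical companion. The following is a minimal sketch, not the authors' procedure: it computes the product-limit estimate and a few descriptive tail summaries (plateau height, time of the last observed event, maximum follow-up, and how many subjects were followed past the last event). The function names `kaplan_meier` and `tail_summary` and the toy data are hypothetical, and the sketch deliberately applies no thresholds, since the summary does not report any.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate as a list of (event_time, S(event_time))."""
    exits = Counter(times)                                   # subjects leaving the risk set at each time
    deaths = Counter(t for t, e in zip(times, events) if e)  # observed events at each time
    at_risk, surv, curve = len(times), 1.0, []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def tail_summary(times, events):
    """Descriptive summaries of the tail of the Kaplan-Meier curve (illustrative only)."""
    curve = kaplan_meier(times, events)
    if not curve:
        raise ValueError("no observed events")
    last_event_time, plateau = curve[-1]
    return {
        "plateau": plateau,                      # KM estimate after the last observed event
        "last_event_time": last_event_time,
        "max_follow_up": max(times),
        "n_followed_past_last_event": sum(t > last_event_time for t in times),
    }


# Hypothetical toy data: event indicator 1 = event observed, 0 = censored.
times = [1, 2, 2, 3, 5, 8, 9, 10, 12, 12]
events = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]
print(tail_summary(times, events))
```

A long gap between the last event and the maximum follow-up, with many subjects still under observation in that gap, is the kind of pattern a visual plateau check looks for; whether it is convincing remains a matter of clinical and statistical judgment.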

If this is right

  • Researchers avoid inappropriate application of cure models that leads to invalid estimates.
  • Survival analysis in curative therapy contexts becomes more reliable.
  • Clinical decision-making improves based on valid model parameters.
  • Practical guidance emerges for datasets like those in hematopoietic cell transplantation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be adapted to other medical fields with potential long-term survivors.
  • It highlights the need for identifiability checks in other advanced statistical models.
  • Standardizing this evaluation might reduce misuse of cure models in published research.

Load-bearing premise

Clinical judgment combined with Kaplan-Meier visual inspection and quantitative evaluation can reliably confirm the identifiability conditions for valid cure model estimation across diverse datasets.

What would settle it

A dataset where the proposed evaluation procedure approves cure modeling but subsequent fitting reveals non-identifiable parameters or unstable estimates.

Original abstract

In survival analysis, traditional models assume all individuals will eventually experience the event of interest. However, advances in therapeutics have led to multiple clinical contexts with potentially curative therapies, and in these contexts, certain individuals may never experience the event. Statisticians have developed cure models as a methodology to address this challenge. Nonetheless, despite significant statistical advances in cure models, we have seen more limited uptake in biomedical applications, and we hypothesize that this is caused by limited guidance in the appropriate application of cure models. Cure models require specific identifiability conditions for valid parameter estimation, and previous reports have demonstrated significant issues with the inappropriate application of cure models. Existing tutorials for cure models focus on model implementation and either assume or provide only limited guidance on whether cure modeling is appropriate for the given dataset. This tutorial addresses this gap by describing a systematic procedure that integrates clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative evaluation. We provide a worked example using data from a randomized clinical trial in acute myeloid leukemia, and we also summarize findings from a series of other datasets of hematopoietic cell transplantation to suggest broad practical guidance for choosing to apply cure models. By systematically evaluating cure model appropriateness before fitting these models, researchers can achieve more reliable survival analysis and improved clinical decision-making.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper presents a tutorial for determining when cure models are appropriate in survival analysis. It describes a systematic procedure that combines clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative evaluation to check identifiability conditions such as sufficient follow-up and positive cure fraction. The authors illustrate this with a worked example from an acute myeloid leukemia clinical trial and provide summaries from hematopoietic cell transplantation datasets, concluding that this approach leads to more reliable survival analyses and better clinical decisions.

Significance. If validated, the tutorial could significantly improve the application of cure models in biomedical research by offering practical guidance that has been lacking, potentially reducing inappropriate uses and enhancing the reliability of survival estimates in contexts with curative therapies.

major comments (2)
  1. [AML worked example] The central claim relies on the procedure's ability to confirm identifiability, but the worked AML example and HCT summaries do not include any empirical validation such as simulations showing the procedure's sensitivity and specificity in detecting identifiability failures (e.g., short follow-up or near-zero cure fraction). This undermines the assertion of achieving more reliable survival analysis.
  2. [HCT datasets summary] The broad practical guidance suggested from the series of HCT datasets would benefit from explicit reporting of the specific quantitative criteria and thresholds used to evaluate appropriateness, as these are not detailed in the provided summaries.
minor comments (1)
  1. [Abstract] Consider adding a brief mention of the specific quantitative evaluation methods employed to enhance clarity for readers.
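The kind of simulation the first major comment asks for can be sketched in a few lines. This is a hypothetical illustration, not an analysis from the paper or the referee: it draws data from a mixture cure model with exponential event times for the uncured, applies administrative censoring at a fixed follow-up time, and reports the tail of the Kaplan-Meier curve. All parameter values (sample size, cure fraction 0.3, rate 1.0, follow-up times) and function names are assumptions chosen for illustration.

```python
import math
import random


def simulate(n, cure_fraction, rate, follow_up, rng):
    """Mixture cure data with exponential latency and censoring at `follow_up`."""
    times, events = [], []
    for _ in range(n):
        cured = rng.random() < cure_fraction
        t = math.inf if cured else rng.expovariate(rate)  # cured subjects never have the event
        times.append(min(t, follow_up))
        events.append(1 if t <= follow_up else 0)
    return times, events


def km_tail(times, events):
    """Kaplan-Meier estimate after the last observation (events ordered before censorings at ties)."""
    ordered = sorted(zip(times, events), key=lambda p: (p[0], -p[1]))
    at_risk, surv = len(ordered), 1.0
    for _, e in ordered:
        if e:
            surv *= 1.0 - 1.0 / at_risk
        at_risk -= 1
    return surv


def tail_by_follow_up(follow_ups, n=2000, cure_fraction=0.3, rate=1.0, seed=0):
    """KM tail for each follow-up length, using one seeded random generator."""
    rng = random.Random(seed)
    return {c: km_tail(*simulate(n, cure_fraction, rate, c, rng)) for c in follow_ups}


for c, tail in tail_by_follow_up([0.5, 2.0, 10.0]).items():
    print(f"follow-up {c:>4}: KM tail = {tail:.3f} (true cure fraction 0.3)")
```

With short follow-up the Kaplan-Meier tail sits well above the true cure fraction, because many uncured subjects are censored before their event. A full validation study would go further and record how often a given evaluation procedure flags such datasets as inappropriate.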

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for their detailed review and constructive comments on our tutorial manuscript. We address each major comment below and indicate the revisions we will make to strengthen the presentation of the evaluation procedure.

Point-by-point responses
  1. Referee: [AML worked example] The central claim relies on the procedure's ability to confirm identifiability, but the worked AML example and HCT summaries do not include any empirical validation such as simulations showing the procedure's sensitivity and specificity in detecting identifiability failures (e.g., short follow-up or near-zero cure fraction). This undermines the assertion of achieving more reliable survival analysis.

    Authors: We thank the referee for highlighting this important consideration. The tutorial is designed to guide practitioners in applying established identifiability conditions from the cure model literature, including sufficient follow-up time to observe a plateau in the survival curve and evidence of a non-zero cure fraction. These conditions are necessary for reliable estimation, as documented in prior methodological work. While we acknowledge that simulation studies demonstrating the sensitivity and specificity of our proposed procedure would provide valuable additional support, such analyses fall outside the scope of this tutorial, which focuses on practical application with real-world clinical data. We will revise the manuscript to include a dedicated limitations section that discusses the heuristic nature of the procedure and emphasizes that it is intended to complement, rather than replace, statistical expertise and further validation. revision: partial

  2. Referee: [HCT datasets summary] The broad practical guidance suggested from the series of HCT datasets would benefit from explicit reporting of the specific quantitative criteria and thresholds used to evaluate appropriateness, as these are not detailed in the provided summaries.

    Authors: We agree with this suggestion and appreciate the opportunity to improve the clarity of our presentation. In the revised manuscript, we will explicitly report the quantitative criteria and thresholds applied to the HCT datasets, including specific values for follow-up duration, observed plateau levels in the Kaplan-Meier estimates, and estimated cure fractions used to determine appropriateness. revision: yes

Circularity Check

0 steps flagged

No circularity: tutorial offers procedural checklist without derivations or self-referential fits

full rationale

The manuscript is a tutorial describing a three-part evaluation procedure (clinical judgment + Kaplan-Meier visual inspection + quantitative checks) for deciding when cure models are appropriate. It supplies a worked AML example and summaries from HCT datasets but contains no equations, fitted parameters, predictions, or uniqueness theorems. No step reduces an output to an input by construction, no self-citation is invoked to justify a load-bearing premise, and no ansatz or renaming of known results occurs. The central claim is therefore self-contained as guidance rather than a mathematical derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The tutorial rests on standard survival analysis assumptions and the domain requirement that cure models need specific identifiability conditions; no free parameters or new entities are introduced.

axioms (1)
  • domain assumption: Cure models require specific identifiability conditions for valid parameter estimation.
    Explicitly stated in the abstract as a prerequisite for proper use of cure models.

pith-pipeline@v0.9.0 · 5532 in / 1164 out tokens · 48841 ms · 2026-05-14T21:50:41.209066+00:00 · methodology
