A Tutorial for Evaluating Cure Model Appropriateness

Geethanjalee Mudunkotuwa; Durbadal Ghosh; Subodh Selukar

arXiv: 2605.04999 · v2 · submitted 2026-05-06 · 📊 stat.ME

Pith reviewed 2026-05-14 21:50 UTC · model grok-4.3

classification 📊 stat.ME

keywords cure models · survival analysis · Kaplan-Meier curves · identifiability · clinical trials · acute myeloid leukemia · hematopoietic cell transplantation

The pith

Cure models in survival analysis need systematic checks with clinical judgment and Kaplan-Meier inspection before fitting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents a tutorial for deciding when cure models are appropriate in survival analysis. Traditional models assume everyone experiences the event, but in cases with curative therapies some individuals may never experience it. The authors describe a procedure combining clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative evaluation to confirm identifiability conditions. They illustrate it with data from an acute myeloid leukemia trial and other hematopoietic cell transplantation datasets. Following this approach allows for more reliable estimates and better clinical decisions.

Core claim

The paper claims that a systematic procedure integrating clinical judgment, Kaplan-Meier curve inspection, and quantitative evaluation can determine whether cure models are appropriate for a given dataset, as shown in the worked example from a randomized clinical trial in acute myeloid leukemia and summarized findings from hematopoietic cell transplantation datasets.

What carries the argument

A systematic evaluation procedure that combines clinical judgment with visual inspection of Kaplan-Meier curves and quantitative metrics to assess cure model identifiability.
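
The visual-inspection step can be made concrete with a small sketch. The code below is not the authors' implementation: it is a minimal Kaplan-Meier estimator plus a few descriptive plateau summaries, written in plain Python. The function names (`kaplan_meier`, `plateau_summary`), the choice of summaries, and the toy data are all assumptions made for illustration, and no threshold from the paper is implied.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1 = event, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def plateau_summary(times, events):
    """Descriptive summaries of the right tail of the Kaplan-Meier curve."""
    curve = kaplan_meier(times, events)
    last_event_time, plateau_level = curve[-1] if curve else (0.0, 1.0)
    censored_after = sum(
        1 for t, e in zip(times, events) if e == 0 and t > last_event_time
    )
    return {
        "plateau_level": plateau_level,            # KM estimate after the last event
        "last_event_time": last_event_time,
        "plateau_length": max(times) - last_event_time,
        "censored_after_last_event": censored_after,
    }


# Hypothetical toy data: three events, then several long censored follow-ups.
toy_times = [1, 2, 3, 4, 10, 12, 15]
toy_events = [1, 1, 0, 1, 0, 0, 0]
summary = plateau_summary(toy_times, toy_events)
print(summary)
```

A long, flat tail carrying several censored observations is the pattern the visual check looks for. How long or how populated the tail must be is a judgment the summaries above do not settle on their own.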

If this is right

Researchers avoid inappropriate application of cure models that leads to invalid estimates.
Survival analysis in curative therapy contexts becomes more reliable.
Clinical decision-making improves based on valid model parameters.
Practical guidance emerges for datasets like those in hematopoietic cell transplantation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be adapted to other medical fields with potential long-term survivors.
It highlights the need for identifiability checks in other advanced statistical models.
Standardizing this evaluation might reduce misuse of cure models in published research.

Load-bearing premise

Clinical judgment combined with Kaplan-Meier visual inspection and quantitative evaluation can reliably confirm the identifiability conditions for valid cure model estimation across diverse datasets.

What would settle it

A dataset where the proposed evaluation procedure approves cure modeling but subsequent fitting reveals non-identifiable parameters or unstable estimates.

Original abstract

In survival analysis, traditional models assume all individuals will eventually experience the event of interest. However, advances in therapeutics have led to multiple clinical contexts with potentially curative therapies, and in these contexts, certain individuals may never experience the event. Statisticians have developed cure models as a methodology to address this challenge. Nonetheless, despite significant statistical advances in cure models, we have seen more limited uptake in biomedical applications, and we hypothesize that this is caused by limited guidance in the appropriate application of cure models. Cure models require specific identifiability conditions for valid parameter estimation, and previous reports have demonstrated significant issues with the inappropriate application of cure models. Existing tutorials for cure models focus on model implementation and either assume or provide only limited guidance on whether cure modeling is appropriate for the given dataset. This tutorial addresses this gap by describing a systematic procedure that integrates clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative evaluation. We provide a worked example using data from a randomized clinical trial in acute myeloid leukemia, and we also summarize findings from a series of other datasets of hematopoietic cell transplantation to suggest broad practical guidance for choosing to apply cure models. By systematically evaluating cure model appropriateness before fitting these models, researchers can achieve more reliable survival analysis and improved clinical decision-making.
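
For orientation, the cure models the abstract refers to are most often written in mixture form. The display below is the standard textbook formulation, not notation taken from the paper; the symbols $\pi$ (cure probability), $S_u$ (survival function of the uncured), $F_u$ (their distribution function), and $G$ (censoring distribution) are assumptions introduced here for illustration.

```latex
% Population survival under a mixture cure model
S(t) = \pi + (1 - \pi)\, S_u(t),
\qquad
\lim_{t \to \infty} S_u(t) = 0
\;\Longrightarrow\;
\lim_{t \to \infty} S(t) = \pi .

% A commonly stated sufficient-follow-up condition for identifying \pi
\tau_{F_u} \le \tau_G,
\qquad
\tau_{F_u} = \sup\{ t : F_u(t) < 1 \},
\quad
\tau_G = \sup\{ t : G(t) < 1 \}.
```

Read this way, the plateau of the Kaplan-Meier curve estimates $\pi$ only when follow-up extends past the time by which the uncured would have experienced the event. That is the condition the tutorial's checks are meant to probe.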

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This tutorial gives a practical, integrated checklist for judging cure model appropriateness using clinical input, Kaplan-Meier plots, and quantitative checks, with real examples, but it offers no test of whether the checklist actually catches identifiability problems.

The paper's core offering is a three-part procedure (clinical judgment, visual inspection of the Kaplan-Meier curve for a plateau, and quantitative checks) to decide whether a cure model is appropriate before fitting. The authors walk through it on a randomized AML trial and then summarize patterns across several hematopoietic cell transplantation datasets. That fills a real gap, since most cure model tutorials skip straight to implementation and give little help on whether the model is even identifiable. The examples are concrete, and the cross-dataset summary gives applied readers a usable sense of when cure fractions show up in practice. The logic of the steps follows established identifiability conditions without obvious internal contradictions.

The soft spot is the missing validation. The manuscript presents the checklist as improving reliability but supplies no simulation study, sensitivity analysis, or cross-validation in which known failures (short follow-up, near-zero cure fraction) are planted and the procedure is checked for whether it flags them. Without that, the move from “use these steps” to “more reliable analysis” stays plausible rather than demonstrated. The thresholds for visual and quantitative signals are not stress-tested across realistic data regimes.

The tutorial is for applied statisticians and clinical researchers who already know basic survival analysis and want to try cure models on their data. It is not aimed at method developers. The reasoning is straightforward and cites the relevant literature on identifiability. I would send it to peer review because the guidance is needed and the examples are solid; referees could reasonably ask for added validation or clearer quantitative cutoffs.

Referee Report

2 major / 1 minor

Summary. The paper presents a tutorial for determining when cure models are appropriate in survival analysis. It describes a systematic procedure that combines clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative evaluation to check identifiability conditions such as sufficient follow-up and positive cure fraction. The authors illustrate this with a worked example from an acute myeloid leukemia clinical trial and provide summaries from hematopoietic cell transplantation datasets, concluding that this approach leads to more reliable survival analyses and better clinical decisions.

Significance. If validated, the tutorial could significantly improve the application of cure models in biomedical research by offering practical guidance that has been lacking, potentially reducing inappropriate uses and enhancing the reliability of survival estimates in contexts with curative therapies.

major comments (2)

[AML worked example] The central claim relies on the procedure's ability to confirm identifiability, but the worked AML example and HCT summaries do not include any empirical validation such as simulations showing the procedure's sensitivity and specificity in detecting identifiability failures (e.g., short follow-up or near-zero cure fraction). This undermines the assertion of achieving more reliable survival analysis.
[HCT datasets summary] The broad practical guidance suggested from the series of HCT datasets would benefit from explicit reporting of the specific quantitative criteria and thresholds used to evaluate appropriateness, as these are not detailed in the provided summaries.
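
The kind of validation requested in the first major comment can be sketched as follows. This is a minimal illustration, not a study design from the paper: it plants known scenarios (adequate follow-up, short follow-up, no cure fraction) in simulated mixture cure data and reads off the Kaplan-Meier tail in each. The exponential event times, the uniform censoring window, all parameter values, and the names `simulate_cure_data` and `km_tail` are assumptions chosen for illustration.

```python
import random


def simulate_cure_data(n, cure_fraction, event_rate, follow_up, seed=0):
    """Simulate mixture cure data with staggered administrative censoring."""
    rng = random.Random(seed)
    times, events = [], []
    for _ in range(n):
        cured = rng.random() < cure_fraction
        # Cured subjects never experience the event.
        latent = float("inf") if cured else rng.expovariate(event_rate)
        # Assumed censoring: uniform over the second half of the follow-up window.
        censor = rng.uniform(0.5 * follow_up, follow_up)
        times.append(min(latent, censor))
        events.append(1 if latent <= censor else 0)
    return times, events


def km_tail(times, events):
    """Kaplan-Meier estimate at the largest observed time."""
    data = sorted(zip(times, events))
    at_risk, surv = len(data), 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        surv *= 1.0 - deaths / at_risk
        at_risk -= removed
    return surv


# Hypothetical scenarios: (true cure fraction, event rate, follow-up length).
scenarios = {
    "adequate_follow_up": (0.4, 1.0, 10.0),
    "short_follow_up": (0.4, 1.0, 0.5),
    "no_cure_fraction": (0.0, 1.0, 10.0),
}

results = {}
for name, (pi, rate, tau) in scenarios.items():
    t, e = simulate_cure_data(2000, pi, rate, tau)
    results[name] = km_tail(t, e)
    print(f"{name}: true cure fraction = {pi:.2f}, KM tail = {results[name]:.3f}")
```

Under these assumptions, the tail would be expected to sit near the planted cure fraction when follow-up is adequate and well above it when follow-up is short. A full validation would repeat such runs many times and record how often a candidate procedure flags each planted failure.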

minor comments (1)

[Abstract] Consider adding a brief mention of the specific quantitative evaluation methods employed to enhance clarity for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for their detailed review and constructive comments on our tutorial manuscript. We address each major comment below and have made revisions where appropriate to strengthen the presentation of the evaluation procedure.

Referee: [AML worked example] The central claim relies on the procedure's ability to confirm identifiability, but the worked AML example and HCT summaries do not include any empirical validation such as simulations showing the procedure's sensitivity and specificity in detecting identifiability failures (e.g., short follow-up or near-zero cure fraction). This undermines the assertion of achieving more reliable survival analysis.

Authors: We thank the referee for highlighting this important consideration. The tutorial is designed to guide practitioners in applying established identifiability conditions from the cure model literature, including sufficient follow-up time to observe a plateau in the survival curve and evidence of a non-zero cure fraction. These conditions are necessary for reliable estimation, as documented in prior methodological work. While we acknowledge that simulation studies demonstrating the sensitivity and specificity of our proposed procedure would provide valuable additional support, such analyses fall outside the scope of this tutorial, which focuses on practical application with real-world clinical data. We will revise the manuscript to include a dedicated limitations section that discusses the heuristic nature of the procedure and emphasizes that it is intended to complement, rather than replace, statistical expertise and further validation.
revision: partial
Referee: [HCT datasets summary] The broad practical guidance suggested from the series of HCT datasets would benefit from explicit reporting of the specific quantitative criteria and thresholds used to evaluate appropriateness, as these are not detailed in the provided summaries.

Authors: We agree with this suggestion and appreciate the opportunity to improve the clarity of our presentation. In the revised manuscript, we will explicitly report the quantitative criteria and thresholds applied to the HCT datasets, including specific values for follow-up duration, observed plateau levels in the Kaplan-Meier estimates, and estimated cure fractions used to determine appropriateness.
revision: yes

Circularity Check

0 steps flagged

No circularity: tutorial offers procedural checklist without derivations or self-referential fits

The manuscript is a tutorial describing a three-part evaluation procedure (clinical judgment + Kaplan-Meier visual inspection + quantitative checks) for deciding when cure models are appropriate. It supplies a worked AML example and summaries from HCT datasets but contains no equations, fitted parameters, predictions, or uniqueness theorems. No step reduces an output to an input by construction, no self-citation is invoked to justify a load-bearing premise, and no ansatz or renaming of known results occurs. The central claim is therefore self-contained as guidance rather than a mathematical derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The tutorial rests on standard survival analysis assumptions and the domain requirement that cure models need specific identifiability conditions; no free parameters or new entities are introduced.

axioms (1)

Domain assumption: Cure models require specific identifiability conditions for valid parameter estimation.
Basis: explicitly stated in the abstract as a prerequisite for proper use of cure models.

Reference graph

Works this paper leans on

3 extracted references · 3 canonical work pages

[1]

Machine learning-based cure model in engineering reliability

Pal S, Aselisewine W. Machine learning-based cure model in engineering reliability. In: Developments in Reliability Engineering. Elsevier; 2024:501-21. 12. Amico M, Van Keilegom I, Han B. Assessing cure status prediction using ROC curves. Biometrika. 2020;108(3):727-40. 13. Hanin L, Huang L-S. Identifiability of cure models revisited. J Multivar Anal. 201...

work page 2024
[2]

Testing for sufficient follow-up and outliers in survival data

Maller RA, Zhou S. Testing for sufficient follow-up and outliers in survival data. JASA. 1994;89(428):1499-506. 25. Maller RA, Zhou S. Testing for the presence of immune or cured individuals. Biometrics. 1995;51:1197-205. 26. Peng Y, Yu B. Cure models: Methods, applications and implementation. CRC Press; 2021. 27. Shen P-s. Testing for sufficient follow-u...

work page doi:10.1007/s40273-019-00867-5 1994
[3]

Testing for sufficient follow-up in survival data with a cure fraction

Yuen TP, Musta E. Testing for sufficient follow-up in survival data with a cure fraction. arXiv:2403.16832. 2024. 37. Kouadio C, Selukar S, Othus M, Chevret S. Detecting cure model appropriateness in randomized clinical trials with long-term survivors. JCO Clin Cancer Inform. 2025;9:e2500084. 38. Escobar-Bach M, Maller R, Van Keilegom I, Zhao M. Estimatio...

work page arXiv 2024
