J- and MJ-Type Tests for Non-Nested Parametric Survival Models with a Cure Fraction: A Score Test Approach
Pith reviewed 2026-07-03 19:23 UTC · model grok-4.3
The pith
Score tests on augmented log-likelihoods discriminate non-nested survival models with cure fractions using only null-model estimates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For two models, the score statistic reduces to a quadratic form in the sample mean of the individual log-likelihood differences, and its signed square root coincides with Vuong's statistic. The test nonetheless differs from Vuong's in three respects: it targets the specific null that a given model is the true data-generating process, it uses an unsigned form that extends directly to $M \ge 2$ models, and it estimates the Kullback-Leibler bias term by parametric bootstrap. The MJ statistic then combines the individual J tests to assess the global null that at least one candidate model is correctly specified, while also serving as a model-selection criterion.
What carries the argument
The MJ statistic formed by combining individual J score tests applied to the null log-likelihood after augmentation with information from each competing model.
If this is right
- The procedure requires only restricted maximum likelihood estimates from the null model.
- For any pair of models the statistic is a quadratic form in the average log-likelihood difference.
- The MJ statistic tests the global hypothesis that at least one of the candidate models is correctly specified.
- The same statistic supplies a numerical criterion for selecting among the models.
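The pairwise relationship in the second bullet can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it computes the classical Vuong z-statistic from per-observation log-likelihoods and the corresponding unsigned quadratic form, and it omits the bootstrap bias term entirely. The function name `vuong_and_quadratic` and the toy inputs are assumptions made for this example.

```python
import numpy as np

def vuong_and_quadratic(loglik_null, loglik_alt):
    """Return (z, q) from per-observation log-likelihoods of two fitted models.

    z is the classical Vuong statistic sqrt(n) * mean(d) / sd(d);
    q = n * mean(d)**2 / var(d) is the unsigned quadratic form, so q == z**2.
    No bias correction is applied here (an intentional simplification).
    """
    d = np.asarray(loglik_null, dtype=float) - np.asarray(loglik_alt, dtype=float)
    n = d.size
    d_bar = d.mean()
    s2 = d.var(ddof=0)  # sample variance of the individual differences
    z = np.sqrt(n) * d_bar / np.sqrt(s2)
    q = n * d_bar * (1.0 / s2) * d_bar  # written as a (scalar) quadratic form
    return z, q

# Toy per-observation log-likelihoods (hypothetical numbers, not real fits).
rng = np.random.default_rng(0)
ll_null = rng.normal(-1.0, 0.5, size=200)
ll_alt = ll_null - rng.normal(0.05, 0.3, size=200)
z, q = vuong_and_quadratic(ll_null, ll_alt)
print(f"signed root z = {z:.3f}, quadratic form q = {q:.3f}")
```

The sign of `z` indicates which model has the higher average log-likelihood, while `q` discards that direction, which is what lets the unsigned form generalize to more than two models.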
Where Pith is reading between the lines
- The bootstrap estimation of the bias term may improve finite-sample behavior compared with purely asymptotic approximations.
- Analogous score-based tests could be constructed for non-nested models that differ in more than just the baseline hazard.
- The global-null property of the MJ statistic may be useful in settings where several plausible cure-fraction specifications are under consideration simultaneously.
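The parametric-bootstrap idea in the first bullet can be sketched with a deliberately simplified stand-in. The example below assumes an exponential null and a log-normal alternative with no censoring and no cure fraction, and it bootstraps the average log-likelihood difference under the fitted null. It is not the paper's bias estimator, whose exact form is not given here; all function names are hypothetical.

```python
import numpy as np

def exp_loglik(x):
    """Per-observation exponential log-likelihood at the MLE (rate = 1 / mean)."""
    rate = 1.0 / x.mean()
    return np.log(rate) - rate * x

def lognormal_loglik(x):
    """Per-observation log-normal log-likelihood at the MLE."""
    log_x = np.log(x)
    mu, sigma = log_x.mean(), log_x.std(ddof=0)
    return (-np.log(x * sigma * np.sqrt(2.0 * np.pi))
            - (log_x - mu) ** 2 / (2.0 * sigma ** 2))

def mean_loglik_diff(x):
    """Average of (null minus alternative) log-likelihood contributions."""
    return float(np.mean(exp_loglik(x) - lognormal_loglik(x)))

def bootstrap_mean_diff(x, n_boot, rng):
    """Simulate from the fitted null, refit both models, and summarize the draws."""
    scale_hat = x.mean()  # exponential MLE of the mean
    draws = np.empty(n_boot)
    for b in range(n_boot):
        x_star = rng.exponential(scale_hat, size=x.size)
        draws[b] = mean_loglik_diff(x_star)
    return draws.mean(), draws.std(ddof=1)

rng = np.random.default_rng(2024)
x = rng.exponential(2.0, size=150)  # toy data generated from the null family
observed = mean_loglik_diff(x)
center, spread = bootstrap_mean_diff(x, n_boot=500, rng=rng)
print(f"observed = {observed:.4f}, bootstrap center = {center:.4f}, sd = {spread:.4f}")
```

The bootstrap center approximates what the average difference looks like when the null model really generated the data, which is the quantity an asymptotic approximation would otherwise have to supply.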
Load-bearing premise
The score test remains valid when the models differ only in their baseline distributions and the null model is correctly specified.
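For orientation, one standard way to write a survival model with a cure fraction is the mixture formulation below. It is shown as an assumed illustration only: the paper may use a different parameterization, and the symbols $\pi$, $S_0$, and $\theta_m$ are not taken from it.

$$
S_{\mathrm{pop}}(t) = \pi + (1 - \pi)\, S_0(t \mid \theta_m), \qquad 0 < \pi < 1.
$$

Here $\pi$ is the cured proportion and $S_0(\cdot \mid \theta_m)$ is the baseline survival function of the uncured subjects under candidate model $m$. Under the premise above, competing models would share the cure structure and differ only in the family chosen for $S_0$, for instance a Weibull versus a log-normal baseline (families named purely as an example).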
What would settle it
A Monte Carlo experiment that generates data from one of the candidate models, applies the MJ test, and checks whether the rejection rate under the global null stays at the nominal level.
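A size check of this kind could be organized as in the sketch below. It is a minimal harness under stated assumptions: data come from a mixture cure model with exponential latency and fixed administrative censoring, and `placeholder_p_value` is a simple stand-in test (a z-test on the event fraction), not the MJ test, which would be plugged in through the `p_value` argument. All names and parameter values are hypothetical.

```python
import math
import numpy as np

def simulate_cure_data(n, cure_prob, rate, censor_time, rng):
    """Mixture cure data: cured subjects never fail; others have exponential times."""
    cured = rng.random(n) < cure_prob
    latent = rng.exponential(1.0 / rate, size=n)
    latent[cured] = np.inf                 # cured subjects have no event time
    time = np.minimum(latent, censor_time)  # administrative censoring
    event = latent <= censor_time
    return time, event

def placeholder_p_value(data, p0):
    """Stand-in test: two-sided z-test that the event probability equals p0."""
    _, event = data
    n = event.size
    z = (event.mean() - p0) / math.sqrt(p0 * (1.0 - p0) / n)
    return math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))

def rejection_rate(simulate, p_value, n_rep, alpha, rng):
    """Fraction of simulated data sets for which the test rejects at level alpha."""
    rejections = 0
    for _ in range(n_rep):
        if p_value(simulate(rng)) < alpha:
            rejections += 1
    return rejections / n_rep

rng = np.random.default_rng(12345)
n, cure_prob, rate, censor_time = 200, 0.3, 1.0, 3.0
# True event probability under the data-generating model, so the null holds.
p0 = (1.0 - cure_prob) * (1.0 - math.exp(-rate * censor_time))
empirical_size = rejection_rate(
    simulate=lambda r: simulate_cure_data(n, cure_prob, rate, censor_time, r),
    p_value=lambda d: placeholder_p_value(d, p0),
    n_rep=2000,
    alpha=0.05,
    rng=rng,
)
print(f"empirical rejection rate at alpha = 0.05: {empirical_size:.3f}")
```

Keeping the simulator and the test as separate callables means the same loop can be reused for each candidate model as the data-generating process, and for power studies by simulating from a model outside the candidate set.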
Original abstract
We propose specification tests for discriminating among non-nested parametric survival models with a cure fraction, focusing on models that differ only in their baseline distributions. The proposed approach augments the null log-likelihood with information from competing models and applies a score test to assess whether the additional information is redundant. Because the test relies only on restricted maximum likelihood estimates, it avoids fitting augmented models. For two competing models, the score statistic reduces to a quadratic form in the sample mean of the individual log-likelihood differences. We show that its signed square root coincides with Vuong's test statistic, although our framework differs in three important respects: it tests the specific null hypothesis that a given model is the true data-generating process, it uses an unsigned statistic that extends naturally to $M \ge 2$ competing models, and it estimates the Kullback-Leibler bias by parametric bootstrap. The resulting MJ statistic combines the individual J tests to assess the global null hypothesis that at least one candidate model is correctly specified, while also providing a model-selection criterion.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes J-type score tests for pairwise discrimination among non-nested parametric survival models with a cure fraction that differ only in baseline distributions, constructed by augmenting the null log-likelihood with terms from competing models and testing redundancy via the score statistic evaluated at restricted MLEs. For two models the resulting statistic is a quadratic form in the sample average of log-likelihood differences whose signed square root coincides with Vuong's statistic, while the unsigned version together with parametric-bootstrap bias correction extends directly to M models. The MJ statistic aggregates the individual J statistics to test the global null that at least one of the candidate models is correctly specified and simultaneously supplies a model-selection criterion.
Significance. If the asymptotic null distributions and finite-sample behavior hold, the approach supplies a computationally convenient specification test for cure-fraction survival models that avoids fitting augmented likelihoods, extends naturally to multiple competitors, and furnishes both pairwise and global testing plus selection. The reliance on restricted MLEs and the bootstrap bias correction are practical strengths.
Minor comments (3)
- The three respects in which the framework differs from Vuong's test (specific null, unsigned statistic, bootstrap bias) are stated in the abstract but should be contrasted explicitly, perhaps in a short table, in the introduction.
- The regularity conditions required for the score test when the cure fraction is present (especially the behavior of the information matrix at the boundary) should be stated explicitly rather than left implicit under the 'models differ only in baseline' assumption.
- Simulation design and power comparisons against existing non-nested tests for cure models would strengthen the finite-sample evidence; if already present, a clearer summary table would help.
Simulated Author's Rebuttal
We thank the referee for the accurate and positive summary of our manuscript, which correctly captures the construction of the J-type score tests, their reduction to a Vuong-like form for two models, the extension via the MJ statistic to M models, and the use of parametric bootstrap for bias correction. We also appreciate the recognition of the computational advantages and the dual role as a specification test and model-selection criterion. The recommendation for minor revision is noted.
Circularity Check
No significant circularity; derivation follows standard score-test theory
Full rationale
The paper constructs the J statistic by augmenting the null log-likelihood with terms from competing models and applying the score test for redundancy of that information. This reduces to a quadratic form in average log-likelihood differences by direct application of the score-test formula under the maintained null, without any fitted parameter being redefined as a prediction. The signed-root equivalence to Vuong's statistic is presented as a derived property rather than an input, and the MJ combination for the global null follows from standard multiple-testing aggregation without new regularity conditions or self-referential definitions. No load-bearing self-citations, uniqueness theorems, or ansatzes appear in the derivation chain. The approach remains self-contained against external score-test theory and is therefore scored at the default non-circularity level.
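As a reminder of the standard form being invoked, a generic score statistic can be written as follows. The notation is assumed for illustration and is not taken from the paper: $U$ denotes the score vector for the augmenting parameters, $\mathcal{I}$ the corresponding information matrix, $\tilde\theta$ the restricted MLE under the null model, and $r$ the number of augmenting parameters.

$$
S = U(\tilde\theta)^{\top}\, \mathcal{I}(\tilde\theta)^{-1}\, U(\tilde\theta) \;\xrightarrow{d}\; \chi^2_r \quad \text{under the null and the usual regularity conditions.}
$$

Because both $U$ and $\mathcal{I}$ are evaluated at $\tilde\theta$, only the null model has to be fitted, which matches the computational advantage described above. Any bias correction the paper applies would modify this generic form.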