J- and MJ-Type Tests for Non-Nested Parametric Survival Models with a Cure Fraction: A Score Test Approach

Cynthia A. V. Tojeiro; Francisco Cribari-Neto; Tarciana L. Pereira; Tatiene C. Souza

arxiv: 2607.01379 · v1 · pith:OEJBJ7I7new · submitted 2026-07-01 · 📊 stat.ME

J- and MJ-Type Tests for Non-Nested Parametric Survival Models with a Cure Fraction: A Score Test Approach

Cynthia A. V. Tojeiro , Francisco Cribari-Neto , Tatiene C. Souza , Tarciana L. Pereira This is my paper

Pith reviewed 2026-07-03 19:23 UTC · model grok-4.3

classification 📊 stat.ME

keywords non-nested modelssurvival analysiscure fractionscore testmodel discriminationJ testMJ statisticparametric bootstrap

0 comments

The pith

Score tests on augmented log-likelihoods discriminate non-nested survival models with cure fractions using only null-model estimates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes J and MJ tests for choosing among non-nested parametric survival models that include a cure fraction and differ only in baseline distributions. The approach augments the null log-likelihood with information drawn from each competing model and then applies a score test to determine whether that added information is redundant. Because the procedure uses only restricted maximum likelihood estimates under the null, it avoids fitting any augmented models. The resulting MJ statistic pools the separate J tests to evaluate the global claim that at least one candidate model is correctly specified and simultaneously supplies a model-selection rule.

Core claim

The score statistic for two models reduces to a quadratic form in the sample mean of the individual log-likelihood differences; its signed square root coincides with Vuong's statistic, yet the test targets the specific null that a given model is the true data-generating process, employs an unsigned form that extends directly to M greater than or equal to 2 models, and estimates the Kullback-Leibler bias term by parametric bootstrap. The MJ statistic then combines the individual J tests to assess the global null that at least one candidate model is correctly specified while also serving as a model-selection criterion.

What carries the argument

The MJ statistic formed by combining individual J score tests applied to the null log-likelihood after augmentation with information from each competing model.

If this is right

The procedure requires only restricted maximum likelihood estimates from the null model.
For any pair of models the statistic is a quadratic form in the average log-likelihood difference.
The MJ statistic tests the global hypothesis that at least one of the candidate models is correctly specified.
The same statistic supplies a numerical criterion for selecting among the models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The bootstrap estimation of the bias term may improve finite-sample behavior compared with purely asymptotic approximations.
Analogous score-based tests could be constructed for non-nested models that differ in more than just the baseline hazard.
The global-null property of the MJ statistic may be useful in settings where several plausible cure-fraction specifications are under consideration simultaneously.

Load-bearing premise

The score test remains valid when the models differ only in their baseline distributions and the null model is correctly specified.

What would settle it

A Monte Carlo experiment that generates data from one of the candidate models, applies the MJ test, and checks whether the rejection rate under the global null stays at the nominal level.

Figures

Figures reproduced from arXiv: 2607.01379 by Cynthia A. V. Tojeiro, Francisco Cribari-Neto, Tarciana L. Pereira, Tatiene C. Souza.

**Figure 2.** Figure 2: Pointwise differences between the Kaplan–Meier estimates and the fitted LT survival [PITH_FULL_IMAGE:figures/full_fig_p018_2.png] view at source ↗

**Figure 3.** Figure 3: Kaplan–Meier estimates and fitted survival curves from the Generalized F-LT model [PITH_FULL_IMAGE:figures/full_fig_p019_3.png] view at source ↗

read the original abstract

We propose specification tests for discriminating among non-nested parametric survival models with a cure fraction, focusing on models that differ only in their baseline distributions. The proposed approach augments the null log-likelihood with information from competing models and applies a score test to assess whether the additional information is redundant. Because the test relies only on restricted maximum likelihood estimates, it avoids fitting augmented models. For two competing models, the score statistic reduces to a quadratic form in the sample mean of the individual log-likelihood differences. We show that its signed square root coincides with Vuong's test statistic, although our framework differs in three important respects: it tests the specific null hypothesis that a given model is the true data-generating process, it uses an unsigned statistic that extends naturally to $M \ge 2$ competing models, and it estimates the Kullback-Leibler bias by parametric bootstrap. The resulting MJ statistic combines the individual J tests to assess the global null hypothesis that at least one candidate model is correctly specified, while also providing a model-selection criterion.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper develops J- and MJ-type score tests for non-nested parametric cure-fraction survival models by adapting the score-test idea to Vuong's framework with three explicit changes and a global combination rule.

read the letter

The main takeaway is that this paper supplies score-based J tests for comparing non-nested parametric survival models that include a cure fraction, plus an MJ statistic that tests the global null that at least one of the candidates is correct.

The construction augments the null log-likelihood with information from the other models and forms a score test for whether that information is redundant. For two models the resulting statistic reduces to a quadratic form in the average log-likelihood differences, and its signed root matches Vuong's statistic. The authors modify the setup in three stated ways: they test the explicit null that the given model is the true process, they work with an unsigned version that extends directly to M models, and they replace the usual bias term with a parametric bootstrap estimate. The MJ combination then pools the separate J statistics without adding new regularity conditions. This is new for the cure-fraction case and the specific modifications are incremental but targeted.

The approach does well on convenience. Because the test uses only the restricted maximum likelihood estimates under each null, it avoids fitting any enlarged model. The stress-test note confirms that the argument stays internally consistent under the maintained assumption that models differ only in baseline distributions.

The soft spots are limited and mostly practical. Everything rests on the models differing solely in their baselines; if the cure fraction parameters or other components also vary, the reduction to the simple quadratic form may not hold. The abstract gives no simulation results or explicit asymptotic derivations, so the full paper must still demonstrate that the null distribution is reliable in moderate samples and that the bootstrap bias correction performs adequately. Those are routine checks rather than load-bearing problems.

The paper is for statisticians who fit parametric cure models and want a formal, non-nested comparison tool that stays within the score-test framework. A reader already working in that subfield will find the MJ procedure usable. It deserves peer review because the core logic is coherent and the extension is direct, even though the scope stays narrow.

Referee Report

0 major / 3 minor

Summary. The manuscript proposes J-type score tests for pairwise discrimination among non-nested parametric survival models with a cure fraction that differ only in baseline distributions, constructed by augmenting the null log-likelihood with terms from competing models and testing redundancy via the score statistic evaluated at restricted MLEs. For two models the resulting statistic is a quadratic form in the sample average of log-likelihood differences whose signed square root coincides with Vuong's statistic, while the unsigned version together with parametric-bootstrap bias correction extends directly to M models. The MJ statistic aggregates the individual J statistics to test the global null that at least one of the candidate models is correctly specified and simultaneously supplies a model-selection criterion.

Significance. If the asymptotic null distributions and finite-sample behavior hold, the approach supplies a computationally convenient specification test for cure-fraction survival models that avoids fitting augmented likelihoods, extends naturally to multiple competitors, and furnishes both pairwise and global testing plus selection. The reliance on restricted MLEs and the bootstrap bias correction are practical strengths.

minor comments (3)

The three respects in which the framework differs from Vuong's test (specific null, unsigned statistic, bootstrap bias) are stated in the abstract but should be contrasted explicitly, perhaps in a short table, in the introduction.
The regularity conditions required for the score test when the cure fraction is present (especially the behavior of the information matrix at the boundary) should be stated explicitly rather than left implicit under the 'models differ only in baseline' assumption.
Simulation design and power comparisons against existing non-nested tests for cure models would strengthen the finite-sample evidence; if already present, a clearer summary table would help.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the accurate and positive summary of our manuscript, which correctly captures the construction of the J-type score tests, their reduction to a Vuong-like form for two models, the extension via the MJ statistic to M models, and the use of parametric bootstrap for bias correction. We also appreciate the recognition of the computational advantages and the dual role as a specification test and model-selection criterion. The recommendation for minor revision is noted.

Circularity Check

0 steps flagged

No significant circularity; derivation follows standard score-test theory

full rationale

The paper constructs the J statistic by augmenting the null log-likelihood with terms from competing models and applying the score test for redundancy of that information. This reduces to a quadratic form in average log-likelihood differences by direct application of the score-test formula under the maintained null, without any fitted parameter being redefined as a prediction. The signed-root equivalence to Vuong's statistic is presented as a derived property rather than an input, and the MJ combination for the global null follows from standard multiple-testing aggregation without new regularity conditions or self-referential definitions. No load-bearing self-citations, uniqueness theorems, or ansatzes appear in the derivation chain. The approach remains self-contained against external score-test theory and is therefore scored at the default non-circularity level.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the method appears to rest on standard regularity conditions for score tests and on the assumption that models differ only in baseline distributions.

pith-pipeline@v0.9.1-grok · 5736 in / 1180 out tokens · 20188 ms · 2026-07-03T19:23:58.822650+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

[1]

Joseph Berkson and Robert P Gage

doi: 10.1109/TAC.1974.1100705. Joseph Berkson and Robert P Gage. Survival curve for cancer patients following treatment. Journal of the American Statistical Association, 47(259):501–515,

work page doi:10.1109/tac.1974.1100705 1974
[2]

Hoeffding, Probability inequalities for sums of bounded random variables, Journal of the American Statistical Association 58 (301) (1963) 13–30.doi:10.1080/01621459

doi: 10.1080/01621459. 1952.10501187. 20 John W Boag. Maximum likelihood estimates of the proportion of patients cured by cancer therapy.Journal of the Royal Statistical Society. Series B (Methodological), 11(1):15–53,

work page doi:10.1080/01621459 1952
[3]

URLhttps://www.jstor.org/stable/2983694. D. R. Cox. Tests of separate families of hypotheses.Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1:105–123,

work page arXiv
[4]

URL https://www.jstor.org/stable/2984232. D. R. Cox and D. V. Hinkley.Theoretical Statistics. Chapman and Hall/CRC, New York, 1 edition,

work page arXiv
[5]

doi: 10.1201/b14832. F. Cribari-Neto and S. E. F. Lucena. Non-nested hypothesis testing inference for GAMLSS models.Journal of Statistical Computation and Simulation, 87(6):1189–1205,

work page doi:10.1201/b14832
[6]

doi: 10.1080/00949655.2016.1255946. R. Davidson and J. G. MacKinnon. Several tests for model specification in the presence of alternative hypotheses.Econometrica, 49:781–793,

work page doi:10.1080/00949655.2016.1255946 2016
[7]

doi: 10.22004/ag.econ.275156. A. C. Davison and D. V. Hinkley.Bootstrap Methods and Their Applications. Cambridge University Press, Cambridge,

work page doi:10.22004/ag.econ.275156
[8]

doi: 10.1002/(SICI)1097-0258(19980430)17:8⟨831::AID-SIM790⟩3.0.CO;2-G. A. Hagemann. A simple test for regression specification with non-nested alternatives.Journal of Econometrics, 166:133–143,

work page doi:10.1002/(sici)1097-0258(19980430)17:8
[9]

Solomon Kullback and Richard A

doi: 10.1016/j.jeconom.2011.09.037. Solomon Kullback and Richard A. Leibler. On information and sufficiency.The Annals of Mathematical Statistics, 22(1):79–86,

work page doi:10.1016/j.jeconom.2011.09.037 2011
[10]

doi: 10.1214/aoms/1177729694. R. A. Maller and X. Zhou.Survival Analysis with Long-Term Survivors. Wiley, Chichester,

work page doi:10.1214/aoms/1177729694
[11]

Josemar Rodrigues, Vicente G Cancho, M´ ario de Castro, and Francisco Louzada-Neto

doi: 10.1177/0962280219893034. Josemar Rodrigues, Vicente G Cancho, M´ ario de Castro, and Francisco Louzada-Neto. On the unification of long-term survival models.Statistics and Probability Letters, 79(6):753–759,

work page doi:10.1177/0962280219893034
[12]

Virginie Rondeau, Emmanuel Schaffner, Fabien Corbi` ere, Juan R

doi: 10.1016/j.spl.2008.10.029. Virginie Rondeau, Emmanuel Schaffner, Fabien Corbi` ere, Juan R. Gonzalez, and Simone Mathoulin-P´ elissier. Cure frailty models for survival data: Application to recurrences for breast cancer and to hospital readmissions for colorectal cancer.Statistical Methods in Medical Research, 22(3):243–260,

work page doi:10.1016/j.spl.2008.10.029 2008
[13]

Gideon Schwarz

doi: 10.1177/0962280210395521. Gideon Schwarz. Estimating the dimension of a model.The Annals of Statistics, 6(2):461–464,

work page doi:10.1177/0962280210395521
[14]

Chien-Lin Su, Sy Han Chiou, Feng-Chang Lin, and Robert W

doi: 10.1214/aos/1176344136. Chien-Lin Su, Sy Han Chiou, Feng-Chang Lin, and Robert W. Platt. Analysis of survival data with cure fraction and variable selection: A pseudo-observations approach.Statistical Methods in Medical Research, 31(11):2037–2053,

work page doi:10.1214/aos/1176344136 2037
[15]

21 AD Tsodikov, JG Ibrahim, and AY Yakovlev

doi: 10.1177/09622802221108579. 21 AD Tsodikov, JG Ibrahim, and AY Yakovlev. Estimating cure rates from survival data: An alternative to two-component mixture models.Journal of the American Statistical Association, 98(464),

work page doi:10.1177/09622802221108579
[16]

doi: 10.1198/01622145030000001007. Q. H. Vuong. Likelihood ratio tests for model selection and non-nested hypotheses.Econometrica, 57:307–333,

work page doi:10.1198/01622145030000001007
[17]

Waloddi Weibull

doi: 10.2307/1912557. Waloddi Weibull. A statistical distribution function of wide applicability.Journal of Applied Mechanics, 18(3):293–297,

work page doi:10.2307/1912557
[18]

doi: 10.2307/1912526. H. White.Estimation, Inference and Specification Analysis. Cambridge University Press,

work page doi:10.2307/1912526

[1] [1]

Joseph Berkson and Robert P Gage

doi: 10.1109/TAC.1974.1100705. Joseph Berkson and Robert P Gage. Survival curve for cancer patients following treatment. Journal of the American Statistical Association, 47(259):501–515,

work page doi:10.1109/tac.1974.1100705 1974

[2] [2]

Hoeffding, Probability inequalities for sums of bounded random variables, Journal of the American Statistical Association 58 (301) (1963) 13–30.doi:10.1080/01621459

doi: 10.1080/01621459. 1952.10501187. 20 John W Boag. Maximum likelihood estimates of the proportion of patients cured by cancer therapy.Journal of the Royal Statistical Society. Series B (Methodological), 11(1):15–53,

work page doi:10.1080/01621459 1952

[3] [3]

URLhttps://www.jstor.org/stable/2983694. D. R. Cox. Tests of separate families of hypotheses.Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1:105–123,

work page arXiv

[4] [4]

URL https://www.jstor.org/stable/2984232. D. R. Cox and D. V. Hinkley.Theoretical Statistics. Chapman and Hall/CRC, New York, 1 edition,

work page arXiv

[5] [5]

doi: 10.1201/b14832. F. Cribari-Neto and S. E. F. Lucena. Non-nested hypothesis testing inference for GAMLSS models.Journal of Statistical Computation and Simulation, 87(6):1189–1205,

work page doi:10.1201/b14832

[6] [6]

doi: 10.1080/00949655.2016.1255946. R. Davidson and J. G. MacKinnon. Several tests for model specification in the presence of alternative hypotheses.Econometrica, 49:781–793,

work page doi:10.1080/00949655.2016.1255946 2016

[7] [7]

doi: 10.22004/ag.econ.275156. A. C. Davison and D. V. Hinkley.Bootstrap Methods and Their Applications. Cambridge University Press, Cambridge,

work page doi:10.22004/ag.econ.275156

[8] [8]

doi: 10.1002/(SICI)1097-0258(19980430)17:8⟨831::AID-SIM790⟩3.0.CO;2-G. A. Hagemann. A simple test for regression specification with non-nested alternatives.Journal of Econometrics, 166:133–143,

work page doi:10.1002/(sici)1097-0258(19980430)17:8

[9] [9]

Solomon Kullback and Richard A

doi: 10.1016/j.jeconom.2011.09.037. Solomon Kullback and Richard A. Leibler. On information and sufficiency.The Annals of Mathematical Statistics, 22(1):79–86,

work page doi:10.1016/j.jeconom.2011.09.037 2011

[10] [10]

doi: 10.1214/aoms/1177729694. R. A. Maller and X. Zhou.Survival Analysis with Long-Term Survivors. Wiley, Chichester,

work page doi:10.1214/aoms/1177729694

[11] [11]

Josemar Rodrigues, Vicente G Cancho, M´ ario de Castro, and Francisco Louzada-Neto

doi: 10.1177/0962280219893034. Josemar Rodrigues, Vicente G Cancho, M´ ario de Castro, and Francisco Louzada-Neto. On the unification of long-term survival models.Statistics and Probability Letters, 79(6):753–759,

work page doi:10.1177/0962280219893034

[12] [12]

Virginie Rondeau, Emmanuel Schaffner, Fabien Corbi` ere, Juan R

doi: 10.1016/j.spl.2008.10.029. Virginie Rondeau, Emmanuel Schaffner, Fabien Corbi` ere, Juan R. Gonzalez, and Simone Mathoulin-P´ elissier. Cure frailty models for survival data: Application to recurrences for breast cancer and to hospital readmissions for colorectal cancer.Statistical Methods in Medical Research, 22(3):243–260,

work page doi:10.1016/j.spl.2008.10.029 2008

[13] [13]

Gideon Schwarz

doi: 10.1177/0962280210395521. Gideon Schwarz. Estimating the dimension of a model.The Annals of Statistics, 6(2):461–464,

work page doi:10.1177/0962280210395521

[14] [14]

Chien-Lin Su, Sy Han Chiou, Feng-Chang Lin, and Robert W

doi: 10.1214/aos/1176344136. Chien-Lin Su, Sy Han Chiou, Feng-Chang Lin, and Robert W. Platt. Analysis of survival data with cure fraction and variable selection: A pseudo-observations approach.Statistical Methods in Medical Research, 31(11):2037–2053,

work page doi:10.1214/aos/1176344136 2037

[15] [15]

21 AD Tsodikov, JG Ibrahim, and AY Yakovlev

doi: 10.1177/09622802221108579. 21 AD Tsodikov, JG Ibrahim, and AY Yakovlev. Estimating cure rates from survival data: An alternative to two-component mixture models.Journal of the American Statistical Association, 98(464),

work page doi:10.1177/09622802221108579

[16] [16]

doi: 10.1198/01622145030000001007. Q. H. Vuong. Likelihood ratio tests for model selection and non-nested hypotheses.Econometrica, 57:307–333,

work page doi:10.1198/01622145030000001007

[17] [17]

Waloddi Weibull

doi: 10.2307/1912557. Waloddi Weibull. A statistical distribution function of wide applicability.Journal of Applied Mechanics, 18(3):293–297,

work page doi:10.2307/1912557

[18] [18]

doi: 10.2307/1912526. H. White.Estimation, Inference and Specification Analysis. Cambridge University Press,

work page doi:10.2307/1912526