pith. sign in

arxiv: 2605.04086 · v1 · submitted 2026-04-23 · 📊 stat.ME

Focused Information Criteria for the Linear Hazard Regression Model

Pith reviewed 2026-05-09 21:08 UTC · model grok-4.3

classification 📊 stat.ME
keywords focused information criterionlinear hazard regressionAalen modelmodel selectioncumulative hazardmean squared errornonparametric survivalasymptotic approximation
0
0 comments X

The pith

The focused information criterion selects among linear hazard models by estimating the mean squared error of each model's cumulative hazard prediction at a given covariate vector.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces a selection method for candidate models within Aalen's linear hazard regression framework, a nonparametric approach to modeling survival times that avoids the proportional hazards assumption of the Cox model. It constructs the criterion by approximating the mean squared error of the estimated cumulative hazard function for any chosen covariate vector, then identifies the model that minimizes this estimated error. A reader would care because survival analysis often requires accurate cumulative hazard estimates for risk assessment, and no prior general tool existed for choosing among such nonparametric models. The paper also develops averaged versions of the criterion to handle selection across multiple focus points without committing to one specific covariate combination.

Core claim

The paper establishes that a focused information criterion for the linear hazard regression model can be built by deriving asymptotic expressions for the mean squared error of each candidate model's estimator of the cumulative hazard at a user-specified covariate vector, then selecting the model that yields the smallest such estimated error; averaged versions of the criterion are obtained by integrating or averaging the focus-specific quantities over a set of covariate vectors.

What carries the argument

The focused information criterion that estimates the mean squared error of the cumulative hazard estimator for each candidate linear hazard model at a fixed covariate vector.

If this is right

  • For any chosen covariate vector the method produces a model whose estimated cumulative hazard has the lowest approximated mean squared error among the candidates.
  • Averaged versions of the criterion allow selection of a single model that performs well on average across a distribution of covariate values.
  • The approach supplies a direct way to compare and rank nonparametric linear hazard models where standard likelihood-based criteria are unavailable.
  • Implementation requires only the usual asymptotic variance and bias terms already derived for Aalen's estimator.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same focused-error logic could be carried over to other additive hazard formulations or to settings with time-varying covariates.
  • In practice the criterion would let analysts tailor model choice to the covariate profile of a target population rather than using a global selector.
  • Software routines to compute the necessary asymptotic quantities would make the method immediately usable on standard survival datasets.

Load-bearing premise

The asymptotic approximations used to estimate the mean squared error of the cumulative hazard estimators remain accurate enough even under model misspecification or with moderate sample sizes.

What would settle it

A simulation experiment that generates data from a known true linear hazard model, applies the criterion to a collection of misspecified candidates, and checks whether the model with the lowest actual mean squared error for the cumulative hazard at the focus point is reliably chosen.

read the original abstract

The linear hazard regression model developed by Aalen is becoming an increasingly popular alternative to the Cox multiplicative hazard regression model. There are no methods in the literature for selecting among different candidate models of this nonparametric type, however. In the present paper a focused information criterion is developed for this task. The criterion works for each specified covariate vector, by estimating the mean squared error for each candidate model's estimate of the associated cumulative hazard rate; the finally selected model is the one with lowest estimated mean squared error. Averaged versions of the criterion are also developed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops a focused information criterion (FIC) for model selection among candidate models in Aalen's nonparametric linear hazard regression model. For any fixed covariate vector, the criterion estimates the mean squared error of each candidate model's estimator of the associated cumulative hazard function and selects the model minimizing this estimated MSE; averaged versions of the criterion are also derived.

Significance. If the asymptotic MSE estimators prove reliable, the work supplies the first focused model-selection tool for Aalen's additive hazard model, a popular nonparametric alternative to the Cox model. This addresses a clear methodological gap and could improve targeted estimation of cumulative hazards in survival applications where interest centers on specific covariate profiles.

major comments (2)
  1. [§3.2] §3.2, the asymptotic expansion of the MSE estimator (around Eq. (8)–(10)): the bias-correction term is derived under local misspecification, yet the paper does not establish that the estimator remains consistent for the true MSE when the misspecification is global; the martingale residual variance estimator used for the variance component is known to be consistent only under correct specification or local alternatives, leaving a potential non-vanishing bias in the estimated MSE that would undermine the selection guarantee.
  2. [§4.1] §4.1, finite-sample simulation design: the reported coverage and selection frequencies assume the local-misspecification regime used in the asymptotics; no results are shown for moderate-to-large misspecification or for small sample sizes where the integrated martingale residuals may produce substantial finite-sample bias in the MSE estimate, which is the weakest assumption identified for the central claim.
minor comments (2)
  1. [§2] The notation for the cumulative hazard estimator and its asymptotic variance could be made more explicit by distinguishing the true cumulative hazard Λ(t|x) from its estimator Λ̂(t|x) throughout §2 and §3.
  2. [Table 1] Table 1 and Figure 2 would benefit from clearer axis labels indicating whether the plotted quantities are estimated MSE or true MSE under the simulation truth.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the careful reading and constructive comments. We respond to each major comment below.

read point-by-point responses
  1. Referee: [§3.2] §3.2, the asymptotic expansion of the MSE estimator (around Eq. (8)–(10)): the bias-correction term is derived under local misspecification, yet the paper does not establish that the estimator remains consistent for the true MSE when the misspecification is global; the martingale residual variance estimator used for the variance component is known to be consistent only under correct specification or local alternatives, leaving a potential non-vanishing bias in the estimated MSE that would undermine the selection guarantee.

    Authors: The FIC derivation follows the standard local-misspecification framework used throughout the focused model selection literature to ensure that bias and variance contributions remain of comparable order in the limiting MSE. Under global misspecification the bias term would dominate, so the criterion would still favor models with smaller focused bias; however, we have not proved consistency of the MSE estimator itself in that regime, and the martingale-based variance estimator can retain bias. We will add an explicit statement of the local-misspecification assumption together with a remark on this limitation. revision: partial

  2. Referee: [§4.1] §4.1, finite-sample simulation design: the reported coverage and selection frequencies assume the local-misspecification regime used in the asymptotics; no results are shown for moderate-to-large misspecification or for small sample sizes where the integrated martingale residuals may produce substantial finite-sample bias in the MSE estimate, which is the weakest assumption identified for the central claim.

    Authors: The existing simulations are deliberately aligned with the local-misspecification asymptotics. We agree that additional experiments under moderate-to-large misspecification and smaller sample sizes would strengthen the empirical evidence. We will include these new scenarios in the revision. revision: yes

standing simulated objections not resolved
  • Proof of consistency of the MSE estimator under global misspecification

Circularity Check

0 steps flagged

FIC derivation for Aalen linear hazard model is self-contained via asymptotic MSE estimation

full rationale

The paper derives the focused information criterion by constructing an estimator of the mean squared error for the cumulative hazard at a fixed covariate vector, using standard asymptotic expansions for Aalen's additive hazard estimator based on martingale theory. This is an independent derivation adapting existing nonparametric asymptotics rather than reducing any prediction or selection rule to a fitted quantity or self-citation by construction. No load-bearing step equates the output criterion to its inputs; the method remains falsifiable against external data and benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Full details on assumptions and parameters are absent from the provided abstract; the criterion likely depends on standard asymptotic results for Aalen estimators.

axioms (1)
  • domain assumption Asymptotic normality and consistency hold for the cumulative hazard estimators in the linear hazard model
    The MSE estimation in FIC typically relies on large-sample theory for bias and variance approximations.

pith-pipeline@v0.9.0 · 5370 in / 1197 out tokens · 34059 ms · 2026-05-09T21:08:58.567410+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages

  1. [1]

    and Keiding, N

    Andersen, P.K., Borgan, Ø., Gill, R. and Keiding, N. (1993).Statistical Models Based on Counting Processes.Springer-Verlag, Heidelberg

  2. [2]

    and Hjort, N.L

    Claeskens, G. and Hjort, N.L. (2003). The focused information criterion [with discussion],Journal of the American Statistical Association,98, 900–916 and 938–945

  3. [3]

    and Hjort, N.L

    Claeskens, G. and Hjort, N.L. (2007). Minimising average risk in regres- sion models.Econometric Theory, to appear

  4. [4]

    and Claeskens, G

    Hjort, N.L. and Claeskens, G. (2003). Frequentist average estimators [with discussion],Journal of the American Statistical Association,98, 879–899 and 938–945

  5. [5]

    and Claeskens, G

    Hjort, N.L. and Claeskens, G. (2006). Focused information criteria and model averaging for the Cox hazard regression model,Journal of the American Statistical Association,101, 1449–1464

  6. [6]

    and McKeague, I.W

    Huffer, F. and McKeague, I.W. (1991). Weighted least squares estima- tion for Aalen’s additive risk model.Journal of the American Statistical Association,86, 114–129

  7. [7]

    and Sasieni, P.D

    McKeague, I.W. and Sasieni, P.D. (1994). A partly parametric additive risk model,Biometrika,81, 501–514

  8. [8]

    and Scheike, Thomas H

    Martinussen, T. and Scheike, Thomas H. (2006).Dynamic Regression Models for Survival Data.Springer-Verlag, New York

  9. [9]

    Aalen, O.O. (1980). A model for nonparametric regression analysis of counting processes, inMathematical Statistics and Probability Theory (eds. W. Klonecki, A. Kozek and J. Rosinski). Proceedings of the 6th international conference, Wisla, Poland, 1–25