
arxiv: 2512.10570 · v2 · submitted 2025-12-11 · 📊 stat.ML · cs.LG

Flexible Deep Neural Networks for Partially Linear Survival Data: Estimation and Survival Inference

Pith reviewed 2026-05-16 23:04 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords survival analysis · deep neural networks · partially linear models · semiparametric inference · hazard regression · asymptotic normality · cross-fitting

The pith

A partially linear DNN model for survival data achieves optimal nonparametric rates, efficient linear estimates, and the first frequentist pointwise confidence intervals for the survival function.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FLEXI-Haz, which models the hazard as a sum of a linear term for primary covariates and a deep neural network term for complex interactions among the remaining variables. This structure avoids the proportional hazards assumption required by standard Cox models. The authors prove that the neural network component attains minimax-optimal rates over composite Hölder classes, the linear coefficients are sqrt-n consistent and semiparametrically efficient, and a cross-fitted one-step estimator produces asymptotically normal pointwise intervals for the cumulative hazard and survival function of new subjects. Simulations and real-data examples illustrate practical gains in flexibility and interpretability over purely proportional-hazards or fully nonparametric alternatives.
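One way to write down such a specification is sketched below. This is only an illustration: the summary does not say on which scale the sum is taken, so the log-hazard scale is assumed here, and the symbols $Z$ (primary covariates), $X$ (remaining covariates), $\theta$ (linear coefficients), and $g$ (the DNN component, taking time as an input) are placeholders rather than the paper's notation.

```latex
\begin{align*}
\lambda(t \mid Z, X) &= \exp\{\theta^\top Z + g(t, X)\}, \\
\Lambda(t \mid Z, X) &= \int_0^t \lambda(s \mid Z, X)\,ds, \\
S(t \mid Z, X) &= \exp\{-\Lambda(t \mid Z, X)\}.
\end{align*}
```

Under this assumed form, the hazard ratio between two values of $X$ is $\exp\{g(t, x_1) - g(t, x_2)\}$, which may vary with $t$, so proportional hazards is not imposed on the nonparametric covariates.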

Core claim

In a partially linear survival model, the DNN nonparametric component converges at minimax-optimal rates over composite Hölder classes, the linear estimator is asymptotically normal and semiparametrically efficient, and cross-fitting yields a one-step estimator of the cumulative hazard whose pointwise asymptotic normality supplies valid confidence intervals for the survival function.
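The step from an interval for the cumulative hazard to one for the survival function can be sketched as follows, using the identity S = exp(-Λ). This is a minimal sketch, not the paper's procedure: the function name `survival_ci` and its arguments are hypothetical, and the standard error of the cumulative-hazard estimate is assumed to be supplied by whatever estimator is in use.

```python
import math
from statistics import NormalDist


def survival_ci(cum_hazard, std_err, level=0.95):
    """Map a Wald interval for the cumulative hazard to the survival scale.

    cum_hazard: point estimate of Lambda(t | covariates)   (assumed given)
    std_err:    standard error of that estimate            (assumed given)
    Returns (survival_estimate, lower, upper).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Wald interval on the cumulative-hazard scale, truncated at zero
    # because a cumulative hazard cannot be negative.
    lam_lo = max(cum_hazard - z * std_err, 0.0)
    lam_hi = cum_hazard + z * std_err
    # exp(-x) is decreasing, so the endpoints swap roles.
    return math.exp(-cum_hazard), math.exp(-lam_hi), math.exp(-lam_lo)


estimate, lower, upper = survival_ci(cum_hazard=0.5, std_err=0.1)
```

Because exp(-x) is monotone, the transformed interval has the same asymptotic coverage as the cumulative-hazard interval it is built from.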

What carries the argument

The partially linear hazard specification with DNN nonparametric component together with cross-fitting to construct the one-step cumulative-hazard estimator.
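The cross-fitting idea itself is generic: nuisance components are fitted on one part of the sample and evaluated only on held-out observations. The sketch below shows that pattern with a toy nuisance (a sample mean); the helper names `make_folds` and `cross_fit` are hypothetical, and nothing here reproduces the paper's estimator.

```python
import random


def make_folds(n, n_folds, seed=0):
    """Randomly partition indices 0..n-1 into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_folds] for k in range(n_folds)]


def cross_fit(data, fit_nuisance, evaluate, n_folds=5, seed=0):
    """Evaluate each observation with a nuisance fit that never saw it.

    fit_nuisance(train_rows) -> fitted nuisance object
    evaluate(nuisance, row)  -> per-observation quantity
    """
    out = [None] * len(data)
    for fold in make_folds(len(data), n_folds, seed):
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        nuisance = fit_nuisance(train)  # fitted on the complement only
        for i in fold:
            out[i] = evaluate(nuisance, data[i])
    return out


# Toy usage: the "nuisance" is a mean, the evaluation is a residual.
data = [float(i) for i in range(10)]
fit_mean = lambda train: sum(train) / len(train)
residuals = cross_fit(data, fit_mean, lambda m, x: x - m, n_folds=5)
```

In a setting like the one described, `fit_nuisance` would train the flexible components and `evaluate` would compute the plug-in term plus its correction on the held-out fold; that mapping is an interpretation, not something the summary spells out.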

Load-bearing premise

The hazard function is correctly specified as the sum of a linear term and a nonparametric function belonging to a composite Hölder class, with standard regularity conditions holding for the semiparametric asymptotics.

What would settle it

Generate data from a survival model whose hazard contains interactions between the designated linear covariates and the nonparametric covariates, then verify whether the empirical coverage of the proposed pointwise intervals falls substantially below the nominal level.
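A data-generating process of this kind, together with a coverage counter, might be sketched as below. Everything in it is an assumption made for illustration: the functional form, the parameter names `theta` and `gamma`, and the censoring mechanism are hypothetical, and the step that fits the model and produces intervals is deliberately left out.

```python
import math
import random


def hazard_rate(z, x, theta=0.5, gamma=1.0):
    """Time-constant hazard with a z*x interaction (gamma != 0 breaks
    the partially linear structure; gamma = 0 restores it)."""
    return math.exp(theta * z + math.sin(2.0 * math.pi * x) + gamma * z * x)


def simulate_misspecified(n, theta=0.5, gamma=1.0, censor_rate=0.3, seed=0):
    """Return rows (observed_time, event_indicator, z, x) with
    independent exponential censoring."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)    # designated "linear" covariate
        x = rng.uniform(0.0, 1.0)  # designated "nonparametric" covariate
        event_time = rng.expovariate(hazard_rate(z, x, theta, gamma))
        censor_time = rng.expovariate(censor_rate)
        rows.append((min(event_time, censor_time),
                     int(event_time <= censor_time), z, x))
    return rows


def true_survival(t, z, x, theta=0.5, gamma=1.0):
    """True S(t | z, x) under the constant-hazard design above."""
    return math.exp(-hazard_rate(z, x, theta, gamma) * t)


def empirical_coverage(intervals, truth):
    """Fraction of (lower, upper) intervals that contain the true value."""
    return sum(lo <= truth <= hi for lo, hi in intervals) / len(intervals)


rows = simulate_misspecified(n=200)
```

Repeating the simulation many times, computing one interval per replicate for a fixed new subject and time point, and passing the collected intervals to `empirical_coverage` would give the quantity to compare against the nominal level.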

Original abstract

We propose a flexible deep neural network (DNN) framework for modeling survival data within a partially linear regression structure. The approach preserves interpretability through a parametric linear component for covariates of primary interest, while a nonparametric DNN component captures complex time-covariate interactions among nuisance variables. We refer to the method as FLEXI-Haz, a FLEXIble Hazard model with a partially linear structure. In contrast to existing DNN approaches for partially linear Cox models, FLEXI-Haz does not rely on the proportional hazards assumption. We establish theoretical guarantees: the neural network component attains minimax-optimal convergence rates over composite Hölder classes, the linear estimator is sqrt-n-consistent, asymptotically normal, and semiparametrically efficient, and we develop a cross-fitted one-step estimator of the cumulative hazard and survival function for a new subject, together with pointwise asymptotic confidence intervals. To the best of our knowledge, this is the first frequentist asymptotic pointwise inference result for a survival function in a DNN survival model, with or without a linear component. Simulations and real-data analyses demonstrate the utility of FLEXI-Haz as a principled and interpretable alternative to methods based on proportional hazards.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper introduces FLEXI-Haz, a flexible deep neural network framework for partially linear survival data that does not assume proportional hazards. It establishes minimax-optimal convergence rates for the DNN component over composite Hölder classes; sqrt-n consistency, asymptotic normality, and semiparametric efficiency for the linear component; and a cross-fitted one-step estimator for the cumulative hazard and survival function, along with pointwise asymptotic confidence intervals. The work includes simulations and real-data analyses to demonstrate its utility.

Significance. If the theoretical results hold, this work is significant as it provides the first frequentist asymptotic pointwise inference results for survival functions in DNN survival models. It combines the flexibility of neural networks with interpretability of linear components, achieving optimal rates and efficient estimation without relying on the proportional hazards assumption, which is a common limitation in survival analysis.

minor comments (2)
  1. [Theoretical Results] The regularity conditions for semiparametric efficiency and cross-fitting could be listed more explicitly in the main theorem statements to facilitate verification.
  2. [Introduction] A more detailed comparison with existing DNN approaches for Cox models in the introduction would strengthen the motivation for avoiding the PH assumption.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading and positive assessment of the manuscript. We are pleased that the referee recognizes the significance of the first frequentist asymptotic pointwise inference results for survival functions in DNN-based survival models, as well as the combination of flexibility and interpretability without relying on the proportional hazards assumption. We will incorporate minor revisions to address any editorial suggestions.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper derives its asymptotic results for the DNN component, linear estimator, and survival function inference from standard semiparametric efficiency theory, neural-network approximation bounds over composite Hölder classes, and cross-fitting expansions. The one-step estimator for the cumulative hazard follows the usual influence-function expansion with remainders controlled by the established DNN rate; no equation reduces a claimed prediction or inference result to a fitted quantity by construction, and no load-bearing premise rests solely on self-citation. All steps invoke external regularity conditions and prior approximation theory that are independent of the target claims.
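For orientation, the usual influence-function expansion of a one-step estimator can be written in generic notation. The symbols below are placeholders and not the paper's: $\psi$ is the target functional (for instance a cumulative hazard at a fixed time and covariate value), $\varphi$ its influence function, $\hat P$ the fitted nuisance components, $P_0$ the truth, $P_n$ the empirical measure, and $R_2$ a second-order remainder.

```latex
\begin{align*}
\hat\psi_{\mathrm{os}} &= \psi(\hat P) + \frac{1}{n}\sum_{i=1}^{n} \varphi(O_i; \hat P), \\
\hat\psi_{\mathrm{os}} - \psi(P_0) &= \frac{1}{n}\sum_{i=1}^{n} \varphi(O_i; P_0)
  + (P_n - P_0)\{\varphi(\cdot\,; \hat P) - \varphi(\cdot\,; P_0)\}
  + R_2(\hat P, P_0).
\end{align*}
```

The first term drives the asymptotic normality. Evaluating $\varphi(\cdot\,;\hat P)$ on data not used to fit $\hat P$ is the standard way to control the second term, and a sufficiently fast nuisance rate is the standard way to make $R_2$ negligible, which matches the role the rationale assigns to cross-fitting and the DNN rate.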

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the partially linear hazard structure and on standard regularity conditions from semiparametric statistics and neural-network approximation theory; no new entities are postulated and no free parameters are fitted inside the theoretical statements themselves.

axioms (2)
  • domain assumption: The hazard function admits a partially linear decomposition with the nonparametric component belonging to a composite Hölder class.
    Invoked to obtain minimax-optimal rates for the DNN component and semiparametric efficiency for the linear component.
  • standard math: Standard regularity conditions for cross-fitting and asymptotic normality of one-step estimators hold.
    Required for the claimed sqrt-n consistency and pointwise confidence intervals.

pith-pipeline@v0.9.0 · 5513 in / 1589 out tokens · 42962 ms · 2026-05-16T23:04:52.770704+00:00 · methodology


