TabSurv: Adapting Modern Tabular Neural Networks to Survival Analysis

Andrei Konstantinov; Lev Utkin; Stanislav Kirpichenko

arXiv: 2605.03944 · v1 · submitted 2026-05-05 · cs.LG · cs.AI · stat.ML

Pith reviewed 2026-05-07 00:39 UTC · model grok-4.3

classification: cs.LG · cs.AI · stat.ML

keywords: survival, tabsurv, analysis, deep, tabular, modern, approach, architectures

The pith

TabSurv demonstrates that modern tabular architectures plus a histogram loss for censored data yield higher average C-index than RSF, DeepSurv, DeepHit and SurvTRACE on ten survival datasets, with Weibull-parameterized ensembles ranking highest.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Survival analysis predicts how long until an event occurs when some observations are censored, meaning the event has not yet been seen. TabSurv takes standard tabular networks that already work well on ordinary tables and adds either a Weibull distribution head or a non-parametric output layer. Training uses a histogram loss that directly accounts for the censored cases instead of discarding them. The authors also train several such networks in parallel and average their predicted survival distributions rather than their parameters. On ten different real-world collections the approach records better average ranking by the C-index concordance measure than several established survival methods. The largest gains appear when the Weibull head is used inside the ensemble.
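To make the idea of a loss that "accounts for the censored cases instead of discarding them" concrete, here is a minimal sketch of a generic discrete-time (histogram) negative log-likelihood for a single subject. It is not the SurvHL loss, whose definition is not given here; the function name `censored_histogram_nll` and the convention that a censored subject contributes the probability mass from its own bin onward are assumptions made for illustration.

```python
import math

def censored_histogram_nll(probs, bin_index, event_observed, eps=1e-12):
    """Negative log-likelihood of one subject under a histogram model.

    probs          -- predicted probability mass per time bin (sums to 1)
    bin_index      -- bin containing the event time or the censoring time
    event_observed -- True if the event was seen, False if right-censored
    """
    if event_observed:
        # Event case: use the mass placed on the bin where the event happened.
        likelihood = probs[bin_index]
    else:
        # Censored case (assumed convention): the event lies in this bin or
        # a later one, so use the total mass from bin_index onward.
        likelihood = sum(probs[bin_index:])
    return -math.log(max(likelihood, eps))

# Toy prediction over four time bins; the numbers are made up.
probs = [0.1, 0.2, 0.3, 0.4]
loss_event = censored_histogram_nll(probs, 2, True)      # -log(0.3)
loss_censored = censored_histogram_nll(probs, 2, False)  # -log(0.3 + 0.4)
```

A censored subject still produces a gradient signal here: the model is pushed to keep mass at or after the censoring bin, which is the sense in which such observations are used rather than dropped.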

Core claim

Our results show that TabSurv consistently outperforms on average established classical and deep learning baselines, such as RSF, DeepSurv, DeepHit, SurvTRACE. Notably, deep ensembles with Weibull parametrization instead of non-parametric models achieve the highest average rank by C-index.
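The ranking in this claim is by C-index. As a reminder of what that number measures, here is a minimal sketch of Harrell's concordance index in plain Python. It is a textbook formulation, not the evaluation code used in the paper; the function name `concordance_index` and the toy data are assumptions, and pairs with tied times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs that are ordered correctly.

    A pair (i, j) is comparable when subject i has an observed event strictly
    before the time recorded for subject j. The pair is concordant when the
    subject with the earlier event also has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: the second subject is censored at t = 4.
times = [2.0, 4.0, 6.0, 8.0]
events = [True, False, True, True]
risks = [0.9, 0.2, 0.1, 0.7]
c_index = concordance_index(times, events, risks)  # 3 of 4 comparable pairs
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so small differences in average C-index can translate into large differences in rank.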

Load-bearing premise

That the reported average improvement on the ten chosen datasets will generalize to new tabular survival problems and that the novel SurvHL loss does not overfit the particular censoring patterns present in those datasets.

Original abstract

Survival analysis on tabular data is a well-studied problem. However, existing deep learning methods are often highly task-specific, which can limit the transfer of new approaches from other domains and introduce constraints that may affect performance. We propose TabSurv, an approach that adapts modern tabular architectures to survival analysis using either the Weibull distribution or non-parametric survival prediction. TabSurv optimizes SurvHL, a novel histogram loss function supporting censored data. In addition to a baseline feed-forward network, we implement deep ensembles of MLPs for survival analysis within TabSurv. In contrast to prior work, the ensemble components are trained in parallel, optimizing survival distribution parameters before averaging, which promotes diversity across ensemble component predictions. We perform a comprehensive empirical evaluation of different proposed architectures on 10 diverse real-world survival datasets. Our results show that TabSurv consistently outperforms on average established classical and deep learning baselines, such as RSF, DeepSurv, DeepHit, SurvTRACE. Notably, deep ensembles with Weibull parametrization instead of non-parametric models achieve the highest average rank by C-index. Overall, our study clarifies how modern tabular neural networks can be adapted and trained to tackle survival analysis problems, offering a strong and reliable approach. The TabSurv implementation is publicly available.
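One reading of "optimizing survival distribution parameters before averaging" is that each ensemble member outputs its own Weibull parameters and the ensemble averages the resulting survival curves. The sketch below, written under that assumption, shows that averaging curves and averaging parameters give different answers. The helper names and the (scale, shape) pairs are hypothetical and are not taken from the TabSurv implementation.

```python
import math

def weibull_survival(t, scale, shape):
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

def ensemble_survival(t, members):
    """Average the members' survival probabilities at time t."""
    return sum(weibull_survival(t, s, k) for s, k in members) / len(members)

# Two hypothetical ensemble members, each described by (scale, shape).
members = [(5.0, 1.0), (10.0, 2.0)]
t = 5.0

# Option A: average the predicted survival curves.
averaged_curves = ensemble_survival(t, members)

# Option B: average the parameters first, then evaluate a single Weibull.
mean_scale = sum(s for s, _ in members) / len(members)
mean_shape = sum(k for _, k in members) / len(members)
averaged_params = weibull_survival(t, mean_scale, mean_shape)
```

Option A yields a mixture of Weibull curves, which is generally no longer a Weibull distribution, so the two options can disagree even when every member is well calibrated.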

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

TabSurv adds a censored histogram loss and parallel-trained distribution ensembles to modern tabular backbones, then reports the best mean C-index rank across ten datasets.

The main contributions are the SurvHL loss, which bins the survival distribution while explicitly handling censoring, and the parallel ensemble training, which fits each member to distribution parameters before averaging. Both are concrete and appear new relative to the cited baselines such as DeepHit and SurvTRACE. The paper also ships public code and runs the comparison on ten real datasets against RSF, DeepSurv, and the other baselines, which is useful for practitioners who need a drop-in tabular survival method rather than another bespoke architecture. That part is straightforward and worth having on record.

The soft spots are exactly where the stress-test note flags them. The abstract gives no detail on dataset selection, on whether SurvHL binning and weighting were locked before seeing the test sets, or on per-dataset variance and corrected significance numbers. An average-rank win can easily shift if any of those choices were post hoc or dataset-specific. Because only the abstract is available, it is impossible to judge whether the Weibull ensemble edge is robust or an artifact of the particular collection.

This is the kind of paper that belongs in a specialized survival-analysis or tabular-methods venue once the experimental protocol is fully documented. A serious referee should see it, but only after the authors supply the missing robustness checks and hyperparameter schedules; without those, the central claim stays provisional. I would bring it to a reading group for the loss formulation and ensemble trick, but I would not cite it yet.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes TabSurv, an adaptation of modern tabular neural architectures (including MLPs and deep ensembles) to survival analysis. It supports either parametric Weibull or non-parametric survival predictions and introduces a novel histogram loss SurvHL that accommodates right-censored data. Deep-ensemble members are trained in parallel on distribution parameters before averaging. A comprehensive evaluation on ten real-world datasets is reported to show that TabSurv variants, especially Weibull deep ensembles, obtain the highest average C-index rank, outperforming classical baselines (RSF) and prior deep methods (DeepSurv, DeepHit, SurvTRACE).

Significance. If the reported ranking proves robust, the work would offer a practical route for transferring recent tabular-model advances to survival tasks without task-specific architectural redesigns. The parallel-ensemble training and SurvHL loss are potentially reusable components. However, the abstract supplies no experimental protocol, so the practical significance cannot yet be evaluated.

major comments (2)

Abstract: the central claim that 'TabSurv consistently outperforms on average' and that 'deep ensembles with Weibull parametrization achieve the highest average rank' rests entirely on unreported experimental details. No information is given on how the ten datasets were selected or stratified by censoring rate/event density, how SurvHL binning/weighting hyperparameters were chosen or validated, or whether per-dataset variance and multiplicity-corrected significance tests accompany the average-rank comparison. These omissions render the headline empirical result unverifiable from the provided text.
Abstract: the description of the deep-ensemble procedure ('components are trained in parallel, optimizing survival distribution parameters before averaging') is too terse to determine whether the claimed diversity benefit is realized or whether it differs substantively from standard deep ensembles or from SurvTRACE-style approaches already in the literature.
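To illustrate why the first comment asks for per-dataset detail alongside the average rank, here is a minimal sketch of how an average rank is typically computed from per-dataset C-index values. The helper names, the tie-handling rule (tied methods share the mean rank), and all numbers are assumptions made for illustration; none of them are results from the paper.

```python
def rank_descending(values):
    """Rank values so the largest gets rank 1; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def average_rank(scores_by_dataset):
    """scores_by_dataset: {dataset: {method: c_index}} -> {method: mean rank}."""
    methods = sorted(next(iter(scores_by_dataset.values())))
    totals = {m: 0.0 for m in methods}
    for scores in scores_by_dataset.values():
        ranks = rank_descending([scores[m] for m in methods])
        for m, r in zip(methods, ranks):
            totals[m] += r
    return {m: totals[m] / len(scores_by_dataset) for m in methods}

# Made-up C-index values; dataset and method names are placeholders.
toy = {
    "dataset_a": {"method_x": 0.70, "method_y": 0.68, "method_z": 0.65},
    "dataset_b": {"method_x": 0.60, "method_y": 0.75, "method_z": 0.60},
}
mean_ranks = average_rank(toy)
```

In this toy case `method_y` has the best average rank even though it loses on one of the two datasets, which is the kind of spread an average alone does not reveal.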

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No theoretical derivation is offered; the paper rests on standard neural-network assumptions plus the implicit claim that the new histogram loss is a valid proper scoring rule for censored survival data. No free parameters or invented entities are declared in the abstract.
