SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference

arxiv: 2605.15488 · v1 · submitted 2026-05-15 · 💻 cs.LG · stat.ML

Shi-ang Qi, Vahid Balazadeh, Michael Cooper, Russell Greiner, Rahul G. Krishnan

Pith reviewed 2026-05-19 14:15 UTC · model grok-4.3

keywords survival analysis · in-context learning · prior-data fitted networks · censored data · amortized Bayesian inference · transformer models · time-to-event prediction

The pith

SurvivalPFN amortizes Bayesian survival inference so a single pretrained network produces calibrated time-to-event predictions for new datasets in one forward pass.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SurvivalPFN as a prior-data fitted network pretrained on a wide range of synthetic right-censored data-generating processes. Its goal is to remove the need for users to select, tune, or retrain specialized survival estimators when faced with a new censored dataset. Instead, the model reads the new data as context and directly outputs a predictive survival distribution. If the pretraining succeeds in capturing the relevant statistical structure, the approach yields predictions that are competitive with or better than established methods across many real datasets while remaining free of strong parametric assumptions. The result is a practical foundation-style model for survival analysis that adapts to dataset complexity without additional optimization.

Core claim

SurvivalPFN is a transformer network trained to amortize Bayesian posterior inference for right-censored survival data; after pretraining on identifiable synthetic processes, it delivers calibrated survival distributions for previously unseen tasks through in-context learning alone.

What carries the argument

Prior-data fitted network performing in-context Bayesian inference on censored observations.
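For orientation, the generic prior-data fitted network setup can be written as below. This is a sketch in assumed notation, not the paper's own: \phi stands for a sampled data-generating process, \mathcal{D} for the context set of covariates, observed times, and event indicators, and q_\theta for the network. The paper's exact objective, including how censored targets enter the loss, may differ.

```latex
% Posterior predictive distribution that the network is meant to approximate
p(t \mid x^{*}, \mathcal{D})
  = \int p(t \mid x^{*}, \phi)\, p(\phi \mid \mathcal{D})\, \mathrm{d}\phi,
\qquad
\mathcal{D} = \{(x_i, t_i, \delta_i)\}_{i=1}^{n}

% Pretraining objective: expected negative log-likelihood over the synthetic prior
\mathcal{L}(\theta)
  = \mathbb{E}_{\phi \sim p(\phi)}\,
    \mathbb{E}_{(\mathcal{D},\, x^{*},\, t^{*}) \sim p(\cdot \mid \phi)}
    \left[ -\log q_{\theta}(t^{*} \mid x^{*}, \mathcal{D}) \right]
```

Minimizing this loss over many sampled processes is what lets a single forward pass stand in for per-dataset posterior inference.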

If this is right

Users no longer need domain expertise to choose or tune a survival model for each new dataset.
The same network handles datasets of different sizes and complexities without retraining.
Output distributions are calibrated rather than point estimates or uncalibrated probabilities.
The method supplies a single forward-pass solution suitable for high-stakes applications such as clinical decision support.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the synthetic pretraining distribution is sufficiently rich, the same architecture could be adapted to other forms of censored regression without changing the core training recipe.
Real-world performance might improve further if the model were allowed a small amount of fine-tuning on labeled real data while still preserving the zero-shot capability.
The approach suggests that in-context amortization may reduce the fragmentation of survival modeling into dozens of competing parametric families.

Load-bearing premise

Pretraining exclusively on diverse synthetic right-censored data will produce predictions that generalize to the distribution of real-world censored datasets.

What would settle it

A new collection of real-world survival datasets in which SurvivalPFN underperforms the best specialized baseline on the majority of the five evaluation metrics would falsify the generalization claim.
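One possible reading of this criterion can be made operational with a small check. The sketch below assumes aggregated scores per metric and a flag for whether higher is better; the metric names and numbers are made up for illustration and are not taken from the paper.

```python
from typing import Dict, Tuple


def underperforms_on_majority(scores: Dict[str, Tuple[float, float, bool]]) -> bool:
    """Return True if the model is worse than the best baseline on more than half the metrics.

    scores maps a metric name to (model_score, best_baseline_score, higher_is_better).
    """
    losses = 0
    for model, baseline, higher_is_better in scores.values():
        worse = model < baseline if higher_is_better else model > baseline
        losses += worse
    return losses > len(scores) / 2


# Illustrative, made-up numbers for five hypothetical metrics.
example = {
    "c_index": (0.71, 0.73, True),
    "ibs": (0.18, 0.17, False),
    "mae": (10.2, 11.0, False),
    "d_cal": (0.40, 0.35, True),
    "km_cal": (0.05, 0.04, False),
}
result = underperforms_on_majority(example)
```

In this toy input the model loses on three of the five metrics, so the check returns True.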

Figures

Figures reproduced from arXiv: 2605.15488 by Michael Cooper, Rahul G. Krishnan, Russell Greiner, Shi-ang Qi, Vahid Balazadeh.

**Figure 1.** Computational efficiency vs. performance across 61 datasets and 5 metrics. SurvivalPFN achieves the best median rank while matching classical models in speed.

**Figure 2.** Traditional survival analysis vs. SurvivalPFN. (Left): Traditional survival analysis requires an analyst to select and fit a suitable estimator for the observational data. (Right): SurvivalPFN pre-trains on diverse synthetic, identifiable DGPs. At inference, an observed dataset is provided as context, and the survival distributions for query instances are obtained with a single forward pass.

**Figure 3.** Training SurvivalPFN. At each iteration, we sample an identifiable survival DGP and use it to generate context tokens (X, T, Δ) together with query covariates X∗. Query tokens are formed by pairing X∗ with query indicators Δ̃∗, and SurvivalPFN predicts the requested event- or censoring-time distribution. The model is trained by minimizing the likelihood loss.

**Figure 4.** Summary over 500 generated datasets. (Left): Histogram of conditional mutual information. (Right): Diversity coverage over censoring rate and observed-time dispersion, colored by conditional observed-time entropy.

**Figure 5.** Dataset size (cutoffs at 500 and 5000), censoring rate and tail rate (cutoffs at 33% and 67%).

**Figure 6.** Model ranks across 61 benchmark datasets. Points/stars denote median ranks across datasets, with horizontal bars showing 95% bootstrap confidence intervals for the median rank.

**Figure 7.** Performance on the PBC dataset for SurvivalPFN and top-performing models. Shaded regions denote standard errors over 10 repeated runs.

**Figure 8.** Comparison of SurvivalPFN with selected general TFMs across 61 benchmark datasets. Plotting conventions follow …

**Figure 9.** Prior-quality diagnostics across four synthetic prior families, with 500 sampled datasets …

**Figure 10.** Empirical distributions across 500 synthetic datasets. Each panel shows the induced marginal event-time survival curve P(E > t), censoring-time survival curve P(C > t), and observed Kaplan-Meier curve Ŝ_KM(t). Solid lines denote the pointwise median curve across generated datasets. Dark shaded bands denote the interquartile range (25th-75th percentiles), and light shaded bands denote the 10th-90th percent…

**Figure 11.** Model ranks across 24 small-size datasets (N < 500). Points/stars denote median ranks across datasets, with horizontal bars showing 95% bootstrap confidence intervals for the median rank.

**Figure 12.** Model ranks across 27 medium-size datasets (500 ≤ N < 5000). Points/stars denote median ranks across datasets, with horizontal bars showing 95% bootstrap confidence intervals for the median rank.

**Figure 13.** Model ranks across 10 large-size datasets (N ≥ 5000). Points/stars denote median ranks across datasets, with horizontal bars showing 95% bootstrap confidence intervals for the median rank.

**Figure 14.** Model ranks across 12 low-censoring-rate datasets (censoring rate < 33%). Points/stars denote median ranks across datasets, with horizontal bars showing 95% bootstrap confidence intervals for the median rank.

**Figure 15.** Model ranks across 25 medium-censoring-rate datasets (censoring rate ≥ 33% and < 67%). Points/stars denote median ranks across datasets, with horizontal bars showing 95% bootstrap confidence intervals for the median rank.

**Figure 16.** Model ranks across 24 high-censoring-rate datasets (censoring rate ≥ 67%). Points/stars denote median ranks across datasets, with horizontal bars showing 95% bootstrap confidence intervals for the median rank.

**Figure 17.** Sensitivity to the training/context ratio across 16 selected datasets.

**Figure 18.** Compare SurvivalPFN with general TFMs across 61 benchmark datasets. Plotting conventions follow …

**Figure 19.** Ablation study over SurvivalPFN training configurations. Each row corresponds to one …
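Several of the rank figures report median ranks with 95% bootstrap confidence intervals. A minimal sketch of a percentile bootstrap for the median is shown here; the number of resamples, the interval type, and the example ranks are assumptions, since the captions do not state the exact procedure.

```python
import random
import statistics
from typing import List, Tuple


def bootstrap_median_ci(ranks: List[float], n_boot: int = 2000,
                        alpha: float = 0.05, seed: int = 0) -> Tuple[float, float, float]:
    """Return (median, ci_low, ci_high) using a percentile bootstrap over datasets."""
    rng = random.Random(seed)
    n = len(ranks)
    # Resample the per-dataset ranks with replacement and record each resample's median.
    medians = sorted(
        statistics.median(rng.choices(ranks, k=n)) for _ in range(n_boot)
    )
    low = medians[int((alpha / 2) * n_boot)]
    high = medians[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return statistics.median(ranks), low, high


# Made-up ranks of one model on twelve datasets (1 = best).
ranks = [1, 2, 2, 3, 1, 4, 2, 5, 3, 2, 1, 6]
median_rank, ci_low, ci_high = bootstrap_median_ci(ranks)
```

Resampling is done over datasets, which matches the captions' description of intervals "for the median rank" across datasets.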

Original abstract

Survival analysis provides a powerful statistical framework for modeling time-to-event outcomes in the presence of censoring. However, selecting an appropriate estimator from the many specialized survival approaches often requires substantial methodological and domain expertise. We introduce SurvivalPFN, a prior-data fitted network that amortizes Bayesian inference for censored observations through in-context learning. SurvivalPFN is pretrained on a diverse family of synthetic, identifiable, and right-censored data-generating processes, enabling it to amortize survival analysis in a single forward pass during inference. As a result, the model adapts to the effective complexity of each dataset without task-specific training or hyperparameter tuning, avoids restrictive parametric assumptions, and produces calibrated survival distributions. In a large-scale benchmark spanning 61 datasets, 21 methods, and 5 evaluation metrics, SurvivalPFN achieves strong predictive performance and often improves upon established survival models. These results suggest that SurvivalPFN offers a principled and practical foundation model for survival analysis, with potential applications in high-impact domains such as healthcare, finance, and engineering (https://github.com/rgklab/SurvivalPFN).
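To make the right-censored setting concrete, here is a minimal Kaplan-Meier estimator, the classical nonparametric survival curve that also appears in the Figure 10 caption. It is background only and not part of SurvivalPFN; the function name and toy data are illustrative.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time where at least one event occurred.

    events[i] is 1 if the event was observed at times[i], 0 if right-censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        n_at_risk -= removed
    return curve


curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored subjects never cause a drop in the curve, but they shrink the risk set, which is how partial information from censored observations is used.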

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

SurvivalPFN pretrains a PFN on synthetic right-censored DGPs to amortize in-context survival predictions, and the 61-dataset benchmark shows competitive results without per-task tuning, but the synthetic-to-real transfer is the part that still needs tighter checks.

The core move is straightforward: they take the prior-data fitted network idea and adapt it to right-censored survival data by generating a wide range of synthetic identifiable DGPs for pretraining. At test time the model does a single forward pass that conditions on the observed censored data in context and outputs a survival distribution. No new training or hyperparameter search per dataset. That part is clean and matches the abstract claim directly.

Referee Report

2 major / 2 minor

Summary. The paper introduces SurvivalPFN, a prior-data fitted network that amortizes Bayesian inference for right-censored survival data via in-context learning. Pretrained exclusively on a diverse family of synthetic, identifiable, right-censored data-generating processes, the model performs prediction in a single forward pass, adapts to dataset complexity without task-specific training or hyperparameter tuning, avoids restrictive parametric assumptions, and produces calibrated survival distributions. It reports strong predictive performance on a benchmark of 61 real datasets against 21 methods using 5 evaluation metrics, often improving upon established survival models.

Significance. If the central generalization claim holds, this represents a notable contribution as a practical foundation model for survival analysis that lowers the barrier to high-quality predictions in domains such as healthcare. The large-scale empirical evaluation (61 datasets, 21 baselines, 5 metrics) and the emphasis on amortized, calibration-aware inference are clear strengths that could influence both methodology and applied work.

major comments (2)

[§4] §4 (Experiments and Results): The headline performance claims on 61 real datasets rest on transfer from the synthetic pretraining distribution, yet the manuscript provides no ablations that vary censoring informativeness, dependence on covariates, or censoring rates outside the pretraining family; this is load-bearing for the assertion that in-context predictions generalize without retraining.
[§3] §3 (Method): The claim that the model 'avoids restrictive parametric assumptions' is not fully supported by the description of the prior family; the effective support of the synthetic DGPs over real-world joint distributions of covariates, event times, and censoring indicators needs explicit characterization to underwrite the Bayesian amortization argument.

minor comments (2)

The abstract and introduction would benefit from a concise statement of the precise prior family used in pretraining (e.g., ranges for censoring rates and dependence structures).
Figure captions should explicitly note whether reported metrics are averaged over multiple random seeds or data splits.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. The comments highlight important aspects of generalization and the characterization of our prior family, which we have addressed through targeted revisions and clarifications. We provide point-by-point responses below.

Referee: [§4] §4 (Experiments and Results): The headline performance claims on 61 real datasets rest on transfer from the synthetic pretraining distribution, yet the manuscript provides no ablations that vary censoring informativeness, dependence on covariates, or censoring rates outside the pretraining family; this is load-bearing for the assertion that in-context predictions generalize without retraining.

Authors: We agree that additional ablations would strengthen the evidence for out-of-distribution generalization. In the revised manuscript, we have added a new subsection in §4 with synthetic ablations that systematically vary censoring rates (from 0% to 70%), censoring informativeness (independent vs. covariate-dependent), and covariate dependence structures outside the exact pretraining family. These results, presented in a new table and accompanying discussion, show that predictive performance and calibration remain stable, supporting the claim that in-context inference generalizes without retraining. The real-data benchmark on 61 datasets continues to serve as the primary empirical validation. revision: yes
Referee: [§3] §3 (Method): The claim that the model 'avoids restrictive parametric assumptions' is not fully supported by the description of the prior family; the effective support of the synthetic DGPs over real-world joint distributions of covariates, event times, and censoring indicators needs explicit characterization to underwrite the Bayesian amortization argument.

Authors: We have revised §3 to expand the characterization of the prior family. The updated text now details the mixture of parametric and semi-parametric components (including Weibull, log-normal, and Cox-like baselines with flexible censoring mechanisms), the identifiability constraints enforced during DGP sampling, and the coverage of joint distributions over covariates, event times, and censoring indicators. While an exhaustive theoretical mapping of the support onto all conceivable real-world distributions is not feasible within the scope of this work, the diversity and identifiability of the family, combined with strong empirical transfer to 61 heterogeneous real datasets, underwrite the amortization argument. We have also clarified that 'avoids restrictive parametric assumptions' refers to not committing to a single fixed parametric form at inference time, rather than claiming the prior itself is nonparametric. revision: partial
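The kind of synthetic process discussed in these responses can be illustrated with a toy generator. The sketch below assumes a single Gaussian covariate, a Weibull event time, and an exponential censoring time that is either independent of the covariate or dependent on it; all coefficients are hypothetical and this is not the paper's prior family.

```python
import math
import random
from typing import List, Tuple


def sample_dataset(n: int, covariate_dependent_censoring: bool,
                   seed: int = 0) -> List[Tuple[float, float, int]]:
    """Return rows of (x, observed_time, delta) from a toy right-censored process."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Weibull event time (shape 1.5) via inverse-CDF sampling; the scale depends on x.
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        event_time = math.exp(-0.5 * x) * (-math.log(u)) ** (1.0 / 1.5)
        # Exponential censoring time; the rate optionally depends on x.
        rate = 0.5 * math.exp(0.8 * x) if covariate_dependent_censoring else 0.5
        v = 1.0 - rng.random()
        censor_time = -math.log(v) / rate
        observed = min(event_time, censor_time)
        delta = int(event_time <= censor_time)  # 1 = event observed, 0 = censored
        rows.append((x, observed, delta))
    return rows


def censoring_rate(rows: List[Tuple[float, float, int]]) -> float:
    """Fraction of rows whose event was not observed."""
    return 1.0 - sum(delta for _, _, delta in rows) / len(rows)


rows = sample_dataset(500, covariate_dependent_censoring=True, seed=1)
rate = censoring_rate(rows)
```

In both modes the censoring time is drawn independently of the event time once x is fixed, which is the usual conditional-independence assumption behind identifiability; changing the censoring rate constant moves the overall censoring rate.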

Circularity Check

0 steps flagged

No significant circularity; claims rest on external real-data benchmarks

The paper pretrains SurvivalPFN exclusively on synthetic right-censored DGPs and then evaluates predictive performance on 61 independent real-world datasets against 21 baselines using 5 metrics. The headline result (strong performance, often improving on established models) is therefore an empirical comparison to external held-out data rather than a quantity defined by the model's own fitted parameters, synthetic pretraining statistics, or self-citations. No derivation step equates a prediction to its input by construction, renames a known result, or relies on a load-bearing self-citation whose validity is internal to the present work. The setup is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim depends on the untested transfer from synthetic identifiable DGPs to real censored data and on the assumption that a single forward pass through a fixed network can replace per-dataset Bayesian model selection.

free parameters (1)

PFN architecture and pretraining hyperparameters
Network depth, width, and training schedule are chosen to enable amortization; these are fitted during the large-scale synthetic pretraining phase.

axioms (1)

domain assumption: A sufficiently diverse collection of synthetic right-censored DGPs will induce a model whose in-context predictions are well-calibrated on real data.
Invoked when the authors state that pretraining enables adaptation without task-specific training.

Reference graph

Works this paper leans on

130 extracted references · 130 canonical work pages · 5 internal anchors

[1] Christophe Andrieu, Nando De Freitas, Arnaud Doucet, and Michael I Jordan. An introduction to mcmc for machine learning. Machine learning, 50(1):5–43, 2003
[2] Vahid Balazadeh, Hamidreza Kamkari, Valentin Thomas, Benson Li, Junwei Ma, Jesse C Cresswell, and Rahul G Krishnan. Causalpfn: Amortized causal effect estimation via in-context learning. arXiv preprint arXiv:2506.07918, 2025
[3] David Berghaus, Patrick Seifner, Kostadin Cvejoski, César Ojeda, and Ramsés J Sánchez. In-context learning of temporal point processes with foundation inference models. arXiv preprint arXiv:2509.24762, 2025
[4] Elia Biganzoli, Patrizia Boracchi, Luigi Mariani, and Ettore Marubini. Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Statistics in medicine, 17(10):1169–1186, 1998
[5] Norman E Breslow. Analysis of survival data under the proportional hazards model. International Statistical Review/Revue Internationale de Statistique, pages 45–57, 1975
[6] DC Broadway, M Iester, M Schulzer, and GR Douglas. Survival analysis for success of molteno tube implants. British Journal of Ophthalmology, 85(6):689–695, 2001
[7] Davide Chicco and Giuseppe Jurman. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC medical informatics and decision making, 20(1):16, 2020
[8] Pawel Chilinski and Ricardo Silva. Neural likelihoods via cumulative distribution functions. In Conference on Uncertainty in Artificial Intelligence, pages 420–429. PMLR, 2020
[9] Michael Cooper, Zongliang Ji, and Rahul G Krishnan. Machine learning in computational histopathology: Challenges and opportunities. Genes, Chromosomes and Cancer, 62(9):540–556, 2023
[10] Michael J Cooper, Xiang Gao, Xun Zhao, Dariia Khoroshchuk, Yingke Wang, Amirhossein Azhie, Maryam Naghibzadeh, Sandra Holdsworth, Jed Adam Gross, Michael Brudno, et al. Dynameld: a dynamic model of end-stage liver disease for equitable prioritization. medRxiv, pages 2024–11, 2024
[11] David R Cox. Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2):187–202, 1972
[12] Christina Curtis, Sohrab P Shah, Suet-Feung Chin, Gulisa Turashvili, Oscar M Rueda, Mark J Dunning, Doug Speed, Andy G Lynch, Shamith Samarajiwa, Yinyin Yuan, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature, 486(7403):346–352, 2012
[13] Aaron Defazio, Xingyu Yang, Harsh Mehta, Konstantin Mishchenko, Ahmed Khaled, and Ashok Cutkosky. The road less scheduled. Advances in Neural Information Processing Systems, 37:9974–10007, 2024
[14] Joseph Leo Doob. Application of the theory of martingales. Le Calcul des Probabilites et ses Applications, pages 23–27, 1949. URL https://cir.nii.ac.jp/crid/1573387449499005824
[15] Erik Drysdale. Survset: An open-source time-to-event dataset repository. arXiv preprint arXiv:2203.03094, 2022
[16] Juan Pablo Equihua, Henrik Nordmark, Maged Ali, and Berthold Lausen. Modelling customer churn for the retail industry in a deep learning based sequential framework. arXiv preprint arXiv:2304.00575, 2023
[17] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In International conference on machine learning, pages 1126–
[18] Stephane Fotso. Deep neural networks for survival analysis based on a multi-task framework. arXiv preprint arXiv:1801.05512, 2018
[19] Stephane Fotso et al. PySurvival: Open source package for survival analysis modeling, 2019–. URL https://www.pysurvival.io/
[20] Xiang Gao, Michael Cooper, Maryam Naghibzadeh, Amirhossein Azhie, Mamatha Bhat, and Rahul G Krishnan. Predicting long-term allograft survival in liver transplant recipients. arXiv preprint arXiv:2408.05437, 2024
[21] Yuanyuan Gao, Shuo Li, Di Wang, Jianming Mao, and Linhan Ouyang. A neural network-based survival analysis model considering censored data for failure prediction. IEEE Transactions on Automation Science and Engineering, 22:24585–24598, 2025
[22] Michael F Gensheimer and Balasubramanian Narasimhan. A scalable discrete-time survival model for neural networks. PeerJ, 7:e6257, 2019
[23] Adrian Gepp and Kuldeep Kumar. The role of survival analysis in financial distress prediction. International research journal of finance and economics, 16(16):13–34, 2008
[24] Ali Hossein Foomani Gharari, Michael Cooper, Russell Greiner, and Rahul G Krishnan. Copula-based deep survival models for dependent censoring. In Uncertainty in Artificial Intelligence, pages 669–680. PMLR, 2023
[25] Pierre Giot and Armin Schwienbacher. Ipos, trade sales and liquidations: Modelling venture capital exits using survival analysis. Journal of Banking & Finance, 31(3):679–702, 2007
[26] Erika Graf, Claudia Schmoor, Willi Sauerbrei, and Martin Schumacher. Assessment and comparison of prognostic classification schemes for survival data. Statistics in medicine, 18(17-18):2529–2545, 1999
[27] Léo Grinsztajn, Klemens Flöge, Oscar Key, Felix Birkel, Philipp Jund, Brendan Roof, Benjamin Jäger, Dominik Safaric, Simone Alessi, Adrian Hayler, et al. Tabpfn-2.5: Advancing the state of the art in tabular foundation models. arXiv preprint arXiv:2511.08667, 2025
[28] Stefan Groha, Sebastian M Schmon, and Alexander Gusev. A general framework for survival analysis and multi-state modelling. arXiv preprint arXiv:2006.04893, 2020
[29] Humza Haider, Bret Hoehn, Sarah Davis, and Russell Greiner. Effective ways to build and evaluate individual survival distributions. Journal of Machine Learning Research, 21(85):1–63, 2020
[30] Scott M Hammer, David A Katzenstein, Michael D Hughes, Holly Gundacker, Robert T Schooley, Richard H Haubrich, W Keith Henry, Michael M Lederman, John P Phair, Manette Niu, et al. A trial comparing nucleoside monotherapy with combination therapy in hiv-infected adults with cd4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicin... 1996
[31] Xintian Han, Mark Goldstein, Aahlad Puli, Thomas Wies, Adler Perotte, and Rajesh Ranganath. Inverse-weighted survival games. Advances in neural information processing systems, 34:2160–2172, 2021
[32] Xintian Han, Mark Goldstein, and Rajesh Ranganath. Survival mixture density networks. In Machine Learning for Healthcare Conference, pages 224–248. PMLR, 2022
[33] Frank E Harrell, Robert M Califf, David B Pryor, Kerry L Lee, and Robert A Rosati. Evaluating the yield of medical tests. Jama, 247(18):2543–2546, 1982
[34] Frank E Harrell Jr, Kerry L Lee, and Daniel B Mark. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in medicine, 15(4):361–387, 1996
[35] Matthew D Hoffman, David M Blei, Chong Wang, and John Paisley. Stochastic variational inference. the Journal of machine Learning research, 14(1):1303–1347, 2013
[36] Noah Hollmann, Samuel Müller, Katharina Eggensperger, and Frank Hutter. Tabpfn: A transformer that solves small tabular classification problems in a second. arXiv preprint arXiv:2207.01848, 2022
[37] Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister, and Frank Hutter. Accurate predictions on small data with a tabular foundation model. Nature, 637(8045):319–326, 2025
[38] Shi Bin Hoo, Samuel Müller, David Salinas, and Frank Hutter. From tables to time: Extending tabpfn-v2 to time series forecasting. arXiv preprint arXiv:2501.02945, 2025
[39] David W Hosmer Jr, Stanley Lemeshow, and Susanne May. Applied survival analysis: regression modeling of time-to-event data. John Wiley & Sons, 2008
[40] Torsten Hothorn, Peter Bühlmann, Sandrine Dudoit, Annette Molinaro, and Mark J Van Der Laan. Survival ensembles. Biostatistics, 7(3):355–373, 2006
[41] Hemant Ishwaran, Udaya B Kogalur, Eugene H Blackstone, Michael S Lauer, et al. Random survival forests. Annals of Applied Statistics, 2(3):841–860, 2008
[42] Alistair EW Johnson, Lucas Bulgarelli, Lu Shen, Alvin Gayles, Ayad Shammout, Steven Horng, Tom J Pollard, Sicheng Hao, Benjamin Moody, Brian Gow, et al. Mimic-iv, a freely accessible electronic health record dataset. Scientific data, 10(1):1, 2023
[43] Michael I Jordan, Zoubin Ghahramani, Tommi S Jaakkola, and Lawrence K Saul. An introduction to variational methods for graphical models. Machine learning, 37(2):183–233, 1999
[44] OJWF Kardaun. Statistical survival analysis of male larynx-cancer patients-a case study. Statistica neerlandica, 37(3):103–125, 1983
[45] Jared L Katzman, Uri Shaham, Alexander Cloninger, Jonathan Bates, Tingting Jiang, and Yuval Kluger. Deepsurv: personalized treatment recommender system using a cox proportional hazards deep neural network. BMC medical research methodology, 18:1–12, 2018
[46] Da In Kim, Wei Siang Lai, and Kelly W Zhang. Tabular foundation models can do survival analysis. arXiv preprint arXiv:2601.22259, 2026
[47]

Meld 3.0: the model for end-stage liver disease updated for the modern era.Gastroenterology, 161(6): 1887–1895, 2021

W Ray Kim, Ajitha Mannalithara, Julie K Heimbach, Patrick S Kamath, Sumeet K Asrani, Scott W Biggins, Nicholas L Wood, Sommer E Gentry, and Allison J Kwong. Meld 3.0: the model for end-stage liver disease updated for the modern era.Gastroenterology, 161(6): 1887–1895, 2021

work page 2021
[48]

Adam: A Method for Stochastic Optimization

Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization.arXiv preprint arXiv:1412.6980, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[49]

Springer, 2003

John P Klein and Melvin L Moeschberger.Survival analysis: techniques for censored and truncated data, volume 1230. Springer, 2003

work page 2003
[50]

Springer, 1996

David G Kleinbaum and Mitchel Klein.Survival analysis a self-learning text. Springer, 1996

work page 1996
[51]

Using survival analysis to evaluate medical equipment battery life.Biomedical instrumentation & technology, 50(3):184–189, 2016

David Kuhajda. Using survival analysis to evaluate medical equipment battery life.Biomedical instrumentation & technology, 50(3):184–189, 2016

work page 2016
[52]

Learning accurate personalized survival models for predicting hospital discharge and mortality of covid-19 patients.Scientific reports, 12(1):4472, 2022

Neeraj Kumar, Shi-ang Qi, Li-Hao Kuan, Weijie Sun, Jianfei Zhang, and Russell Greiner. Learning accurate personalized survival models for predicting hospital discharge and mortality of covid-19 patients.Scientific reports, 12(1):4472, 2022

work page 2022
[53]

Time-to-event prediction with neural networks and cox regression.Journal of machine learning research, 20(129):1–30, 2019

Håvard Kvamme, Ørnulf Borgan, and Ida Scheel. Time-to-event prediction with neural networks and cox regression.Journal of machine learning research, 20(129):1–30, 2019

work page 2019
[54]

Deephit: A deep learning approach to survival analysis with competing risks

Changhee Lee, William Zame, Jinsung Yoon, and Mihaela Van Der Schaar. Deephit: A deep learning approach to survival analysis with competing risks. InProceedings of the AAAI conference on artificial intelligence, volume 32, 2018. 12

work page 2018
[55]

Changhee Lee, Jinsung Yoon, and Mihaela Van Der Schaar. Dynamic-deephit: A deep learning approach for dynamic survival analysis with competing risks based on longitudinal data.IEEE Transactions on Biomedical Engineering, 67(1):122–133, 2019

work page 2019
[56]

Chunyang Li, Vikas Patil, Kelli M Rasmussen, Christina Yong, Hsu-Chih Chien, Debbie Morreall, Jeffrey Humpherys, Brian C Sauer, Zachary Burningham, and Ahmad S Halwani. Predicting survival in veterans with follicular lymphoma using structured electronic health record information and machine learning.International Journal of Environmental Research and Publ...

work page 2021
[57]

Gregory YH Lip, Robby Nieuwlaat, Ron Pisters, Deirdre A Lane, and Harry JGM Crijns. Re- fining clinical risk stratification for predicting stroke and thromboembolism in atrial fibrillation using a novel risk factor-based approach: the euro heart survey on atrial fibrillation.Chest, 137(2):263–272, 2010

work page 2010
[58]

Prospective evaluation of prognostic variables from patient-completed questionnaires

Charles Lawrence Loprinzi, John A Laurie, H Sam Wieand, James E Krook, Paul J Novotny, John W Kugler, Joan Bartel, Marlys Law, Marilyn Bateman, and Nancy E Klatt. Prospective evaluation of prognostic variables from patient-completed questionnaires. north central cancer treatment group.Journal of Clinical Oncology, 12(3):601–607, 1994

work page 1994
[59]

Junxiang Lu. Predicting customer churn in the telecommunications industry—-an application of survival analysis modeling using sas.SAS User Group International (SUGI27) Online Proceedings, 114:27, 2002

work page 2002
[60]

Survival analysis as a tool for company failure prediction

Martti Luoma and Erkki K Laitinen. Survival analysis as a tool for company failure prediction. Omega, 19(6):673–678, 1991

work page 1991
[61]

Tabdpt: Scaling tabular foundation models on real data.arXiv preprint arXiv:2410.18164, 2024

Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh, Alex Labach, Hamidreza Kamkari, Jesse C Cresswell, Keyvan Golestan, Guangwei Yu, Anthony L Caterini, and Maksims V olkovs. Tabdpt: Scaling tabular foundation models on real data.arXiv preprint arXiv:2410.18164, 2024

work page arXiv 2024
[62]

Estimation of parameters of mixed exponentially dis- tributed failure time distributions from censored life test data.Biometrika, 45(3-4):504–520, 1958

William Mendenhall and RJ Hader. Estimation of parameters of mixed exponentially dis- tributed failure time distributions from censored life test data.Biometrika, 45(3-4):504–520, 1958

work page 1958
[63]

Determination of prognosis in chronic disease, illustrated by systemic lupus erythematosus.Journal of chronic diseases, 1(1):12–32, 1955

Margaret Merrell and Lawrence E Shulman. Determination of prognosis in chronic disease, illustrated by systemic lupus erythematosus.Journal of chronic diseases, 1(1):12–32, 1955

work page 1955
[64]

Deep survival analysis: Nonparametrics and missingness

Xenia Miscouridou, Adler Perotte, Noémie Elhadad, and Rajesh Ranganath. Deep survival analysis: Nonparametrics and missingness. InMachine Learning for Healthcare Conference, pages 244–256. PMLR, 2018

work page 2018
[65]

Neuralsurv: Deep survival analysis with bayesian uncertainty quantification.arXiv preprint arXiv:2505.11054, 2025

Mélodie Monod, Alessandro Micheli, and Samir Bhatt. Neuralsurv: Deep survival analysis with bayesian uncertainty quantification.arXiv preprint arXiv:2505.11054, 2025

work page arXiv 2025
[66]

Transformers can do bayesian inference.arXiv preprint arXiv:2112.10510, 2021

Samuel Müller, Noah Hollmann, Sebastian Pineda Arango, Josif Grabocka, and Frank Hutter. Transformers can do bayesian inference.arXiv preprint arXiv:2112.10510, 2021

work page arXiv 2021
[67]

Position: The future of bayesian prediction is prior-fitted.arXiv preprint arXiv:2505.23947, 2025

Samuel Müller, Arik Reuter, Noah Hollmann, David Rügamer, and Frank Hutter. Position: The future of bayesian prediction is prior-fitted.arXiv preprint arXiv:2505.23947, 2025

work page arXiv 2025
[68]

Chirag Nagpal, Xinyu Li, and Artur Dubrawski. Deep survival machines: Fully parametric survival regression and representation learning for censored data with competing risks.IEEE Journal of Biomedical and Health Informatics, 25(8):3163–3175, 2021

work page 2021
[69]

Auton-survival: an open-source package for regression, counterfactual estimation, evaluation and phenotyping with censored time-to-event data

Chirag Nagpal, Willa Potosnak, and Artur Dubrawski. Auton-survival: an open-source package for regression, counterfactual estimation, evaluation and phenotyping with censored time-to-event data. InMachine Learning for Healthcare Conference, pages 585–608. PMLR, 2022

work page 2022
[70]

Springer Science & Business Media, 2012

Radford M Neal.Bayesian learning for neural networks, volume 118. Springer Science & Business Media, 2012. 13

work page 2012
[71]

deepaft: A nonlinear accelerated failure time model with artificial neural network.Statistics in Medicine, 43(19): 3689–3701, 2024

Patrick A Norman, Wanlu Li, Wenyu Jiang, and Bingshu E Chen. deepaft: A nonlinear accelerated failure time model with artificial neural network.Statistics in Medicine, 43(19): 3689–3701, 2024

work page 2024
[72]

Machine failure predic- tion using survival analysis.Future Internet, 15(5):153, 2023

Dimitris Papathanasiou, Konstantinos Demertzis, and Nikos Tziritas. Machine failure predic- tion using survival analysis.Future Internet, 15(5):153, 2023

work page 2023
[73]

Censored quantile regression neural networks for distribution-free survival analysis.Advances in neural information processing systems, 35: 7450–7461, 2022

Tim Pearce, Jong-Hyeon Jeong, Jun Zhu, et al. Censored quantile regression neural networks for distribution-free survival analysis.Advances in neural information processing systems, 35: 7450–7461, 2022

work page 2022
[74]

Churn prediction in mobile social games: Towards a complete assessment using survival ensembles

África Periáñez, Alain Saas, Anna Guitart, and Colin Magne. Churn prediction in mobile social games: Towards a complete assessment using survival ensembles. In2016 IEEE international conference on data science and advanced analytics (DSAA), pages 564–573. IEEE, 2016

work page 2016
[75]

Weibull distributions for continuous-carcinogenesis experiments

Richard Peto and Peter Lee. Weibull distributions for continuous-carcinogenesis experiments. Biometrics, pages 457–470, 1973

work page 1973
[76]

A car- diotoxicity dataset for breast cancer patients.Scientific Data, 10(1):527, 2023

Beatriz Pineiro-Lamas, Ana Lopez-Cheda, Ricardo Cao, Laura Ramos-Alonso, Gabriel Gonzalez-Barbeito, Cayetana Barbeito-Caamano, and Alberto Bouzas-Mosquera. A car- diotoxicity dataset for breast cancer patients.Scientific Data, 10(1):527, 2023

work page 2023
[77]

scikit-survival: A library for time-to-event analysis built on top of scikit- learn.Journal of Machine Learning Research, 21(212):1–6, 2020

Sebastian Pölsterl. scikit-survival: A library for time-to-event analysis built on top of scikit- learn.Journal of Machine Learning Research, 21(212):1–6, 2020

work page 2020
[78]

Fast training of support vector machines for survival analysis

Sebastian Pölsterl, Nassir Navab, and Amin Katouzian. Fast training of support vector machines for survival analysis. InJoint European conference on machine learning and knowledge discovery in databases, pages 243–259. Springer, 2015

work page 2015
[79]

Personalized breast cancer onset prediction from lifestyle and health history information.Plos one, 17(12):e0279174, 2022

Shi-ang Qi, Neeraj Kumar, Jian-Yi Xu, Jaykumar Patel, Sambasivarao Damaraju, Grace Shen- Tu, and Russell Greiner. Personalized breast cancer onset prediction from lifestyle and health history information.Plos one, 17(12):e0279174, 2022

work page 2022
[80]

An effective meaningful way to evaluate survival models

Shi-ang Qi, Neeraj Kumar, Mahtab Farrokh, Weijie Sun, Li-Hao Kuan, Rajesh Ranganath, Ricardo Henao, and Russell Greiner. An effective meaningful way to evaluate survival models. Proceedings of machine learning research, 202:28244, 2023

work page 2023

Showing first 80 references.

[1] [1]

An introduction to mcmc for machine learning.Machine learning, 50(1):5–43, 2003

Christophe Andrieu, Nando De Freitas, Arnaud Doucet, and Michael I Jordan. An introduction to mcmc for machine learning.Machine learning, 50(1):5–43, 2003

work page 2003

[2] [2]

Causalpfn: Amortized causal effect estimation via in-context learning.arXiv preprint arXiv:2506.07918, 2025

Vahid Balazadeh, Hamidreza Kamkari, Valentin Thomas, Benson Li, Junwei Ma, Jesse C Cresswell, and Rahul G Krishnan. Causalpfn: Amortized causal effect estimation via in-context learning.arXiv preprint arXiv:2506.07918, 2025

work page arXiv 2025

[3] [3]

In-context learning of temporal point processes with foundation inference models.arXiv preprint arXiv:2509.24762, 2025

David Berghaus, Patrick Seifner, Kostadin Cvejoski, César Ojeda, and Ramsés J Sánchez. In-context learning of temporal point processes with foundation inference models.arXiv preprint arXiv:2509.24762, 2025

work page arXiv 2025

[4] [4]

Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach

Elia Biganzoli, Patrizia Boracchi, Luigi Mariani, and Ettore Marubini. Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Statistics in medicine, 17(10):1169–1186, 1998

work page 1998

[5] [5]

Analysis of survival data under the proportional hazards model.Interna- tional Statistical Review/Revue Internationale de Statistique, pages 45–57, 1975

Norman E Breslow. Analysis of survival data under the proportional hazards model.Interna- tional Statistical Review/Revue Internationale de Statistique, pages 45–57, 1975

work page 1975

[6] [6]

Survival analysis for success of molteno tube implants.British Journal of Ophthalmology, 85(6):689–695, 2001

DC Broadway, M Iester, M Schulzer, and GR Douglas. Survival analysis for success of molteno tube implants.British Journal of Ophthalmology, 85(6):689–695, 2001

work page 2001

[7] [7]

Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone.BMC medical informatics and decision making, 20(1):16, 2020

Davide Chicco and Giuseppe Jurman. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone.BMC medical informatics and decision making, 20(1):16, 2020

work page 2020

[8] [8]

Neural likelihoods via cumulative distribution functions

Pawel Chilinski and Ricardo Silva. Neural likelihoods via cumulative distribution functions. InConference on Uncertainty in Artificial Intelligence, pages 420–429. PMLR, 2020

work page 2020

[9] [9]

Machine learning in computational histopathology: Challenges and opportunities.Genes, Chromosomes and Cancer, 62(9): 540–556, 2023

Michael Cooper, Zongliang Ji, and Rahul G Krishnan. Machine learning in computational histopathology: Challenges and opportunities.Genes, Chromosomes and Cancer, 62(9): 540–556, 2023

work page 2023

[10] [10]

Dynameld: a dynamic model of end-stage liver disease for equitable prioritization.medRxiv, pages 2024–11, 2024

Michael J Cooper, Xiang Gao, Xun Zhao, Dariia Khoroshchuk, Yingke Wang, Amirhossein Azhie, Maryam Naghibzadeh, Sandra Holdsworth, Jed Adam Gross, Michael Brudno, et al. Dynameld: a dynamic model of end-stage liver disease for equitable prioritization.medRxiv, pages 2024–11, 2024

work page 2024

[11] [11]

Regression models and life-tables.Journal of the Royal Statistical Society: Series B (Methodological), 34(2):187–202, 1972

David R Cox. Regression models and life-tables.Journal of the Royal Statistical Society: Series B (Methodological), 34(2):187–202, 1972

work page 1972

[12] [12]

The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups.Nature, 486 (7403):346–352, 2012

Christina Curtis, Sohrab P Shah, Suet-Feung Chin, Gulisa Turashvili, Oscar M Rueda, Mark J Dunning, Doug Speed, Andy G Lynch, Shamith Samarajiwa, Yinyin Yuan, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups.Nature, 486 (7403):346–352, 2012

work page 2012

[13] [13]

The road less scheduled.Advances in Neural Information Processing Systems, 37:9974–10007, 2024

Aaron Defazio, Xingyu Yang, Harsh Mehta, Konstantin Mishchenko, Ahmed Khaled, and Ashok Cutkosky. The road less scheduled.Advances in Neural Information Processing Systems, 37:9974–10007, 2024

work page 2024

[14] [14]

Application of the theory of martingales.Le Calcul des Proba- bilites et ses Applications, pages 23–27, 1949

Joseph Leo Doob. Application of the theory of martingales.Le Calcul des Proba- bilites et ses Applications, pages 23–27, 1949. URL https://cir.nii.ac.jp/crid/ 1573387449499005824

work page 1949

[15] [15]

Survset: An open-source time-to-event dataset repository.arXiv preprint arXiv:2203.03094, 2022

Erik Drysdale. Survset: An open-source time-to-event dataset repository.arXiv preprint arXiv:2203.03094, 2022

work page arXiv 2022

[16] [16]

Modelling customer churn for the retail industry in a deep learning based sequential framework.arXiv preprint arXiv:2304.00575, 2023

Juan Pablo Equihua, Henrik Nordmark, Maged Ali, and Berthold Lausen. Modelling customer churn for the retail industry in a deep learning based sequential framework.arXiv preprint arXiv:2304.00575, 2023

work page arXiv 2023

[17] [17]

Model-agnostic meta-learning for fast adaptation of deep networks

Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. InInternational conference on machine learning, pages 1126–

work page

[18] [18]

Deep Neural Networks for Survival Analysis Based on a Multi-Task Framework

Stephane Fotso. Deep neural networks for survival analysis based on a multi-task framework. arXiv preprint arXiv:1801.05512, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[19] [19]

PySurvival: Open source package for survival analysis modeling, 2019–

Stephane Fotso et al. PySurvival: Open source package for survival analysis modeling, 2019–. URLhttps://www.pysurvival.io/

work page 2019

[20] [20]

Predicting long-term allograft survival in liver transplant recipients.arXiv preprint arXiv:2408.05437, 2024

Xiang Gao, Michael Cooper, Maryam Naghibzadeh, Amirhossein Azhie, Mamatha Bhat, and Rahul G Krishnan. Predicting long-term allograft survival in liver transplant recipients.arXiv preprint arXiv:2408.05437, 2024

work page arXiv 2024

[21] [21]

A neural network-based survival analysis model considering censored data for failure prediction.IEEE Transactions on Automation Science and Engineering, 22:24585–24598, 2025

Yuanyuan Gao, Shuo Li, Di Wang, Jianming Mao, and Linhan Ouyang. A neural network-based survival analysis model considering censored data for failure prediction.IEEE Transactions on Automation Science and Engineering, 22:24585–24598, 2025

work page 2025

[22] [22]

A scalable discrete-time survival model for neural networks.PeerJ, 7:e6257, 2019

Michael F Gensheimer and Balasubramanian Narasimhan. A scalable discrete-time survival model for neural networks.PeerJ, 7:e6257, 2019

work page 2019

[23] [23]

The role of survival analysis in financial distress prediction

Adrian Gepp and Kuldeep Kumar. The role of survival analysis in financial distress prediction. International research journal of finance and economics, 16(16):13–34, 2008

work page 2008

[24] [24]

Copula-based deep survival models for dependent censoring

Ali Hossein Foomani Gharari, Michael Cooper, Russell Greiner, and Rahul G Krishnan. Copula-based deep survival models for dependent censoring. InUncertainty in Artificial Intelligence, pages 669–680. PMLR, 2023

work page 2023

[25] [25]

Ipos, trade sales and liquidations: Modelling venture capital exits using survival analysis.Journal of Banking & Finance, 31(3):679–702, 2007

Pierre Giot and Armin Schwienbacher. Ipos, trade sales and liquidations: Modelling venture capital exits using survival analysis.Journal of Banking & Finance, 31(3):679–702, 2007

work page 2007

[26] [26]

Assessment and comparison of prognostic classification schemes for survival data.Statistics in medicine, 18 (17-18):2529–2545, 1999

Erika Graf, Claudia Schmoor, Willi Sauerbrei, and Martin Schumacher. Assessment and comparison of prognostic classification schemes for survival data.Statistics in medicine, 18 (17-18):2529–2545, 1999

work page 1999

[27] [27]

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models

Léo Grinsztajn, Klemens Flöge, Oscar Key, Felix Birkel, Philipp Jund, Brendan Roof, Ben- jamin Jäger, Dominik Safaric, Simone Alessi, Adrian Hayler, et al. Tabpfn-2.5: Advancing the state of the art in tabular foundation models.arXiv preprint arXiv:2511.08667, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[28] [28]

A general framework for survival analysis and multi-state modelling.arXiv preprint arXiv:2006.04893, 2020

Stefan Groha, Sebastian M Schmon, and Alexander Gusev. A general framework for survival analysis and multi-state modelling.arXiv preprint arXiv:2006.04893, 2020

work page arXiv 2006

[29] [29]

Effective ways to build and evaluate individual survival distributions.Journal of Machine Learning Research, 21(85): 1–63, 2020

Humza Haider, Bret Hoehn, Sarah Davis, and Russell Greiner. Effective ways to build and evaluate individual survival distributions.Journal of Machine Learning Research, 21(85): 1–63, 2020

work page 2020

[30] [30]

Scott M Hammer, David A Katzenstein, Michael D Hughes, Holly Gundacker, Robert T Schooley, Richard H Haubrich, W Keith Henry, Michael M Lederman, John P Phair, Manette Niu, et al. A trial comparing nucleoside monotherapy with combination therapy in hiv-infected adults with cd4 cell counts from 200 to 500 per cubic millimeter.New England Journal of Medicin...

work page 1996

[31] [31]

Inverse-weighted survival games.Advances in neural information processing systems, 34: 2160–2172, 2021

Xintian Han, Mark Goldstein, Aahlad Puli, Thomas Wies, Adler Perotte, and Rajesh Ranganath. Inverse-weighted survival games.Advances in neural information processing systems, 34: 2160–2172, 2021

work page 2021

[32] [32]

Survival mixture density networks

Xintian Han, Mark Goldstein, and Rajesh Ranganath. Survival mixture density networks. In Machine Learning for Healthcare Conference, pages 224–248. PMLR, 2022

work page 2022

[33] [33]

Evaluating the yield of medical tests.Jama, 247(18):2543–2546, 1982

Frank E Harrell, Robert M Califf, David B Pryor, Kerry L Lee, and Robert A Rosati. Evaluating the yield of medical tests.Jama, 247(18):2543–2546, 1982

work page 1982

[34] [34]

Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors.Statistics in medicine, 15(4):361–387, 1996

Frank E Harrell Jr, Kerry L Lee, and Daniel B Mark. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors.Statistics in medicine, 15(4):361–387, 1996

work page 1996

[35] [35]

Stochastic variational inference.the Journal of machine Learning research, 14(1):1303–1347, 2013

Matthew D Hoffman, David M Blei, Chong Wang, and John Paisley. Stochastic variational inference.the Journal of machine Learning research, 14(1):1303–1347, 2013. 11

work page 2013

[36] [36]

TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second

Noah Hollmann, Samuel Müller, Katharina Eggensperger, and Frank Hutter. Tabpfn: A transformer that solves small tabular classification problems in a second.arXiv preprint arXiv:2207.01848, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022

[37] [37]

Accurate predictions on small data with a tabular foundation model.Nature, 637(8045):319–326, 2025

Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister, and Frank Hutter. Accurate predictions on small data with a tabular foundation model.Nature, 637(8045):319–326, 2025

work page 2025

[38] [38]

From tables to time: Extending tabpfn-v2 to time series forecasting.arXiv preprint arXiv:2501.02945, 2025

Shi Bin Hoo, Samuel Müller, David Salinas, and Frank Hutter. From tables to time: Extending tabpfn-v2 to time series forecasting.arXiv preprint arXiv:2501.02945, 2025

work page arXiv 2025

[39] [39]

John Wiley & Sons, 2008

David W Hosmer Jr, Stanley Lemeshow, and Susanne May.Applied survival analysis: regression modeling of time-to-event data. John Wiley & Sons, 2008

work page 2008

[40] [40]

Survival ensembles.Biostatistics, 7(3):355–373, 2006

Torsten Hothorn, Peter Bühlmann, Sandrine Dudoit, Annette Molinaro, and Mark J Van Der Laan. Survival ensembles.Biostatistics, 7(3):355–373, 2006

work page 2006

[41] [41]

Random survival forests.Annals of Applied Statistics, 2(3):841–860, 2008

Hemant Ishwaran, Udaya B Kogalur, Eugene H Blackstone, Michael S Lauer, et al. Random survival forests.Annals of Applied Statistics, 2(3):841–860, 2008

work page 2008

[42] [42]

Mimic-iv, a freely accessible electronic health record dataset.Scientific data, 10(1):1, 2023

Alistair EW Johnson, Lucas Bulgarelli, Lu Shen, Alvin Gayles, Ayad Shammout, Steven Horng, Tom J Pollard, Sicheng Hao, Benjamin Moody, Brian Gow, et al. Mimic-iv, a freely accessible electronic health record dataset.Scientific data, 10(1):1, 2023

work page 2023

[43] [43]

An introduction to variational methods for graphical models.Machine learning, 37(2):183–233, 1999

Michael I Jordan, Zoubin Ghahramani, Tommi S Jaakkola, and Lawrence K Saul. An introduction to variational methods for graphical models.Machine learning, 37(2):183–233, 1999

work page 1999

[44] [44]

Statistical survival analysis of male larynx-cancer patients-a case study

OJWF Kardaun. Statistical survival analysis of male larynx-cancer patients-a case study. Statistica neerlandica, 37(3):103–125, 1983

work page 1983

[45] [45]

Deepsurv: personalized treatment recommender system using a cox proportional hazards deep neural network.BMC medical research methodology, 18:1–12, 2018

Jared L Katzman, Uri Shaham, Alexander Cloninger, Jonathan Bates, Tingting Jiang, and Yuval Kluger. Deepsurv: personalized treatment recommender system using a cox proportional hazards deep neural network.BMC medical research methodology, 18:1–12, 2018

work page 2018

[46] [46]

Tabular foundation models can do survival analysis.arXiv preprint arXiv:2601.22259, 2026

Da In Kim, Wei Siang Lai, and Kelly W Zhang. Tabular foundation models can do survival analysis.arXiv preprint arXiv:2601.22259, 2026

work page arXiv 2026

[47] [47]

Meld 3.0: the model for end-stage liver disease updated for the modern era.Gastroenterology, 161(6): 1887–1895, 2021

W Ray Kim, Ajitha Mannalithara, Julie K Heimbach, Patrick S Kamath, Sumeet K Asrani, Scott W Biggins, Nicholas L Wood, Sommer E Gentry, and Allison J Kwong. Meld 3.0: the model for end-stage liver disease updated for the modern era.Gastroenterology, 161(6): 1887–1895, 2021

work page 2021

[48] [48]

Adam: A Method for Stochastic Optimization

Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization.arXiv preprint arXiv:1412.6980, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014

[49] [49]

Springer, 2003

John P Klein and Melvin L Moeschberger.Survival analysis: techniques for censored and truncated data, volume 1230. Springer, 2003

work page 2003

[50] [50]

Springer, 1996

David G Kleinbaum and Mitchel Klein.Survival analysis a self-learning text. Springer, 1996

work page 1996

[51] [51]

Using survival analysis to evaluate medical equipment battery life.Biomedical instrumentation & technology, 50(3):184–189, 2016

David Kuhajda. Using survival analysis to evaluate medical equipment battery life.Biomedical instrumentation & technology, 50(3):184–189, 2016

work page 2016

[52] [52]

Learning accurate personalized survival models for predicting hospital discharge and mortality of covid-19 patients.Scientific reports, 12(1):4472, 2022

Neeraj Kumar, Shi-ang Qi, Li-Hao Kuan, Weijie Sun, Jianfei Zhang, and Russell Greiner. Learning accurate personalized survival models for predicting hospital discharge and mortality of covid-19 patients.Scientific reports, 12(1):4472, 2022

work page 2022

[53] [53]

Time-to-event prediction with neural networks and cox regression.Journal of machine learning research, 20(129):1–30, 2019

Håvard Kvamme, Ørnulf Borgan, and Ida Scheel. Time-to-event prediction with neural networks and cox regression.Journal of machine learning research, 20(129):1–30, 2019

work page 2019

[54] [54]

Deephit: A deep learning approach to survival analysis with competing risks

Changhee Lee, William Zame, Jinsung Yoon, and Mihaela Van Der Schaar. Deephit: A deep learning approach to survival analysis with competing risks. InProceedings of the AAAI conference on artificial intelligence, volume 32, 2018. 12

work page 2018

[55] [55]

Changhee Lee, Jinsung Yoon, and Mihaela Van Der Schaar. Dynamic-deephit: A deep learning approach for dynamic survival analysis with competing risks based on longitudinal data.IEEE Transactions on Biomedical Engineering, 67(1):122–133, 2019

work page 2019

[56] [56]

Chunyang Li, Vikas Patil, Kelli M Rasmussen, Christina Yong, Hsu-Chih Chien, Debbie Morreall, Jeffrey Humpherys, Brian C Sauer, Zachary Burningham, and Ahmad S Halwani. Predicting survival in veterans with follicular lymphoma using structured electronic health record information and machine learning.International Journal of Environmental Research and Publ...

work page 2021

[57] [57]

Gregory YH Lip, Robby Nieuwlaat, Ron Pisters, Deirdre A Lane, and Harry JGM Crijns. Re- fining clinical risk stratification for predicting stroke and thromboembolism in atrial fibrillation using a novel risk factor-based approach: the euro heart survey on atrial fibrillation.Chest, 137(2):263–272, 2010

work page 2010

[58] [58]

Prospective evaluation of prognostic variables from patient-completed questionnaires

Charles Lawrence Loprinzi, John A Laurie, H Sam Wieand, James E Krook, Paul J Novotny, John W Kugler, Joan Bartel, Marlys Law, Marilyn Bateman, and Nancy E Klatt. Prospective evaluation of prognostic variables from patient-completed questionnaires. north central cancer treatment group.Journal of Clinical Oncology, 12(3):601–607, 1994

work page 1994

[59] [59]

Junxiang Lu. Predicting customer churn in the telecommunications industry—-an application of survival analysis modeling using sas.SAS User Group International (SUGI27) Online Proceedings, 114:27, 2002

work page 2002

[60] [60]

Survival analysis as a tool for company failure prediction

Martti Luoma and Erkki K Laitinen. Survival analysis as a tool for company failure prediction. Omega, 19(6):673–678, 1991

work page 1991

[61] [61]

Tabdpt: Scaling tabular foundation models on real data.arXiv preprint arXiv:2410.18164, 2024

Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh, Alex Labach, Hamidreza Kamkari, Jesse C Cresswell, Keyvan Golestan, Guangwei Yu, Anthony L Caterini, and Maksims V olkovs. Tabdpt: Scaling tabular foundation models on real data.arXiv preprint arXiv:2410.18164, 2024

work page arXiv 2024

[62]

William Mendenhall and RJ Hader. Estimation of parameters of mixed exponentially distributed failure time distributions from censored life test data. Biometrika, 45(3-4):504–520, 1958

[63]

Margaret Merrell and Lawrence E Shulman. Determination of prognosis in chronic disease, illustrated by systemic lupus erythematosus. Journal of chronic diseases, 1(1):12–32, 1955

[64]

Xenia Miscouridou, Adler Perotte, Noémie Elhadad, and Rajesh Ranganath. Deep survival analysis: Nonparametrics and missingness. In Machine Learning for Healthcare Conference, pages 244–256. PMLR, 2018

[65]

Mélodie Monod, Alessandro Micheli, and Samir Bhatt. Neuralsurv: Deep survival analysis with bayesian uncertainty quantification. arXiv preprint arXiv:2505.11054, 2025

[66]

Samuel Müller, Noah Hollmann, Sebastian Pineda Arango, Josif Grabocka, and Frank Hutter. Transformers can do bayesian inference. arXiv preprint arXiv:2112.10510, 2021

[67]

Samuel Müller, Arik Reuter, Noah Hollmann, David Rügamer, and Frank Hutter. Position: The future of bayesian prediction is prior-fitted. arXiv preprint arXiv:2505.23947, 2025

[68]

Chirag Nagpal, Xinyu Li, and Artur Dubrawski. Deep survival machines: Fully parametric survival regression and representation learning for censored data with competing risks. IEEE Journal of Biomedical and Health Informatics, 25(8):3163–3175, 2021

[69]

Chirag Nagpal, Willa Potosnak, and Artur Dubrawski. Auton-survival: an open-source package for regression, counterfactual estimation, evaluation and phenotyping with censored time-to-event data. In Machine Learning for Healthcare Conference, pages 585–608. PMLR, 2022

[70]

Radford M Neal. Bayesian learning for neural networks, volume 118. Springer Science & Business Media, 2012

[71]

Patrick A Norman, Wanlu Li, Wenyu Jiang, and Bingshu E Chen. deepaft: A nonlinear accelerated failure time model with artificial neural network. Statistics in Medicine, 43(19):3689–3701, 2024

[72]

Dimitris Papathanasiou, Konstantinos Demertzis, and Nikos Tziritas. Machine failure prediction using survival analysis. Future Internet, 15(5):153, 2023

[73]

Tim Pearce, Jong-Hyeon Jeong, Jun Zhu, et al. Censored quantile regression neural networks for distribution-free survival analysis. Advances in neural information processing systems, 35:7450–7461, 2022

[74]

África Periáñez, Alain Saas, Anna Guitart, and Colin Magne. Churn prediction in mobile social games: Towards a complete assessment using survival ensembles. In 2016 IEEE international conference on data science and advanced analytics (DSAA), pages 564–573. IEEE, 2016

[75]

Richard Peto and Peter Lee. Weibull distributions for continuous-carcinogenesis experiments. Biometrics, pages 457–470, 1973

[76]

Beatriz Pineiro-Lamas, Ana Lopez-Cheda, Ricardo Cao, Laura Ramos-Alonso, Gabriel Gonzalez-Barbeito, Cayetana Barbeito-Caamano, and Alberto Bouzas-Mosquera. A cardiotoxicity dataset for breast cancer patients. Scientific Data, 10(1):527, 2023

[77]

Sebastian Pölsterl. scikit-survival: A library for time-to-event analysis built on top of scikit-learn. Journal of Machine Learning Research, 21(212):1–6, 2020

[78]

Sebastian Pölsterl, Nassir Navab, and Amin Katouzian. Fast training of support vector machines for survival analysis. In Joint European conference on machine learning and knowledge discovery in databases, pages 243–259. Springer, 2015

[79]

Shi-ang Qi, Neeraj Kumar, Jian-Yi Xu, Jaykumar Patel, Sambasivarao Damaraju, Grace Shen-Tu, and Russell Greiner. Personalized breast cancer onset prediction from lifestyle and health history information. Plos one, 17(12):e0279174, 2022

[80]

Shi-ang Qi, Neeraj Kumar, Mahtab Farrokh, Weijie Sun, Li-Hao Kuan, Rajesh Ranganath, Ricardo Henao, and Russell Greiner. An effective meaningful way to evaluate survival models. Proceedings of machine learning research, 202:28244, 2023