
arxiv: 2605.16208 · v1 · pith:GEWCVYRC · submitted 2026-05-15 · 📊 stat.ML · cs.LG

A Scalable Nonparametric Continuous-Time Survival Model through Numerical Quadrature

Pith reviewed 2026-05-19 18:32 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords survival analysis, continuous-time modeling, nonparametric survival, deep learning, numerical quadrature, hazard estimation, low-rank adaptation

The pith

QSurv approximates cumulative hazards via Gauss-Legendre quadrature to enable scalable nonparametric continuous-time survival modeling in deep networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces QSurv as a deep learning approach for modeling survival times continuously and nonparametrically. It replaces intractable integrals in the likelihood with a training objective based on Gauss-Legendre numerical quadrature, which approximates the cumulative hazard accurately enough for end-to-end backpropagation. The method also adds time-conditioned low-rank adaptation to let general neural backbones capture non-stationary hazard changes over time. Theoretical bounds on the approximation error are derived, and experiments show competitive performance on tabular and imaging datasets with better instantaneous hazard estimates.

Core claim

By replacing the intractable integral for the cumulative hazard with a Gauss-Legendre quadrature rule, a deep survival model can be trained end-to-end without time discretization or parametric distributional assumptions, while time-conditioned low-rank adaptation allows the network to represent time-varying hazards in complex architectures.
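As a reading aid, the objective behind this claim can be written out in standard right-censored survival notation. The symbols below (event indicator \delta_i, observed time t_i, covariates x_i, network hazard h_\theta, and Gauss-Legendre nodes \xi_k with weights w_k on [-1, 1]) are assumptions of this sketch, not necessarily the paper's own notation.

```latex
\ell(\theta) = \sum_{i=1}^{N} \Big[ \delta_i \log h_\theta(t_i \mid x_i) - H_\theta(t_i \mid x_i) \Big],
\qquad
H_\theta(t \mid x) = \int_0^{t} h_\theta(s \mid x)\, ds
\approx \frac{t}{2} \sum_{k=1}^{n} w_k \, h_\theta\!\Big( \tfrac{t}{2}(\xi_k + 1) \,\Big|\, x \Big).
```

Written this way, each likelihood term needs n additional evaluations of the hazard network, and the weighted sum is differentiable in \theta, which is what allows ordinary backpropagation.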

What carries the argument

Gauss-Legendre numerical quadrature applied to the cumulative hazard integral, paired with time-conditioned low-rank adaptation that modulates network weights dynamically with time.
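The summary does not spell out how the time conditioning is parameterized, so the following is only one plausible reading: a shared base weight plus a low-rank update whose rank-wise gate depends on time. The class name, the tanh gate, and all initial values are hypothetical choices made for illustration.

```python
import numpy as np


class TimeConditionedLowRankLinear:
    """Linear layer with weight W0 + B diag(g(t)) A, where g(t) depends on time.

    Hypothetical parameterization for illustration; not taken from the paper.
    """

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = np.random.default_rng(seed)
        self.W0 = rng.normal(scale=d_in ** -0.5, size=(d_out, d_in))  # shared base weight
        self.B = rng.normal(scale=0.1, size=(d_out, rank))            # low-rank factor
        self.A = rng.normal(scale=0.1, size=(rank, d_in))             # low-rank factor
        self.u = np.ones(rank)    # time slope of the gate (arbitrary initial value)
        self.c = np.zeros(rank)   # time offset of the gate (arbitrary initial value)

    def gate(self, t):
        # One smooth gate value per rank component.
        return np.tanh(self.u * t + self.c)

    def low_rank_update(self, t):
        # Scaling the columns of B by the gate is the same as B @ diag(g(t)).
        return (self.B * self.gate(t)) @ self.A

    def weight(self, t):
        return self.W0 + self.low_rank_update(t)

    def __call__(self, x, t):
        return self.weight(t) @ x


layer = TimeConditionedLowRankLinear(d_in=8, d_out=4, rank=2)
x = np.ones(8)
y_early = layer(x, t=0.5)
y_late = layer(x, t=2.0)
update = layer.low_rank_update(2.0)
print(y_early.shape, np.linalg.matrix_rank(update, tol=1e-10))
```

The same input yields different outputs at different times, while the time-dependent part of the weight never exceeds the chosen rank. Using a smooth gate keeps the layer differentiable in t, which matters for the quadrature error bounds discussed in the referee report.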

If this is right

  • Models can be trained on high-dimensional inputs such as medical images while still producing instantaneous hazard estimates at arbitrary times.
  • The same quadrature objective applies to any neural backbone without requiring custom discretization schemes.
  • Non-stationary hazard patterns become directly interpretable through the time-conditioned adaptation mechanism.
  • Theoretical error bounds allow users to control approximation quality by choosing the quadrature order.
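To make the last point concrete, here is a minimal sketch of how the quadrature order controls the approximation of a cumulative hazard. The hazard function is a made-up smooth example with a closed-form integral, and `cumulative_hazard_gl` is a hypothetical helper name, not an API from the paper.

```python
import numpy as np


def cumulative_hazard_gl(hazard, t, n_nodes):
    """Approximate H(t), the integral of hazard(s) over [0, t], with n-point Gauss-Legendre."""
    xi, w = np.polynomial.legendre.leggauss(n_nodes)  # nodes and weights on [-1, 1]
    s = 0.5 * t * (xi + 1.0)                          # map the nodes to [0, t]
    return 0.5 * t * np.sum(w * hazard(s))


# Hypothetical smooth hazard with a known cumulative hazard (illustration only).
def hazard(s):
    return 0.5 + 0.4 * np.sin(s)


def exact_cumulative_hazard(t):
    return 0.5 * t + 0.4 * (1.0 - np.cos(t))


t = 5.0
errors = {
    n: abs(cumulative_hazard_gl(hazard, t, n) - exact_cumulative_hazard(t))
    for n in (2, 4, 8)
}
for n, err in errors.items():
    print(f"n = {n}: absolute error = {err:.3e}")
```

For a smooth integrand like this one, the error falls off very quickly as nodes are added, so a small fixed n can be enough in practice.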

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The quadrature approach could transfer to other likelihoods that involve integrals over time, such as intensity estimation in point processes.
  • Low-rank time conditioning might improve performance in related tasks like longitudinal regression or dynamic treatment regimes.
  • In clinical settings the resulting hazard curves could support finer-grained risk communication than discrete-time or parametric alternatives.

Load-bearing premise

The numerical quadrature approximates the cumulative hazard integral with high-order accuracy without introducing bias that affects model learning or predictions.

What would settle it

On synthetic data with a known closed-form cumulative hazard, check whether the quadrature-based training produces hazard estimates whose integrated error matches the theoretical bound or deviates systematically from the true function.
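A reduced version of this check can be run without a neural network. The sketch below simulates censored event times from a Gompertz hazard h(t) = exp(a + b t), whose cumulative hazard has a closed form, and compares the quadrature-based negative log-likelihood with the exact one on a parameter grid. The Gompertz family, the parameter values, the censoring time, and the 8-node rule are all assumptions made for illustration. The sketch checks only that the two objectives agree, not the full training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true, censor_time = -1.0, 0.5, 3.0

# Simulate Gompertz event times by inverting H(T) = E with E ~ Exp(1),
# then apply administrative censoring at a fixed time.
e = rng.exponential(size=2000)
event_time = np.log1p(b_true * e / np.exp(a_true)) / b_true
observed = np.minimum(event_time, censor_time)
delta = (event_time <= censor_time).astype(float)  # 1 = event, 0 = censored

xi, w = np.polynomial.legendre.leggauss(8)  # nodes and weights on [-1, 1]


def log_hazard(t, a, b):
    return a + b * t


def nll_quadrature(a, b):
    # Nodes mapped to [0, t_i] for every subject: shape (N, n_nodes).
    s = 0.5 * observed[:, None] * (xi[None, :] + 1.0)
    cum_hazard = 0.5 * observed * (np.exp(log_hazard(s, a, b)) @ w)
    return np.sum(cum_hazard - delta * log_hazard(observed, a, b))


def nll_exact(a, b):
    cum_hazard = np.exp(a) * np.expm1(b * observed) / b  # closed-form H(t)
    return np.sum(cum_hazard - delta * log_hazard(observed, a, b))


a_grid = np.linspace(-1.6, -0.4, 13)
b_grid = np.linspace(0.2, 0.8, 13)
nll_q = np.array([[nll_quadrature(a, b) for b in b_grid] for a in a_grid])
nll_e = np.array([[nll_exact(a, b) for b in b_grid] for a in a_grid])

max_gap = np.max(np.abs(nll_q - nll_e))
best_q = np.unravel_index(np.argmin(nll_q), nll_q.shape)
best_e = np.unravel_index(np.argmin(nll_e), nll_e.shape)
print(f"largest objective gap on the grid: {max_gap:.2e}")
print("grid minimizer (quadrature):", a_grid[best_q[0]], b_grid[best_q[1]])
print("grid minimizer (exact):     ", a_grid[best_e[0]], b_grid[best_e[1]])
```

A systematic deviation between the two objectives, or different minimizers, would point to quadrature bias. A hazard that is less smooth than this one would be a harder test.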

Figures

Figures reproduced from arXiv: 2605.16208 by Chaeyeon Lee, Hyungrok Do, Sehwan Kim.

Figure 1: Predicted instantaneous hazard functions on COVID-19-NY across representative risk ...
Figure 2: Convergence of approximation error and training efficiency on two simulation scenarios.
Figure 3: Instantaneous hazard, cumulative hazard, and survival functions for simulation scenario 1.
Figure 4: Instantaneous hazard, cumulative hazard, and survival functions for simulation scenario 2.
Figure 5: ResNet-18 backbone architecture used for medical imaging survival modeling. The original ...
Figure 6: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for ...
Figure 7: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for ...
Figure 8: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for ...
Figure 9: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for ...
Figure 10: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for ...
Figure 11: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for ...
Original abstract

Flexible continuous-time survival modeling is critical for capturing complex time-varying hazard dynamics in high-dimensional data; however, training such models remains challenging due to the intractable integral required for likelihood estimation. We introduce QSurv, a scalable deep learning framework that enables nonparametric continuous-time modeling without relying on time discretization or restrictive distributional assumptions. We propose a training objective based on Gauss-Legendre numerical quadrature, which approximates the cumulative hazard with high-order accuracy while facilitating efficient end-to-end training via standard backpropagation. Furthermore, to effectively capture non-stationary hazard dynamics in complex architectures, we introduce time-conditioned low-rank adaptation, a mechanism that conditions general neural backbones on time by dynamically modulating weights via low-rank updates. We provide theoretical analysis establishing approximation error bounds for cumulative-hazard evaluation. Comprehensive experiments across synthetic benchmarks, large-scale real-world tabular datasets, and high-dimensional medical imaging tasks demonstrate that QSurv achieves competitive predictive performance with advantages in instantaneous hazard function estimation, enabling more interpretable characterization of time-varying risk patterns.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper introduces QSurv, a deep learning framework for nonparametric continuous-time survival modeling. It approximates the intractable cumulative hazard integral via Gauss-Legendre quadrature to enable end-to-end training without time discretization or parametric assumptions, and introduces time-conditioned low-rank adaptation to capture non-stationary hazard dynamics. Theoretical approximation error bounds are derived, and experiments on synthetic benchmarks, tabular datasets, and medical imaging tasks are reported to show competitive predictive performance with advantages in instantaneous hazard estimation.

Significance. If the quadrature delivers the claimed high-order accuracy without biasing the loss or gradients, and if the time-conditioned adaptation preserves sufficient regularity, the approach would provide a practical route to scalable, flexible continuous-time survival models that avoid discretization artifacts while supporting interpretable time-varying risk characterization on high-dimensional data.

major comments (1)
  1. [Theoretical analysis] Theoretical analysis (error-bound derivation): the claimed high-order accuracy of Gauss-Legendre quadrature for the cumulative-hazard integral requires the integrand (hazard function) to possess bounded higher-order derivatives up to order 2n. The time-conditioned low-rank adaptation modulates network weights dynamically with time, which can produce limited smoothness or rapid local variation; this regularity assumption is not automatically satisfied by the nonparametric architecture and is load-bearing for both the error bounds and the unbiasedness of back-propagated gradients.
minor comments (1)
  1. [Abstract] Abstract: quantitative results, error bars, and specific performance metrics are absent, making it difficult to assess the claimed competitive performance and advantages in hazard estimation.
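For context on the major comment, the classical remainder of an n-point Gauss-Legendre rule on [0, t] makes the dependence on the 2n-th derivative explicit. This is the textbook result for a 2n-times continuously differentiable integrand; whether the paper's bound takes exactly this form is not stated in the material above. Here \xi_k and w_k denote the nodes and weights on [-1, 1], and \zeta is an unknown point in the interval.

```latex
\int_0^{t} h(s)\, ds \;-\; \frac{t}{2} \sum_{k=1}^{n} w_k\, h\!\Big(\tfrac{t}{2}(\xi_k + 1)\Big)
= \frac{t^{2n+1}\,(n!)^4}{(2n+1)\,\big[(2n)!\big]^3}\; h^{(2n)}(\zeta),
\qquad \zeta \in (0, t).
```

If h^{(2n)} is unbounded or does not exist, as for a hazard with a kink, this expression gives no guarantee. The factor t^{2n+1} also shows that longer follow-up times need more nodes for the same accuracy.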

Simulated Author's Rebuttal

1 response · 0 unresolved

We are grateful to the referee for the thoughtful and constructive feedback. Below we provide a point-by-point response to the major comment.

  1. Referee: [Theoretical analysis] Theoretical analysis (error-bound derivation): the claimed high-order accuracy of Gauss-Legendre quadrature for the cumulative-hazard integral requires the integrand (hazard function) to possess bounded higher-order derivatives up to order 2n. The time-conditioned low-rank adaptation modulates network weights dynamically with time, which can produce limited smoothness or rapid local variation; this regularity assumption is not automatically satisfied by the nonparametric architecture and is load-bearing for both the error bounds and the unbiasedness of back-propagated gradients.

    Authors: We thank the referee for pointing out the regularity conditions required for the Gauss-Legendre quadrature error bounds. Our theoretical analysis derives the approximation error under the assumption that the hazard function h(t) is sufficiently smooth, i.e., that its derivatives up to order 2n are bounded, where n is the number of quadrature nodes. While the time-conditioned low-rank adaptation allows the model to capture non-stationary dynamics by modulating weights with time, the overall hazard function is still a composition of neural network layers. With smooth activation functions such as softplus, this composition preserves the necessary differentiability. ReLU is only piecewise linear and does not, so we recommend smooth activations wherever the theoretical guarantees matter. To address the concern that rapid local variations could violate the assumptions, we will revise the manuscript to state these regularity conditions explicitly in the theoretical section and discuss how the low-rank updates can be constrained (e.g., via bounded weights) to maintain smoothness. Regarding the gradients: the quadrature approximation introduces a deterministic error in the loss, but the back-propagated gradients are exact with respect to the approximated objective. The error in the gradients is bounded by the quadrature error bound, ensuring consistency as the number of quadrature points increases. We will add a remark clarifying this point in the revised version. revision: yes
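A small numerical experiment illustrates the regularity issue at stake in this exchange. The two hazards below are hypothetical stand-ins: one built from softplus (smooth) and one from ReLU (kinked at s = 1). The softplus reference value is computed numerically with a 64-node rule rather than in closed form.

```python
import numpy as np


def gl_integral(f, upper, n_nodes):
    """n-point Gauss-Legendre approximation of the integral of f over [0, upper]."""
    xi, w = np.polynomial.legendre.leggauss(n_nodes)
    s = 0.5 * upper * (xi + 1.0)
    return 0.5 * upper * np.sum(w * f(s))


def hazard_softplus(s):
    # Smooth: softplus(s - 1) = log(1 + exp(s - 1)), shifted to stay positive.
    return 0.1 + np.logaddexp(0.0, s - 1.0)


def hazard_relu(s):
    # Kinked at s = 1: the first derivative jumps there.
    return 0.1 + np.maximum(s - 1.0, 0.0)


upper = 3.0
ref_softplus = gl_integral(hazard_softplus, upper, 64)  # numerical reference value
ref_relu = 0.1 * upper + 0.5 * (upper - 1.0) ** 2       # closed-form integral

errors = {}
for n in (4, 8, 16):
    err_softplus = abs(gl_integral(hazard_softplus, upper, n) - ref_softplus)
    err_relu = abs(gl_integral(hazard_relu, upper, n) - ref_relu)
    errors[n] = (err_softplus, err_relu)
    print(f"n = {n:2d}: softplus error = {err_softplus:.2e}, ReLU error = {err_relu:.2e}")
```

With the same number of nodes, the kinked hazard is integrated far less accurately than the smooth one, so the high-order guarantee depends on the activation choice and not on the quadrature rule alone.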

Circularity Check

0 steps flagged

No significant circularity: derivation uses independent numerical quadrature and novel components


The paper defines its core training objective by applying the standard Gauss-Legendre quadrature rule to approximate the cumulative hazard integral, a technique drawn from external numerical analysis whose error properties and implementation are independent of the survival model parameters or fitted values. The time-conditioned low-rank adaptation is explicitly introduced as a new architectural mechanism rather than derived from or defined in terms of model outputs. Theoretical error bounds are stated to follow from the known quadrature remainder term under a smoothness assumption on the hazard function; this is an external regularity condition, not a self-referential redefinition of inputs. No load-bearing step reduces a claimed prediction or uniqueness result to a fitted parameter, prior self-citation, or ansatz smuggled from the authors' own work. The derivation chain therefore remains self-contained against external mathematical and computational benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The abstract supplies limited technical detail; the main structural additions are the quadrature training objective and the time-conditioned adaptation mechanism. No explicit free parameters are named.

axioms (1)
  • domain assumption: Gauss-Legendre quadrature supplies a high-order accurate approximation to the integral defining the cumulative hazard.
    Directly invoked to justify the training objective and end-to-end backpropagation.
invented entities (1)
  • time-conditioned low-rank adaptation (no independent evidence)
    purpose: Dynamically modulate neural network weights via low-rank updates conditioned on time to capture non-stationary hazard dynamics.
    Introduced as a new mechanism to handle complex architectures without full retraining.



Appendix A: Gauss-Legendre Quadrature

Gauss-Legendre quadrature approximates a definite integral by a weighted sum of function evaluations at carefully chosen nonuniform nodes. Unlike grid-based rule...