Aligning Data-Driven Predictors with Allocation: A Decision-Focused Approach to Survival Analysis
Pith reviewed 2026-06-28 15:16 UTC · model grok-4.3
The pith
Survival predictors optimized for C-index accuracy can produce allocation outcomes no better than random selection, but optimizing them for NDCG instead provides performance guarantees.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Any algorithm that relies on survival predictors optimized for standard metrics such as the C-index can yield arbitrarily poor outcomes when used for allocation, failing to guarantee utility better than uniform random selection. A decision-focused learning approach based on optimizing NDCG translates to guarantees on allocation performance, and a bootstrapping method allows existing survival models to be optimized for this metric while addressing right censorship.
What carries the argument
NDCG optimization of survival models via bootstrapping, which directly ties ranking quality to allocation utility and handles censored data in evaluation.
If this is right
- Allocation decisions based on these predictors achieve utility strictly better than random selection.
- On historical US heart transplant data the method produces 50-100% higher NDCG scores than baselines.
- The NDCG gains correspond to tens of thousands of additional life years gained annually in transplant allocation.
- The framework extends to other decision-making settings that use survival or ranking predictions.
Where Pith is reading between the lines
- The same NDCG-based alignment could be tested on non-medical allocation problems such as resource scheduling.
- Combining NDCG with other ranking-aware losses might improve robustness to different forms of censorship.
- Deployment on live allocation systems would require checking whether the bootstrapping step scales to larger datasets.
Load-bearing premise
That NDCG optimization of survival models via the proposed bootstrapping method provides allocation performance guarantees under right-censorship.
What would settle it
A test on held-out transplant data where an NDCG-optimized model produces allocation utility no higher than random selection would falsify the translation from NDCG to allocation guarantees.
Figures
read the original abstract
Machine learning predictors have become essential tools for guiding automated decision making. However, a major misalignment persists: predictive models are typically optimized in terms of standard statistical metrics in isolation from the algorithmic tasks they inform. We highlight this incongruity in the high-stakes domain of organ allocation by demonstrating that any algorithm relying on (even highly accurate) survival predictors optimized for standard metrics -- such as the Concordance index (C-index) -- can yield arbitrarily poor outcomes when used for allocation, failing to guarantee utility better than a uniform random selection. To bridge the gap between survival analysis and policy optimization, we introduce a decision-focused learning approach based on optimizing normalized discounted cumulative gain (NDCG), a mainstay metric in information retrieval. We establish the utility of NDCG in survival analysis by proving that it translates to guarantees on the performance of allocation. Empirically, we propose a bootstrapping approach to optimize the NDCG of existing survival models. Unlike prior work, we also address the challenge of right censorship when evaluating ranking. On historical heart transplant data from the US, our method dramatically boosts the NDCG of baseline models by 50-100%, which translates to tens of thousands of additional life years gained annually when deployed for transplant allocation. We anticipate that our framework will find broader applications in decision making with predictions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that survival predictors optimized for standard metrics such as the C-index can produce allocation policies (e.g., organ transplant prioritization) whose expected utility is arbitrarily close to that of uniform random selection. It introduces a decision-focused framework that instead optimizes normalized discounted cumulative gain (NDCG), proves that NDCG optimization yields allocation guarantees strictly better than random, proposes a bootstrapping procedure to optimize existing survival models for NDCG while addressing right-censorship in ranking evaluation, and reports 50-100% NDCG gains on US heart-transplant data that translate to tens of thousands of additional life-years annually.
Significance. If the NDCG-to-allocation guarantee holds under realistic right-censorship and the empirical translation is robust, the work supplies a concrete mechanism for aligning predictive models with downstream policy utility in high-stakes allocation domains. The explicit proof relating NDCG to allocation performance and the explicit treatment of censorship in ranking evaluation are strengths that distinguish the contribution from purely empirical decision-focused learning papers.
major comments (2)
- [Proof section / Theorem on NDCG-allocation equivalence] Proof of NDCG utility (likely §3 or Theorem 1): the argument that NDCG optimization guarantees allocation performance better than random must be shown to survive right-censorship. The current statement appears to rely on fully observed event times or on censoring that does not alter top-k ordering; if the proof only covers the uncensored case or assumes independent non-informative censoring, the guarantee does not transfer to the organ-allocation setting where censoring is common and potentially informative.
- [Empirical evaluation / bootstrapping description] Bootstrapping procedure and censorship handling (empirical section): the method for optimizing NDCG on existing models must specify exactly how censored observations are treated when computing the ranking metric used for gradient or surrogate optimization. Without this detail it is impossible to verify that the reported 50-100% NDCG lift is not an artifact of the particular imputation or weighting scheme chosen for the censored cases.
minor comments (2)
- [Method] Notation for the NDCG surrogate loss should be introduced once and used consistently; the current text mixes the ideal NDCG definition with the differentiable approximation without a clear mapping.
- [Experiments / discussion] The abstract states 'tens of thousands of additional life years gained annually'; the corresponding calculation (population size, life-year conversion factor, confidence interval) should appear in the main text or appendix so readers can assess sensitivity to the assumed allocation policy.
Simulated Author's Rebuttal
We thank the referee for these constructive comments on the proof's robustness under censoring and the need for explicit detail on the bootstrapping procedure. We address both points below and will revise the manuscript to strengthen clarity and reproducibility.
read point-by-point responses
-
Referee: [Proof section / Theorem on NDCG-allocation equivalence] Proof of NDCG utility (likely §3 or Theorem 1): the argument that NDCG optimization guarantees allocation performance better than random must be shown to survive right-censorship. The current statement appears to rely on fully observed event times or on censoring that does not alter top-k ordering; if the proof only covers the uncensored case or assumes independent non-informative censoring, the guarantee does not transfer to the organ-allocation setting where censoring is common and potentially informative.
Authors: Theorem 1 establishes the NDCG-to-allocation guarantee under the standard survival model with non-informative right-censoring (the maintained assumption throughout the paper and in the organ-allocation literature). The proof operates on the observed data distribution and shows that any ranking with higher NDCG yields strictly higher expected allocation utility than random selection; the NDCG itself is computed on the censored data via the ranking metric defined in Section 4. We will add an explicit remark after the theorem stating the non-informative censoring assumption and a short paragraph discussing the sensitivity of the guarantee to informative censoring. revision: partial
-
Referee: [Empirical evaluation / bootstrapping description] Bootstrapping procedure and censorship handling (empirical section): the method for optimizing NDCG on existing models must specify exactly how censored observations are treated when computing the ranking metric used for gradient or surrogate optimization. Without this detail it is impossible to verify that the reported 50-100% NDCG lift is not an artifact of the particular imputation or weighting scheme chosen for the censored cases.
Authors: Section 4 and the supplement describe the use of inverse-probability-of-censoring weighting (IPCW) when evaluating NDCG on right-censored data: each observation's contribution to the discounted cumulative gain is reweighted by the inverse of the estimated censoring survival function at the observed time. The bootstrapping procedure then optimizes this IPCW-NDCG surrogate. We will move the precise IPCW formula and the pseudocode for the weighted NDCG computation into the main text (currently only referenced) so that the optimization target is fully specified. revision: yes
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper's theoretical claim rests on an independent proof that NDCG optimization yields allocation performance guarantees, separate from any fitted parameters or prior self-citations. The empirical component applies bootstrapping to optimize NDCG on pre-existing survival models and directly addresses right-censorship in ranking evaluation, without reducing predictions to inputs by construction or relying on load-bearing self-citations. No self-definitional, fitted-input, or ansatz-smuggling patterns appear in the described derivation chain.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Optimizing NDCG leads to allocation performance guarantees
Reference graph
Works this paper leans on
-
[1]
Berrevoets, Jeroen and Jordon, James and Bica, Ioana and van der Schaar, Mihaela , booktitle=NeurIPS, year=. Organ
-
[2]
Operations Research , volume=
Fairness, efficiency, and flexibility in organ allocation for kidney transplantation , author=. Operations Research , volume=
-
[3]
2024 , month =
Update on Continuous Distribution of Hearts , howpublished =. 2024 , month =
2024
-
[4]
Learning to rank: from pairwise approach to listwise approach , booktitle=ICML, author=
-
[5]
Dynamic matching via weighted myopia with application to kidney exchange , author=
-
[6]
and Sandholm, Tuomas , booktitle=AAAI, year=
Dickerson, John P. and Sandholm, Tuomas , booktitle=AAAI, year=
-
[7]
Matthew Fahrbach and Zhiyi Huang and Runzhou Tao and Morteza Zadimoghaddam , title =
-
[8]
1999 , publisher=
Modern information retrieval , author=. 1999 , publisher=
1999
-
[9]
2008 , publisher=
Introduction to information retrieval , author=. 2008 , publisher=
2008
-
[10]
Biometrika , volume=
Asymptotic calibration , author=. Biometrika , volume=
-
[11]
Biometrika , volume=
Concordance probability and discriminatory power in proportional hazards regression , author=. Biometrika , volume=
-
[12]
Uno, Hajime and Cai, Tianxi and Pencina, Michael J and D'Agostino, Ralph B and Wei, Lee-Jen , journal=. On the
-
[13]
Journal of the American Medical Association (JAMA) , volume=
Evaluating the yield of medical tests , author=. Journal of the American Medical Association (JAMA) , volume=
-
[14]
Communications of the ACM , volume=
Algorithms with predictions , author=. Communications of the ACM , volume=
-
[15]
Learning queueing policies for organ transplantation allocation using interpretable counterfactual survival analysis , author=
-
[16]
Learning matching representations for individualized organ transplantation allocation , author=
-
[17]
Personalized donor-recipient matching for organ transplantation , author=
-
[18]
Closing the loop in medical decision support by understanding clinical decision-making: A case study on organ transplantation , author=
-
[19]
Transplant Centers , year =
-
[20]
Ke, Guolin and Meng, Qi and Finley, Thomas and Wang, Taifeng and Chen, Wei and Ma, Weidong and Ye, Qiwei and Liu, Tie-Yan , booktitle=NeurIPS, year=
-
[21]
Annals of Statistics , pages=
Greedy function approximation: a gradient boosting machine , author=. Annals of Statistics , pages=
-
[22]
An introduction to
Fawcett, Tom , journal=. An introduction to
-
[23]
Journal of the Royal Statistical Society: Series B (Methodological) , volume=
Regression models and life-tables , author=. Journal of the Royal Statistical Society: Series B (Methodological) , volume=
-
[24]
Journal of Machine Learning Research , volume=
scikit-survival: A library for time-to-event analysis built on top of scikit-learn , author=. Journal of Machine Learning Research , volume=
-
[25]
Posttransplant outcomes , year =
-
[26]
International Statistical Review , pages=
Analysis of survival data under the proportional hazards model , author=. International Statistical Review , pages=
-
[27]
Statistics in Medicine , volume=
Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors , author=. Statistics in Medicine , volume=
-
[28]
Lee, Changhee and Zame, William and Yoon, Jinsung and van der Schaar, Mihaela , booktitle=AAAI, year=
-
[29]
The international heart transplant survival algorithm (
Nilsson, Johan and Ohlsson, Mattias and H. The international heart transplant survival algorithm (. PloS one , volume=
-
[30]
Journal of Cardiac Surgery , volume=
Using machine learning to improve survival prediction after heart transplantation , author=. Journal of Cardiac Surgery , volume=
-
[31]
International Conference on Learning and Intelligent Optimization (LION) , year=
Sequential model-based optimization for general algorithm configuration , author=. International Conference on Learning and Intelligent Optimization (LION) , year=
-
[32]
Circulation , volume=
Policy Optimization for Dynamic Heart Transplant Allocation , author=. Circulation , volume=
-
[33]
Management Science , volume=
Dynamic matching: Characterizing and achieving constant regret , author=. Management Science , volume=
-
[34]
Theoretical Economics , volume=
Free riding and participation in large scale, multi-hospital kidney exchange , author=. Theoretical Economics , volume=
-
[35]
Games and Economic Behavior , volume=
Design and analysis of multi-hospital kidney exchange mechanisms using random graphs , author=. Games and Economic Behavior , volume=
-
[36]
Incentive-compatible kidney exchange in a slightly semi-random model , author=
-
[37]
Operations Research , volume=
On the optimality of greedy policies in dynamic matching , author=. Operations Research , volume=
-
[38]
Algorithms with calibrated machine learning predictions , author=
-
[39]
2025 , month =
OPTN , title =. 2025 , month =
2025
-
[40]
JAMA cardiology , volume=
Evolving trends in adult heart transplant with the 2018 heart allocation policy change , author=. JAMA cardiology , volume=
2018
-
[41]
Changes in the
Shore, Supriya and Golbus, Jessica R and Aaronson, Keith D and Nallamothu, Brahmajee K , journal=. Changes in the. 2020 , publisher=
2020
-
[42]
Burges, Christopher JC , journal=
-
[43]
Learning to rank for optimal treatment allocation under resource constraints , author=
-
[44]
ACM SIGecom Exchanges , volume=
Online matching: A brief survey , author=. ACM SIGecom Exchanges , volume=
-
[45]
Online vertex-weighted bipartite matching and single-bid budgeted allocations , author=
-
[46]
ACM Conference on Knowledge Discovery and Data Mining (KDD) , year=
Optimizing search engines using clickthrough data , author=. ACM Conference on Knowledge Discovery and Data Mining (KDD) , year=
-
[47]
American Journal of Transplantation , volume=
Understanding the Transplant Community's Priorities in Heart Allocation , author=. American Journal of Transplantation , volume=
-
[48]
Predicting clicks: estimating the click-through rate for new ads , author=
-
[49]
Machine Intelligence 15 , pages=
A Framework for Behavioural Cloning , author=. Machine Intelligence 15 , pages=
-
[50]
Efficient reductions for imitation learning , author=
-
[51]
, author=
Large margin methods for structured and interdependent output variables. , author=. Journal of Machine Learning Research , volume=
-
[52]
American Journal of Transplantation , volume=
Risk prediction models for survival after heart transplantation: a systematic review , author=. American Journal of Transplantation , volume=
-
[53]
Frontiers in Cardiovascular Medicine , volume=
Donor shortage in heart transplantation: How can we overcome this challenge? , author=. Frontiers in Cardiovascular Medicine , volume=
-
[54]
Operations Research , volume=
Reshaping national organ allocation policy , author=. Operations Research , volume=
-
[55]
An early investigation of outcomes with the new 2018 donor heart allocation system in the
Cogswell, Rebecca and John, Ranjit and Estep, Jerry D and Duval, Sue and Tedford, Ryan J and Pagani, Francis D and Martin, Cindy M and Mehra, Mandeep R , journal=. An early investigation of outcomes with the new 2018 donor heart allocation system in the
2018
-
[56]
Manufacturing & Service Operations Management , volume=
Patient choice in kidney allocation: The role of the queueing discipline , author=. Manufacturing & Service Operations Management , volume=
-
[57]
Donti and J
Priya L. Donti and J. Zico Kolter and Brandon Amos , title =
-
[58]
The well-calibrated
Dawid, A Philip , journal=. The well-calibrated
-
[59]
Performative prediction , author=
-
[60]
Calibration in deep learning: A survey of the state-of-the-art , author=. arXiv:2308.01222 , year=
-
[61]
Necessary Optimality Conditions for Integrated Learning and Optimization Problem in Contextual Optimization , author=. arXiv:2601.16581 , year=
-
[62]
Online Decision-Focused Learning , author=. arXiv:2505.13564 , year=
-
[63]
Decision-focused learning: Foundations, state of the art, benchmark and future opportunities , author=
-
[64]
predict, then optimize
Smart "predict, then optimize" , author=. Management Science , volume=
-
[65]
Melding the data-decisions pipeline: Decision-focused learning for combinatorial optimization , author=
-
[66]
Journal of Hepatology , volume=
Transplant benefit-based offering of deceased donor livers in the United Kingdom , author=. Journal of Hepatology , volume=. 2024 , publisher=
2024
-
[67]
Journal of the American Medical Association (JAMA) , volume=
Development and validation of a risk score predicting death without transplant in adult heart transplant candidates , author=. Journal of the American Medical Association (JAMA) , volume=
-
[68]
An optimal algorithm for on-line bipartite matching , author=
-
[69]
The adwords problem: Online keyword matching with budgeted bidders under random permutations , author=
-
[70]
Automated channel abstraction for advertising auctions , author=
-
[71]
Scalable segment abstraction method for advertising campaign admission and inventory allocation optimization , author=
-
[72]
2005 , booktitle=
Optimize-and-dispatch architecture for expressive ad auctions , author=. 2005 , booktitle=
2005
-
[73]
Online stochastic optimization in the large: Application to kidney exchange , author=
-
[74]
2009 , organization=
Online ad assignment with free disposal , author=. 2009 , organization=
2009
-
[75]
Adwords and generalized online matching , author=
-
[76]
ACM Computing Surveys (CSUR) , volume=
Reinforcement learning in healthcare: A survey , author=. ACM Computing Surveys (CSUR) , volume=. 2021 , publisher=
2021
-
[77]
Learning-based planning for improving science return of
Breitfeld, Abigail and Candela, Alberto and Delfa, Juan and Kangaslahti, Akseli and Zilberstein, Itai and Chien, Steve and Wettergreen, David , booktitle=. Learning-based planning for improving science return of
-
[78]
ACM Computing Surveys (CSUR) , volume=
Imitation learning: A survey of learning methods , author=. ACM Computing Surveys (CSUR) , volume=. 2017 , publisher=
2017
-
[79]
Robotics and Autonomous Systems , volume=
A survey of robot learning from demonstration , author=. Robotics and Autonomous Systems , volume=
-
[80]
The International Journal of Robotics Research , volume=
Imitation learning for agile autonomous driving , author=. The International Journal of Robotics Research , volume=
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.