Structured Transfer Learning for Survival Risk Stratification in Data-Sparse Clinical Cohorts
Pith reviewed 2026-05-20 16:55 UTC · model grok-4.3
The pith
CORE-Cox learns low-rank Cox coefficients across outcomes in a source cohort, then applies regularized adaptation to a target cohort. Under nested cross-validation, mean C-index rose from 0.733 to 0.766 in the UK Biobank Asian target cohort and from 0.628 to 0.658 in the MIMIC-IV Asian ICU target cohort.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CORE-Cox achieved best or near-best discrimination across most outcomes. Mean C-index improved from 0.733 to 0.766 in UK Biobank and from 0.628 to 0.658 in MIMIC-IV, with gains in eight of nine outcomes.
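For readers unfamiliar with the metric behind these numbers, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the paper's evaluation code. The function name and the toy data are illustrative assumptions, and pairs with tied event times are simply skipped for brevity.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the subject who
    failed earlier was assigned the higher predicted risk.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher = expected to fail sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a subject with an observed event
        for j in range(n):
            if times[i] < times[j]:  # j was still at risk when i failed
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (synthetic, not from the paper): 4 of the 5 comparable pairs are ordered correctly.
toy_c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported gains of roughly 0.03 should be read.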
Load-bearing premise
Shared risk-factor patterns across related outcomes exist in the source cohort, and a low-rank Cox coefficient structure can capture them in a form that transfers usefully to the target cohort after regularized residual correction. This premise is stated in the methods description of the two-stage framework.
Original abstract
Background: Survival prediction models are often less reliable in clinical groups with limited sample sizes or few outcome events. Target-only models may be unstable, whereas models from larger cohorts may transfer poorly when risk-factor effects differ across populations. We evaluated whether structured transfer learning can improve survival risk stratification in data-sparse cohorts while allowing cohort-specific adaptation.
Methods: We developed the COhort-shared Rank-rEduced Cox model (CORE-Cox), a two-stage framework for multi-outcome survival prediction. CORE-Cox learns shared risk-factor patterns across related outcomes in a larger source cohort via a low-rank Cox coefficient structure, then adapts these patterns to a smaller target cohort through regularized residual correction. We evaluated CORE-Cox in UK Biobank (White source, n=150,093; Asian target, n=2,534) and MIMIC-IV (White ICU source, n=15,997; Asian ICU target, n=672), comparing against target-only Cox, penalized Cox, low-rank multi-task, naive pooling, direct transfer, and single-outcome residual transfer under repeated nested cross-validation.
Results: CORE-Cox achieved best or near-best discrimination across most outcomes. Mean C-index improved from 0.733 to 0.766 in UK Biobank and from 0.628 to 0.658 in MIMIC-IV, with gains in eight of nine outcomes. CORE-Cox also improved top-15% risk enrichment, with hazard-ratio estimates typically intermediate between source-only and target-only models.
Discussion: CORE-Cox offers an interpretable transfer-learning framework for survival risk stratification in data-sparse cohorts, combining shared cross-outcome structure with cohort-specific adaptation. Further validation is needed before use in calibrated absolute-risk prediction or clinical decision-making.
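The two-stage structure in the Methods can be illustrated with a small sketch. It assumes p risk factors, K outcomes, and rank r, with source coefficients factored as a p x r matrix times the transpose of a K x r matrix, and target coefficients formed by adding a residual matrix. All function names and the toy values are hypothetical, and the likelihood shown is a plain Breslow-style Cox partial likelihood without tie correction, not necessarily the paper's exact objective.

```python
import math


def matmul_t(U, V):
    """Return U @ V^T for U (p x r) and V (K x r): a p x K coefficient matrix
    whose rank is at most r."""
    return [[sum(u * v for u, v in zip(u_row, v_row)) for v_row in V]
            for u_row in U]


def target_coefficients(B_source, Theta):
    """Target coefficients = fitted source coefficients + residual correction."""
    return [[b + t for b, t in zip(b_row, t_row)]
            for b_row, t_row in zip(B_source, Theta)]


def neg_log_partial_likelihood(beta, X, times, events):
    """Negative Cox partial log-likelihood for a single outcome."""
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i:
            # Risk set: everyone still under observation at time t_i.
            risk_sum = sum(math.exp(eta[j])
                           for j, t_j in enumerate(times) if t_j >= t_i)
            total -= eta[i] - math.log(risk_sum)
    return total


# Toy example (synthetic): 3 risk factors, 2 outcomes, rank 1.
U = [[1.0], [0.5], [-0.5]]
V = [[1.0], [2.0]]
B_source = matmul_t(U, V)                      # shared low-rank structure
Theta = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.0]]   # small cohort-specific shift
B_target = target_coefficients(B_source, Theta)
beta_outcome_0 = [row[0] for row in B_target]  # coefficient column for outcome 0
```

Penalizing the residual matrix toward zero keeps the target model close to the source structure unless the target data support a departure, which is the intuition behind "regularized residual correction".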
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces CORE-Cox, a two-stage transfer learning framework for multi-outcome survival prediction. A low-rank Cox model is first fit across related outcomes in a large source cohort to capture shared risk-factor structure; this is then adapted to a smaller target cohort via regularized residual correction. The approach is evaluated under repeated nested cross-validation on UK Biobank (White source, n=150,093, to Asian target, n=2,534) and MIMIC-IV (White ICU source, n=15,997, to Asian ICU target, n=672), reporting mean C-index gains from 0.733 to 0.766 and from 0.628 to 0.658 respectively, with improvements in eight of nine outcomes versus target-only Cox, penalized Cox, low-rank multi-task, naive pooling, direct transfer, and single-outcome residual transfer baselines.
Significance. If the reported discrimination gains prove robust, CORE-Cox would supply a useful, interpretable tool for survival risk stratification in data-sparse clinical subgroups by exploiting cross-outcome structure while permitting cohort-specific adaptation. The repeated nested CV design and broad baseline comparisons are strengths that support the empirical evaluation.
major comments (2)
- [Results] Results section (C-index tables/figures): mean C-index values are reported without standard deviations, confidence intervals, or paired statistical tests from the repeated nested CV. In target cohorts of size 2,534 and especially 672, C-index variance is high; absence of these quantities leaves open whether the observed gains (0.033 and 0.030) are reliable or could reverse under new draws, directly affecting the central claim of consistent improvement across outcomes.
- [Methods] Methods (two-stage framework description): the nested CV procedure for jointly selecting rank r and the residual-correction regularization strength is not specified in sufficient detail to confirm that target-cohort information is not used in hyperparameter tuning, which is load-bearing for claims that the adaptation step improves upon direct transfer.
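The quantities requested in the first major comment could be computed along the following lines. This is a hedged sketch in standard-library Python, not the authors' analysis: the function names are hypothetical, and the exact signed-rank test shown assumes no zero differences and no tied absolute differences. Because cross-validation repeats share training data, a p-value from such a test is best read as approximate.

```python
import statistics


def paired_summary(scores_a, scores_b):
    """Mean and sample SD of per-repeat differences (method A minus method B)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return statistics.mean(diffs), statistics.stdev(diffs)


def wilcoxon_signed_rank_exact(scores_a, scores_b):
    """Exact two-sided Wilcoxon signed-rank test for paired scores.

    Returns (W_plus, p_value). Raises ValueError on zero or tied
    absolute differences, which this simple version does not handle.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled")
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("tied absolute differences are not handled")
    n = len(diffs)
    rank = {a: i + 1 for i, a in enumerate(abs_sorted)}
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    max_w = n * (n + 1) // 2
    # counts[s] = number of subsets of ranks {1..n} whose sum is s,
    # i.e. the null distribution of W_plus over all 2^n sign patterns.
    counts = [0] * (max_w + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_w, r - 1, -1):
            counts[s] += counts[s - r]
    w_min = min(w_plus, max_w - w_plus)
    tail = sum(counts[: w_min + 1])
    p_value = min(1.0, 2.0 * tail / 2 ** n)
    return w_plus, p_value
```

With only a handful of repeats the smallest attainable two-sided p-value is 2 / 2^n, so the number of nested CV repeats bounds how strong the evidence from this test can be.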
minor comments (2)
- [Abstract] Abstract: the nine outcomes are not enumerated; a short list would improve readability.
- [Methods] Notation: the precise form of the regularized residual correction objective (e.g., the penalty term and its weighting) could be written as an explicit equation for clarity.
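As an illustration of what such an explicit equation might look like, one possible form is sketched below. It is an assumption, not the paper's stated objective: the squared Frobenius (ridge) penalty is chosen only for concreteness, and the symbols $\ell_k^{(t)}$ (Cox partial log-likelihood for outcome $k$ in the target cohort), $\hat{b}_k^{(s)}$ and $\theta_k$ (the $k$-th columns of the fitted source coefficients and of the residual), and $\lambda$ (regularization strength) are introduced here for the sketch.

```latex
\hat{\Theta}
  = \arg\min_{\Theta}\;
    -\sum_{k=1}^{K} \ell_k^{(t)}\!\left(\hat{b}_k^{(s)} + \theta_k\right)
    + \lambda \,\lVert \Theta \rVert_F^2 ,
\qquad
\hat{B}^{(t)} = \hat{B}^{(s)} + \hat{\Theta}
```

Under this form, a large $\lambda$ shrinks the residual toward zero and recovers direct transfer, while $\lambda \to 0$ approaches an unpenalized target fit offset by the source coefficients.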
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which identify key areas for strengthening the statistical reporting and methodological transparency in our manuscript on CORE-Cox. We address each major comment below and have incorporated revisions to enhance the rigor of the evaluation and clarity of the nested cross-validation procedure.
Point-by-point responses
- Referee: [Results] Results section (C-index tables/figures): mean C-index values are reported without standard deviations, confidence intervals, or paired statistical tests from the repeated nested CV. In target cohorts of size 2,534 and especially 672, C-index variance is high; absence of these quantities leaves open whether the observed gains (0.033 and 0.030) are reliable or could reverse under new draws, directly affecting the central claim of consistent improvement across outcomes.
  Authors: We agree that measures of variability and formal statistical comparisons are essential, particularly given the modest target cohort sizes. In the revised manuscript, we will report standard deviations of the C-index across the repeated nested CV iterations for all methods and outcomes. We will also add paired Wilcoxon signed-rank tests (or equivalent) between CORE-Cox and each baseline to evaluate whether the observed mean improvements are statistically significant. These additions will directly address concerns about reliability of the reported gains.
  Revision: yes
- Referee: [Methods] Methods (two-stage framework description): the nested CV procedure for jointly selecting rank r and the residual-correction regularization strength is not specified in sufficient detail to confirm that target-cohort information is not used in hyperparameter tuning, which is load-bearing for claims that the adaptation step improves upon direct transfer.
  Authors: We appreciate this observation on the need for explicit detail. The current implementation performs hyperparameter selection (rank r and residual regularization strength) exclusively within the inner training folds of the target cohort during nested CV, ensuring the outer test fold remains untouched. In the revision, we will expand the Methods section with a step-by-step description of the nested CV scheme, including how the joint grid search over r and lambda is conducted on training data only, and we will add a supplementary figure or pseudocode to illustrate the data partitioning and tuning process.
  Revision: yes
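The partitioning described in this response can be sketched generically as follows. This is an illustration of nested cross-validation with a joint grid over rank and regularization strength, not the authors' implementation. The callback `fit_and_score(train_idx, test_idx, r, lam)` is a hypothetical placeholder for fitting on the training indices and returning a validation score such as the C-index, and the fold counts are arbitrary defaults.

```python
import itertools
import random


def k_fold_indices(indices, k, rng):
    """Shuffle the indices and split them into k roughly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv(n_target, ranks, lambdas, fit_and_score,
              outer_k=5, inner_k=3, seed=0):
    """Return one held-out score per outer fold.

    Hyperparameters are chosen using inner folds drawn only from the
    outer training portion, so the outer test fold never influences tuning.
    """
    rng = random.Random(seed)
    outer_folds = k_fold_indices(range(n_target), outer_k, rng)
    outer_scores = []
    for held_out in outer_folds:
        held_out_set = set(held_out)
        outer_train = [i for i in range(n_target) if i not in held_out_set]
        inner_folds = k_fold_indices(outer_train, inner_k, rng)
        best = None  # (mean inner score, r, lam)
        for r, lam in itertools.product(ranks, lambdas):
            scores = []
            for val in inner_folds:
                val_set = set(val)
                inner_train = [i for i in outer_train if i not in val_set]
                scores.append(fit_and_score(inner_train, val, r, lam))
            mean_score = sum(scores) / len(scores)
            if best is None or mean_score > best[0]:
                best = (mean_score, r, lam)
        _, best_r, best_lam = best
        # Refit on the full outer training portion, score once on the held-out fold.
        outer_scores.append(fit_and_score(outer_train, held_out, best_r, best_lam))
    return outer_scores
```

Repeating the whole procedure with different seeds would give the "repeated" nested CV scores from which standard deviations and paired comparisons can be derived.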
Circularity Check
No circularity: empirical evaluation on external held-out cohorts
The CORE-Cox method is defined as a two-stage procedure that first fits a low-rank Cox structure on the source cohort and then applies regularized residual correction on the target cohort. This construction is not self-referential; the reported performance gains are measured via repeated nested cross-validation on independent target data (UK Biobank Asian n=2534 and MIMIC-IV Asian n=672) against multiple baselines, with no step in which a fitted parameter is renamed as a prediction or where the central claim reduces to a self-citation chain. The framework remains falsifiable against external benchmarks and does not invoke uniqueness theorems or ansatzes from prior author work as load-bearing justification.
Axiom & Free-Parameter Ledger
free parameters (2)
- rank r
- regularization strength for residual correction
axioms (2)
- Domain assumption: the proportional hazards assumption holds within each cohort
- Domain assumption: shared low-rank risk patterns exist across outcomes and are transferable after residual correction
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: CORE-Cox first learns shared risk-factor patterns across related outcomes in a larger source cohort using a low-rank Cox coefficient structure, and then adapts these patterns to a smaller cohort of interest through a regularized residual correction.
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: $B^{(s)} = U^{(s)} (V^{(s)})^{\top}$ ... $B^{(t)} = \hat{B}^{(s)} + \Theta$ ... penalized Cox partial-likelihood objective
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.