arxiv: 2605.15633 · v1 · pith:OGVKLSE6new · submitted 2026-05-15 · 📊 stat.ME

Structured Transfer Learning for Survival Risk Stratification in Data-Sparse Clinical Cohorts

Pith reviewed 2026-05-20 16:55 UTC · model grok-4.3

classification 📊 stat.ME
keywords: core-cox, survival, transfer, cohorts, models, risk, across, clinical

The pith

CORE-Cox learns low-rank Cox coefficients across outcomes in a source cohort, then applies regularized adaptation to a target cohort. Under repeated nested cross-validation, mean C-index rose from 0.733 to 0.766 in the UK Biobank Asian cohort and from 0.628 to 0.658 in the MIMIC-IV Asian ICU cohort.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Survival models predict the chance of events like death or disease progression over time. When a clinical group has very few patients or few events, these models become unstable and unreliable. Standard approaches either train only on the small group, which overfits, or borrow a model from a larger group, which may not fit well if risk factors differ. CORE-Cox tries to solve this by first analyzing a big source cohort to find common patterns across several related health outcomes. It uses a low-rank structure to keep the shared patterns simple and stable. Then it makes small adjustments to these patterns for the target cohort, using regularization so the model does not overfit the limited data. The authors tested this on UK Biobank data, using White patients as the large source and Asian patients as the small target, and on MIMIC-IV ICU data with the same White-source, Asian-target split. In both datasets mean discrimination (C-index) improved, and CORE-Cox was best or near-best on most outcomes when compared with target-only training and simpler transfer methods. Gains appeared in eight of the nine outcomes examined, and the model also enriched the top 15% risk group more effectively.

Core claim

CORE-Cox achieved best or near-best discrimination across most outcomes. Mean C-index improved from 0.733 to 0.766 in UK Biobank and from 0.628 to 0.658 in MIMIC-IV, with gains in eight of nine outcomes.
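For readers unfamiliar with the metric, the C-index quoted above is the fraction of comparable patient pairs that a model ranks in the correct order. The following is a minimal pure-Python sketch of Harrell's C-index. The function name `concordance_index` and the toy values are illustrative assumptions, not taken from the paper, and the paper may use a different estimator or tie-handling rule.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # each pair is anchored on a subject with an observed event
        for j in range(n):
            # j is comparable to i only if j was still under follow-up after
            # i's event time; pairs with tied times are skipped in this sketch
            if times[j] > times[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.5, 0.3, 0.1]
c = concordance_index(toy_times, toy_events, toy_risks)
```

On these toy values 7 of the 8 comparable pairs are ranked correctly, so `c` is 0.875. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.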

Load-bearing premise

Shared risk-factor patterns across related outcomes exist in the source cohort, and a low-rank Cox coefficient structure can capture them in a form that transfers usefully to the target cohort after regularized residual correction. This premise is stated in the methods description of the two-stage framework.
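One way to write this premise down is sketched here under assumed notation. The symbols are not from the paper: $p$ covariates, $K$ outcomes, $\ell_k^{S}$ and $\ell_k^{T}$ for the Cox partial log-likelihood of outcome $k$ in the source and target cohorts, rank $r$, penalty weight $\lambda$, and a generic penalty $P$. The summary does not give the exact objective, so the form of $P$ and any additional stage-1 regularization are left open.

```latex
% Stage 1 (source cohort): rank-r coefficient matrix shared across K outcomes
(\hat{U}, \hat{V}) = \arg\min_{U \in \mathbb{R}^{p \times r},\; V \in \mathbb{R}^{K \times r}}
  \; -\sum_{k=1}^{K} \ell_k^{S}\!\big( (U V^{\top})_{\cdot k} \big),
\qquad \hat{B}^{S} = \hat{U}\hat{V}^{\top}

% Stage 2 (target cohort): penalized residual correction around the source fit
\hat{\Delta} = \arg\min_{\Delta \in \mathbb{R}^{p \times K}}
  \; -\sum_{k=1}^{K} \ell_k^{T}\!\big( \hat{B}^{S}_{\cdot k} + \Delta_{\cdot k} \big)
  + \lambda\, P(\Delta),
\qquad \hat{B}^{T} = \hat{B}^{S} + \hat{\Delta}
```

In this reading, a large $\lambda$ shrinks $\hat{\Delta}$ toward zero and recovers direct transfer of the source coefficients, while a small $\lambda$ lets the target cohort move the coefficients freely.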

Figures

Figures reproduced from arXiv: 2605.15633 by Dennis Wang, Doudou Zhou, Hong Pan, Juan Delgado-SanMartin, Junhan Yu, Yurui Chen.

Figure 1: Schematic overview of CORE-Cox. Stage 1 learns shared cross-outcome Cox coe…
Figure 2: C-index comparison across nine outcomes in the UK Biobank Asian cohort. Violins summarize repeated evaluations by method
Figure 3: C-index comparison across nine outcomes in the MIMIC-IV Asian ICU cohort. Violins summarize repeated evaluations by method
Figure 4: Hazard-ratio estimates and 95% confidence intervals for selected NAFLD risk factors in the UK Biobank Asian cohort.

Original abstract

Background: Survival prediction models are often less reliable in clinical groups with limited sample sizes or few outcome events. Target-only models may be unstable, whereas models from larger cohorts may transfer poorly when risk-factor effects differ across populations. We evaluated whether structured transfer learning can improve survival risk stratification in data-sparse cohorts while allowing cohort-specific adaptation.

Methods: We developed the COhort-shared Rank-rEduced Cox model (CORE-Cox), a two-stage framework for multi-outcome survival prediction. CORE-Cox learns shared risk-factor patterns across related outcomes in a larger source cohort via a low-rank Cox coefficient structure, then adapts these patterns to a smaller target cohort through regularized residual correction. We evaluated CORE-Cox in UK Biobank (White source, n=150,093; Asian target, n=2,534) and MIMIC-IV (White ICU source, n=15,997; Asian ICU target, n=672), comparing against target-only Cox, penalized Cox, low-rank multi-task, naive pooling, direct transfer, and single-outcome residual transfer under repeated nested cross-validation.

Results: CORE-Cox achieved best or near-best discrimination across most outcomes. Mean C-index improved from 0.733 to 0.766 in UK Biobank and from 0.628 to 0.658 in MIMIC-IV, with gains in eight of nine outcomes. CORE-Cox also improved top-15% risk enrichment, with hazard-ratio estimates typically intermediate between source-only and target-only models.

Discussion: CORE-Cox offers an interpretable transfer-learning framework for survival risk stratification in data-sparse cohorts, combining shared cross-outcome structure with cohort-specific adaptation. Further validation is needed before use in calibrated absolute-risk prediction or clinical decision-making.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CORE-Cox, a two-stage transfer learning framework for multi-outcome survival prediction. A low-rank Cox model is first fit across related outcomes in a large source cohort to capture shared risk-factor structure; this is then adapted to a smaller target cohort via regularized residual correction. The approach is evaluated under repeated nested cross-validation on UK Biobank (White source, n=150,093; Asian target, n=2,534) and MIMIC-IV (White ICU source, n=15,997; Asian ICU target, n=672), reporting mean C-index gains from 0.733 to 0.766 and from 0.628 to 0.658 respectively, with improvements in eight of nine outcomes versus target-only Cox, penalized Cox, low-rank multi-task, naive pooling, direct transfer, and single-outcome residual transfer baselines.

Significance. If the reported discrimination gains prove robust, CORE-Cox would supply a useful, interpretable tool for survival risk stratification in data-sparse clinical subgroups by exploiting cross-outcome structure while permitting cohort-specific adaptation. The repeated nested CV design and broad baseline comparisons are strengths that support the empirical evaluation.

major comments (2)
  1. [Results] Results section (C-index tables/figures): mean C-index values are reported without standard deviations, confidence intervals, or paired statistical tests from the repeated nested CV. In target cohorts of size 2,534 and especially 672, C-index variance is high; without these quantities it remains open whether the observed gains (0.033 and 0.030) are reliable or could reverse under new draws, which directly affects the central claim of consistent improvement across outcomes.
  2. [Methods] Methods (two-stage framework description): the nested CV procedure for jointly selecting rank r and the residual-correction regularization strength is not specified in sufficient detail to confirm that target-cohort information is not used in hyperparameter tuning, which is load-bearing for claims that the adaptation step improves upon direct transfer.
minor comments (2)
  1. [Abstract] Abstract: the nine outcomes are not enumerated; a short list would improve readability.
  2. [Methods] Notation: the precise form of the regularized residual correction objective (e.g., the penalty term and its weighting) could be written as an explicit equation for clarity.
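
The variability reporting requested in major comment 1 can be sketched as follows, using only the Python standard library. The per-repeat C-index values are invented for illustration and are not results from the paper. The helper `paired_sign_flip_pvalue` is a hypothetical name, and it implements an exact sign-flip permutation test on paired differences, swapped in here as a simple stand-in for a paired test such as the Wilcoxon signed-rank test.

```python
from itertools import product
from statistics import mean, stdev


def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    hits = 0
    total = 0
    # Under the null, each paired difference is equally likely to have
    # either sign, so enumerate all 2^n sign assignments.
    for signs in product((1, -1), repeat=len(diffs)):
        flipped = mean([s * d for s, d in zip(signs, diffs)])
        if abs(flipped) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


# Hypothetical C-index values from 8 CV repeats (made up for this sketch).
baseline = [0.62, 0.64, 0.61, 0.63, 0.65, 0.62, 0.60, 0.64]
proposed = [0.65, 0.66, 0.64, 0.65, 0.66, 0.66, 0.63, 0.67]

diffs = [p - b for p, b in zip(proposed, baseline)]
mean_gain = mean(diffs)   # average paired improvement
sd_gain = stdev(diffs)    # spread of the improvement across repeats
p_value = paired_sign_flip_pvalue(proposed, baseline)
```

Because repeated cross-validation reuses the same patients across repeats, the paired differences are not independent, so a p-value from any such test is likely optimistic and is best read alongside the standard deviation rather than in place of it.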

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which identify key areas for strengthening the statistical reporting and methodological transparency in our manuscript on CORE-Cox. We address each major comment below and have incorporated revisions to enhance the rigor of the evaluation and clarity of the nested cross-validation procedure.

Point-by-point responses
  1. Referee: [Results] Results section (C-index tables/figures): mean C-index values are reported without standard deviations, confidence intervals, or paired statistical tests from the repeated nested CV. In target cohorts of size 2,534 and especially 672, C-index variance is high; without these quantities it remains open whether the observed gains (0.033 and 0.030) are reliable or could reverse under new draws, which directly affects the central claim of consistent improvement across outcomes.

    Authors: We agree that measures of variability and formal statistical comparisons are essential, particularly given the modest target cohort sizes. In the revised manuscript, we will report standard deviations of the C-index across the repeated nested CV iterations for all methods and outcomes. We will also add paired Wilcoxon signed-rank tests (or equivalent) between CORE-Cox and each baseline to evaluate whether the observed mean improvements are statistically significant. These additions will directly address concerns about reliability of the reported gains.

    revision: yes

  2. Referee: [Methods] Methods (two-stage framework description): the nested CV procedure for jointly selecting rank r and the residual-correction regularization strength is not specified in sufficient detail to confirm that target-cohort information is not used in hyperparameter tuning, which is load-bearing for claims that the adaptation step improves upon direct transfer.

    Authors: We appreciate this observation on the need for explicit detail. The current implementation performs hyperparameter selection (rank r and residual regularization strength) exclusively within the inner training folds of the target cohort during nested CV, ensuring the outer test fold remains untouched. In the revision, we will expand the Methods section with a step-by-step description of the nested CV scheme, including how the joint grid search over r and lambda is conducted on training data only, and we will add a supplementary figure or pseudocode to illustrate the data partitioning and tuning process.

    revision: yes
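
The partitioning described in the second response can be sketched as below, assuming a fixed stage-1 source fit and a user-supplied scoring callback. All names here (`k_fold_indices`, `nested_cv`, `fit_and_score`, `dummy_score`) are hypothetical, and the scorer is a placeholder rather than a real Cox fit. The only point of the sketch is that the outer test fold is never seen while the grid over (r, lambda) is searched.

```python
import random


def k_fold_indices(indices, k, seed):
    """Shuffle `indices` and split them into k roughly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv(n_target, grid, fit_and_score, outer_k=5, inner_k=3, seed=0):
    """Return one (best_params, outer_score) pair per outer fold.

    fit_and_score(train_idx, test_idx, params) is a hypothetical callback
    that fits the adaptation step on train_idx and scores it on test_idx.
    """
    results = []
    outer_folds = k_fold_indices(range(n_target), outer_k, seed)
    for o, outer_test in enumerate(outer_folds):
        outer_train = [i for f, fold in enumerate(outer_folds)
                       if f != o for i in fold]
        # Inner folds are carved out of the outer training set only.
        inner_folds = k_fold_indices(outer_train, inner_k, seed + 1 + o)

        def inner_score(params):
            scores = []
            for v, val in enumerate(inner_folds):
                train = [i for f, fold in enumerate(inner_folds)
                         if f != v for i in fold]
                scores.append(fit_and_score(train, val, params))
            return sum(scores) / len(scores)

        best = max(grid, key=inner_score)  # joint search over (r, lambda)
        # Refit on the full outer training set, score once on the held-out fold.
        results.append((best, fit_and_score(outer_train, outer_test, best)))
    return results


calls = []  # records every (train, test) split the scorer is asked to use


def dummy_score(train_idx, test_idx, params):
    calls.append((set(train_idx), set(test_idx)))
    r, lam = params
    return 0.6 + 0.01 * r - 0.005 * lam  # placeholder, not a real C-index


grid = [(r, lam) for r in (1, 2, 3) for lam in (0.01, 0.1, 1.0)]
results = nested_cv(n_target=30, grid=grid, fit_and_score=dummy_score)
```

Recording the splits in `calls` makes the leakage check mechanical: every training set can be asserted disjoint from the set it is scored on, and the outer test folds together cover each target patient exactly once.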

Circularity Check

0 steps flagged

No circularity: empirical evaluation on external held-out cohorts

full rationale

The CORE-Cox method is defined as a two-stage procedure that first fits a low-rank Cox structure on the source cohort and then applies regularized residual correction on the target cohort. This construction is not self-referential; the reported performance gains are measured via repeated nested cross-validation on independent target data (UK Biobank Asian, n=2,534, and MIMIC-IV Asian, n=672) against multiple baselines, with no step in which a fitted parameter is renamed as a prediction or where the central claim reduces to a self-citation chain. The framework remains falsifiable against external benchmarks and does not invoke uniqueness theorems or ansatzes from prior author work as load-bearing justification.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard Cox proportional-hazards assumptions plus the key modeling choice that low-rank structure captures transferable shared risk patterns. Two hyperparameters, the rank and the residual-correction regularization strength, are introduced and must be selected or tuned.

free parameters (2)
  • rank r
    Dimension of the low-rank approximation for shared Cox coefficients across outcomes; value is chosen or cross-validated.
  • regularization strength for residual correction
    Controls the amount of cohort-specific adaptation; value is tuned during training.
axioms (2)
  • [domain assumption] Proportional hazards holds within each cohort
    Standard background assumption for all Cox-based survival models invoked throughout the framework.
  • [domain assumption] Shared low-rank risk patterns exist across outcomes and are transferable after residual correction
    Load-bearing premise stated in the description of the two-stage CORE-Cox method.
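
For reference, the proportional-hazards axiom in the ledger is the standard Cox model form, written here in assumed notation (outcome index $k$, covariate vector $x$, baseline hazard $h_{0k}$, coefficient vector $\beta_k$). The second expression shows why the name applies: the hazard ratio between two patients does not depend on time.

```latex
h_k(t \mid x) = h_{0k}(t)\,\exp\!\big(x^{\top}\beta_k\big),
\qquad
\frac{h_k(t \mid x_1)}{h_k(t \mid x_2)} = \exp\!\big((x_1 - x_2)^{\top}\beta_k\big)
```

If the ratio drifts with follow-up time in either cohort, the coefficients being transferred no longer have a single interpretation, which is why the ledger lists this as an assumption within each cohort rather than across them.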


