Transfer Learning and Machine Learning for Training Five Year Survival Prognostic Models in Early Breast Cancer
Pith reviewed 2026-05-18 11:59 UTC · model grok-4.3
The pith
Transfer learning from PREDICT v3 and random survival forests improve five-year survival predictions in early breast cancer when data is missing or shifted.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Transfer learning by fine-tuning PREDICT v3, de-novo random survival forests, and weighted ensemble integration reduce the integrated calibration index from 0.042 in the original PREDICT v3 to at most 0.007 on MA.27 data while keeping discrimination comparable or slightly higher, and these models can produce predictions for every patient even when 23.8 to 25.8 percent of cases lack the inputs PREDICT v3 requires.
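To make the calibration figures concrete, here is a minimal sketch of how an ICI-style number can be computed. It rests on loudly simplified assumptions: the five-year outcome is treated as a fully observed binary event (censoring is ignored), and a moving average over neighbours sorted by predicted risk stands in for the smoothed calibration curve (typically loess for binary outcomes) that the ICI is defined on. The function name and the window size k are illustrative and do not come from the paper.

```python
def integrated_calibration_index(predicted, observed, k=50):
    """Mean absolute gap between predicted risk and a smoothed observed event rate.

    predicted: event probabilities in [0, 1]
    observed:  1 if the event occurred, 0 otherwise (censoring ignored: assumption)
    k:         neighbours used by the moving-average smoother (assumption)
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("predicted and observed must be non-empty and equally long")

    # Sort patients by predicted risk so that neighbours have similar predictions.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    p = [predicted[i] for i in order]
    y = [observed[i] for i in order]

    n = len(p)
    k = min(k, n)
    half = k // 2
    total_gap = 0.0
    for i in range(n):
        # Window of k neighbours around patient i, clipped at both ends.
        start = max(0, min(i - half, n - k))
        window = y[start:start + k]
        smoothed_rate = sum(window) / len(window)
        total_gap += abs(p[i] - smoothed_rate)
    return total_gap / n
```

A value near 0 means predicted and observed risks agree on average; a model that predicts 0.9 for patients who never have the event would score close to 0.9.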
What carries the argument
Fine-tuning the pre-trained PREDICT v3 tool on MA.27 trial data combined with training random survival forests and extreme gradient boosting models, then integrating predictions through a weighted sum.
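The weighted-sum integration can be sketched as follows. The source states only that predictions are combined through a weighted sum; the model names, the weights, and the choice to renormalise over whichever models returned a prediction (for example when PREDICT v3 inputs are missing) are assumptions made for illustration.

```python
def weighted_ensemble(predictions, weights):
    """Combine per-model five-year survival probabilities into one estimate.

    predictions: dict of model name -> probability, or None if that model
                 could not produce a prediction (assumption: None marks missing)
    weights:     dict of model name -> non-negative weight (hypothetical values)
    """
    available = {m: p for m, p in predictions.items() if p is not None}
    if not available:
        raise ValueError("no model produced a prediction")

    # Renormalise so the weights of the available models sum to one.
    total_weight = sum(weights[m] for m in available)
    if total_weight <= 0:
        raise ValueError("weights of available models must sum to a positive value")
    return sum(weights[m] * p for m, p in available.items()) / total_weight


# Hypothetical weights; the paper's fitted weights are not given in this summary.
example_weights = {"predict_v3_finetuned": 0.5, "rsf": 0.3, "xgb": 0.2}
```

With all three models available, the result is a plain weighted average. If one model returns None, the remaining weights are rescaled so the patient still receives an estimate.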
If this is right
- All patients receive a survival estimate even when information required by PREDICT v3 is absent.
- Age, nodal status, tumor grade, and tumor size remain the strongest predictors across the new models.
- Calibration gains appear in SEER validation but are not confirmed in the TEAM trial.
- The methods are positioned for use when a dataset shift between training and deployment populations is expected.
Where Pith is reading between the lines
- These models could be embedded in electronic health record systems to deliver real-time estimates without requiring complete data entry.
- Prospective studies could test whether the improved calibration changes actual treatment decisions or patient outcomes.
- Adding genomic variables to the ensemble when they become available might produce further gains in settings that already collect them.
Load-bearing premise
The MA.27 trial patients are similar enough to those in the TEAM and SEER validation groups that performance gains will appear in new settings.
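One way to probe this premise is to quantify how far apart two cohorts are on each covariate. The sketch below uses standardized mean differences, a technique swapped in here for illustration and not described in the paper. The function names are hypothetical, and any numbers passed in would have to come from the actual cohort tables.

```python
import math


def smd_continuous(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference for a continuous covariate (e.g. age)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    if pooled_sd == 0:
        raise ValueError("both cohorts have zero variance")
    return (mean_a - mean_b) / pooled_sd


def smd_proportion(p_a, p_b):
    """Standardized difference for a binary covariate (e.g. node-positive)."""
    pooled_sd = math.sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2)
    if pooled_sd == 0:
        # Both proportions are 0 or 1: identical means no difference.
        if p_a == p_b:
            return 0.0
        raise ValueError("difference is undefined when both variances are zero")
    return (p_a - p_b) / pooled_sd
```

A commonly used rule of thumb treats absolute values above roughly 0.1 as a sign of meaningful imbalance. Larger values between the training cohort and a validation cohort would suggest a stronger dataset shift.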
What would settle it
A new external dataset with comparable rates of missing PREDICT inputs where the new models show no improvement in calibration index or discrimination over the original PREDICT v3 would undermine the central claim.
Original abstract
Prognostic information is essential for decision-making in breast cancer management. Recent trials have predominantly focused on genomic prognostication tools, even though clinicopathological prognostication is less costly and more widely accessible. Machine learning (ML), transfer learning and ensemble integration offer opportunities to build robust prognostication frameworks. We evaluate this potential to improve survival prognostication in breast cancer by comparing de-novo ML, transfer learning from a pre-trained prognostic tool and ensemble integration. Data from the MA.27 trial was used for model training, with external validation on the TEAM trial and a SEER cohort. Transfer learning was applied by fine-tuning the pre-trained prognostic tool PREDICT v3, de-novo ML included Random Survival Forests and Extreme Gradient Boosting, and ensemble integration was realized through a weighted sum of model predictions. Transfer learning, de-novo RSF, and ensemble integration improved calibration in MA.27 over the pre-trained model (ICI reduced from 0.042 in PREDICT v3 to ≤0.007) while discrimination remained comparable (AUC increased from 0.738 in PREDICT v3 to 0.744-0.799). Invalid PREDICT v3 predictions were observed in 23.8-25.8% of MA.27 individuals due to missing information. In contrast, ML models and ensemble integration could predict survival regardless of missing information. Across all models, patient age, nodal status, pathological grading and tumor size had the highest SHAP values, indicating their importance for survival prognostication. External validation in SEER, but not in TEAM, confirmed the benefits of transfer learning, RSF and ensemble integration. This study demonstrates that transfer learning, de-novo RSF, and ensemble integration can improve prognostication in situations where relevant information for PREDICT v3 is lacking or where a dataset shift is likely.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript evaluates transfer learning from PREDICT v3, de-novo Random Survival Forests (RSF) and XGBoost models, and ensemble integration for five-year survival prediction in early breast cancer. Using MA.27 trial data for training, the approaches demonstrate reduced integrated calibration index (ICI from 0.042 to ≤0.007) and comparable or improved AUC (0.738 to 0.744-0.799) compared to PREDICT v3, while handling missing data that invalidates 23.8-25.8% of PREDICT predictions. External validation on TEAM and SEER cohorts shows benefits in SEER but not TEAM, supporting the claim that these methods improve prognostication under missing information or dataset shifts.
Significance. If the findings hold, this study offers a practical advancement in accessible clinicopathological prognostication for breast cancer, addressing limitations of tools like PREDICT v3 in real-world data with missing values or shifts. The empirical validation across independent cohorts (MA.27 training, TEAM and SEER validation) and concrete metrics provide a solid foundation for potential clinical utility. The handling of missing data and feature importance via SHAP add value.
major comments (2)
- [External validation] External validation section: The reported lack of benefit for transfer learning, RSF, and ensemble methods in the TEAM cohort (while present in SEER) directly undermines the central claim that these approaches reliably improve prognostication under dataset shift; the manuscript must include a quantitative comparison of covariate distributions, treatment patterns, and missingness mechanisms across MA.27, TEAM, and SEER to explain the discrepancy.
- [Methods] Methods section: Insufficient detail is provided on hyperparameter tuning for the RSF and XGBoost models, the cross-validation procedure used during training on MA.27, and the determination of ensemble weights; without these, it is difficult to assess whether the observed ICI reductions are robust or sensitive to implementation choices.
minor comments (2)
- [Abstract] Abstract: The abstract mentions AUC gains but does not specify the statistical tests or confidence intervals used to compare discrimination across models.
- [Results] Results: The SHAP analysis identifies age, nodal status, grading, and tumor size as top features; clarify whether these rankings are consistent across all models or vary by method.
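The confidence intervals mentioned in the first minor comment could be obtained with a percentile bootstrap. The sketch below is one possible approach, not the authors' method. It treats the five-year outcome as a fully observed binary label (censoring is ignored), assumes higher scores mean higher risk of the event, and uses hypothetical function names.

```python
import random


def auc(scores, labels):
    """Probability that a random event case outranks a random non-event case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    # Ties between an event and a non-event count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    auc(scores, labels)  # fails early if one outcome class is missing
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo_idx = int(alpha / 2 * n_boot)
    hi_idx = max(lo_idx, int((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]
```

Comparing two models on the same patients would normally resample patients once per replicate and compute the AUC difference within each replicate, so that the pairing is preserved.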
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive review of our manuscript. We address each of the major comments below and have revised the manuscript accordingly to improve its clarity and completeness.
Point-by-point responses
- Referee: [External validation] External validation section: The reported lack of benefit for transfer learning, RSF, and ensemble methods in the TEAM cohort (while present in SEER) directly undermines the central claim that these approaches reliably improve prognostication under dataset shift; the manuscript must include a quantitative comparison of covariate distributions, treatment patterns, and missingness mechanisms across MA.27, TEAM, and SEER to explain the discrepancy.
  Authors: We thank the referee for highlighting this important point. The discrepancy between TEAM and SEER validation results does not undermine our central claim but rather illustrates the context-dependent nature of improvements under dataset shift. MA.27 and TEAM are both large randomized clinical trials with similar patient populations, treatment standards, and data collection protocols, leading to minimal dataset shift and thus limited additional benefit from the ML approaches. In contrast, SEER is a population-based registry with greater heterogeneity, missing data patterns, and potential shifts in demographics and treatments, where the benefits are more pronounced. To address the request, we will add a new supplementary table (or section) providing quantitative comparisons of key covariate distributions (e.g., means and proportions for age, tumor size, nodal status, grade), treatment patterns (e.g., chemotherapy and endocrine therapy rates), and missingness rates across the three cohorts. This will be accompanied by statistical tests for differences where appropriate. We believe this addition will strengthen the interpretation of the external validation results.
  Revision: yes
- Referee: [Methods] Methods section: Insufficient detail is provided on hyperparameter tuning for the RSF and XGBoost models, the cross-validation procedure used during training on MA.27, and the determination of ensemble weights; without these, it is difficult to assess whether the observed ICI reductions are robust or sensitive to implementation choices.
  Authors: We agree with the referee that greater methodological transparency is essential. In the revised Methods section, we will provide full details on the hyperparameter tuning process, including the specific ranges or grids explored for key parameters in Random Survival Forests (e.g., number of trees, mtry, node size) and XGBoost (e.g., learning rate, max depth, subsample). We will describe the cross-validation procedure, specifying that we employed 5-fold cross-validation on the MA.27 training data to select optimal hyperparameters via grid search or random search, with the integrated calibration index (ICI) or concordance index as the optimization metric. For the ensemble, we will explain that weights were determined by optimizing a weighted combination on a held-out validation subset of MA.27 to minimize the ICI, with the final weights reported. Additionally, we will include a sensitivity analysis demonstrating that the reported ICI improvements remain consistent across reasonable variations in these choices. These revisions will allow readers to better evaluate the robustness of our findings.
  Revision: yes
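The weight-selection step described in the second response can be illustrated with a coarse grid search over a single blending weight for two models. This is a sketch under stated assumptions, not the authors' procedure: the Brier score is swapped in as the default loss to keep the example short (the response names the ICI), outcomes are treated as fully observed binary events, and all function names are hypothetical.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)


def select_blend_weight(preds_a, preds_b, observed, loss=brier_score, steps=20):
    """Pick w in [0, 1] minimising loss(w * preds_a + (1 - w) * preds_b).

    preds_a, preds_b: held-out event probabilities from two models
    observed:         held-out outcomes (1 = event, 0 = no event)
    loss:             any callable taking (predictions, outcomes); pass a
                      calibration metric here to mirror the described setup
    steps:            number of grid intervals between 0 and 1 (assumption)
    """
    best_weight, best_loss = 0.0, float("inf")
    for step in range(steps + 1):
        w = step / steps
        blended = [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]
        current = loss(blended, observed)
        if current < best_loss:  # keep the first weight that achieves the minimum
            best_weight, best_loss = w, current
    return best_weight, best_loss
```

The weights should be chosen on data that the component models were not trained on, as the response indicates. Otherwise the blend tends to favour whichever model overfits the training set most.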
Circularity Check
No significant circularity: empirical training on MA.27 with external validation on independent TEAM and SEER cohorts
Full rationale
The paper performs standard supervised ML training (RSF, XGBoost, transfer learning from external PREDICT v3, ensemble) on the MA.27 trial dataset and evaluates calibration (ICI) and discrimination (AUC) on held-out external cohorts (TEAM, SEER). No equations are presented that define a quantity in terms of itself or rename a fitted parameter as a prediction. No load-bearing self-citation chain or uniqueness theorem from the same authors is invoked to justify model choices. Performance improvements are measured against an external pre-trained tool and independent validation sets rather than reducing to quantities defined solely by the paper's own fitted values. This is a self-contained empirical comparison against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- Hyperparameters for RSF and XGBoost models
- Ensemble weights
axioms (2)
- domain assumption: Standard right-censoring assumptions hold for the survival data in the MA.27, TEAM, and SEER cohorts
- domain assumption: PREDICT v3 provides a suitable pre-trained base model for transfer learning in this domain
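To show what the right-censoring assumption means in practice, here is a minimal Kaplan-Meier sketch that estimates survival at a fixed horizon such as five years. Censored patients count as at risk until they leave follow-up and never count as deaths. The function name and the example inputs are illustrative; this is not the paper's code.

```python
def kaplan_meier(times, events, horizon):
    """Estimate the survival probability S(horizon).

    times:   follow-up time per patient (same unit as horizon)
    events:  1 if the patient died at that time, 0 if censored
    horizon: time point of interest, e.g. 5 for five-year survival
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return survival
```

If censoring depended on prognosis, for instance if sicker patients dropped out earlier, this estimate would be biased. That is why the ledger lists the assumption explicitly.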
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: Transfer learning, de-novo RSF, and ensemble integration improved calibration in MA.27 over the pre-trained model (ICI reduced from 0.042 in PREDICT v3 to ≤0.007)
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: External validation in SEER, but not in TEAM, confirmed the benefits
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.