Generalizability of predictive models for intensive care unit patients
abstract
A large volume of research has considered the creation of predictive models for clinical data; however, much existing literature reports results using only a single source of data. In this work, we evaluate the performance of models trained on the publicly available eICU Collaborative Research Database. We show that cross-validation using many distinct centers provides a reasonable estimate of model performance in new centers. We further show that a single model trained across centers transfers well to distinct hospitals, even compared to a model retrained using hospital-specific data. Our results motivate the use of multi-center datasets for model development and highlight the need for data sharing among hospitals to maximize model performance.
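The abstract does not spell out how the cross-center splits were built. As a rough illustration only, the sketch below assumes a leave-one-center-out scheme, in which each hospital is held out in turn and the model is fit on the remaining hospitals. The function name `leave_one_center_out` and the toy hospital labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_one_center_out(
    center_ids: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_center, train_indices, test_indices) per center.

    Every row from the held-out center goes to the test fold, so no
    hospital contributes data to both sides of a split.
    """
    # Group row indices by the center (hospital) they belong to.
    rows_by_center: Dict[Hashable, List[int]] = defaultdict(list)
    for row, center in enumerate(center_ids):
        rows_by_center[center].append(row)

    # Hold out one center at a time; train on all the others.
    for held_out, test_rows in rows_by_center.items():
        train_rows = [
            row
            for center, rows in rows_by_center.items()
            if center != held_out
            for row in rows
        ]
        yield held_out, train_rows, test_rows


# Hypothetical toy data: one hospital label per patient stay.
centers = ["A", "A", "B", "C", "C", "C"]
splits = list(leave_one_center_out(centers))
```

Splitting by center rather than by patient is what lets the held-out score stand in for performance at an unseen hospital; a random row-level split would mix stays from the same hospital into both folds.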
fields
cs.LG (1)
years
2019 (1)
verdicts
UNVERDICTED (1)
representative citing papers
Reproducibility in Machine Learning for Health
Systematic evaluation of over 100 ML4H papers finds poorer reproducibility than other ML fields, driven by limited data and code access, and offers recommendations to data providers, publishers, and researchers.