Automated Machine Learning: State-of-The-Art and Open Challenges

Mohamed Maher; Radwa Elshawi; Sherif Sakr

arxiv: 1906.02287 · v2 · pith:J6VLJN3Dnew · submitted 2019-06-05 · 💻 cs.LG · stat.ML



keywords: learning, machine, automating, been, challenges, data, domain, process


With the continuous and vast increase in the amount of data in our digital world, it has been acknowledged that the number of knowledgeable data scientists cannot scale to address the resulting challenges. Thus, there is a crucial need to automate the process of building good machine learning models. In the last few years, several techniques and frameworks have been introduced to tackle the challenge of automating Combined Algorithm Selection and Hyper-parameter tuning (CASH) in the machine learning domain. The main aim of these techniques is to reduce the role of the human in the loop and to fill the gap for non-expert machine learning users by playing the role of the domain expert. In this paper, we present a comprehensive survey of the state-of-the-art efforts in tackling the CASH problem. In addition, we highlight the research work on automating the other steps of the full, complex machine learning pipeline (AutoML), from data understanding to model deployment. Furthermore, we provide comprehensive coverage of the various tools and frameworks that have been introduced in this domain. Finally, we discuss some of the research directions and open challenges that need to be addressed in order to achieve the vision and goals of the AutoML process.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Two-stage Optimization for Machine Learning Workflow
cs.LG · 2019-07 · unverdicted · novelty 4.0

Two-stage optimization for ML workflows that prioritizes data pipeline search over hyperparameter tuning, with time-allocation policies and a specificity metric for pruning.