Automated Machine Learning in Practice: State of the Art and Recent Results

Anastasia Varlet; Christian Westermann; Katharina Rombach; Lukas Tuggener; Mohammadreza Amirian; Stefan Lörwald; Thilo Stadelmann

arXiv: 1907.08392 · v1 · pith:WA74GXO6new · submitted 2019-07-19 · cs.LG · cs.AI · stat.ML


Pith reviewed 2026-05-24 19:15 UTC · model grok-4.3

classification cs.LG · cs.AI · stat.ML

keywords AutoML · automated machine learning · benchmarks · state of the art · practical applicability · business context · machine learning pipelines


The pith

AutoML methods automate model building and deliver competitive results on business tasks per current benchmarks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reviews the state of the art in automated machine learning with emphasis on practical use in business settings. It surveys leading algorithms and reports recent benchmark results comparing their performance. Growing demand for machine learning skills drives interest in these tools because they aim to reduce reliance on scarce expert labor. The focus on applicability helps identify which systems can handle real deployment without extensive manual work. Readers gain a map of current options and evidence on how well they perform on representative tasks.

Core claim

This paper gives an overview of the state of the art in AutoML with a focus on practical applicability in a business context, and provides recent benchmark results on the most important AutoML algorithms.

What carries the argument

Empirical benchmarks comparing leading AutoML frameworks on datasets chosen to reflect business use cases.

If this is right

Organizations can apply AutoML tools to build predictive models with reduced need for specialized data scientists.
Certain AutoML frameworks show consistent accuracy across preprocessing, feature selection, and model choice steps.
Benchmark results identify which algorithms handle typical business data volumes and feature types effectively.
The overview supports decisions on tool selection by showing trade-offs in speed, accuracy, and ease of use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Wider adoption could speed up digitization projects in sectors with limited ML talent pools.
Extending benchmarks to time-series or unstructured data common in industry would test broader utility.
Combining AutoML outputs with domain-specific constraints might improve results on regulated business problems.

Load-bearing premise

The selected AutoML algorithms and benchmark tasks are representative of the most important methods and real-world business use cases.

What would settle it

A new set of benchmarks on diverse proprietary business datasets where all surveyed AutoML systems underperform manual expert tuning by a wide margin would undermine the claim of practical applicability.

Figures

Figures reproduced from arXiv: 1907.08392 by Anastasia Varlet, Christian Westermann, Katharina Rombach, Lukas Tuggener, Mohammadreza Amirian, Stefan Lörwald, Thilo Stadelmann.

**Figure 1.** Schematic overview of the Portfolio Hyperband workflow. Portfolio Hyperband [13], [43]: Inspired by PoSH Autosklearn [43] that combines a portfolio of initial configurations with successive halving (SH) and Bayesian optimization, we tested a system that combines a portfolio with Hyperband [13]. Our goal was to combine the portfolio variant of metalearning, which is very simple and fast, with Hyperband th…
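The caption describes seeding a bandit-style search with a portfolio of known-good configurations. As a rough illustration of that idea only, the sketch below runs a single successive-halving bracket over a portfolio plus a few random configurations. It is not the authors' implementation: the toy loss, the `PORTFOLIO` values, and all function names are hypothetical, and full Hyperband [13] would run several such brackets with different trade-offs between the number of configurations and the starting budget.

```python
import math
import random


def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round and grow the budget by eta."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Lower loss is better; sorted() is stable, so ties keep their input order.
        ranked = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        survivors = ranked[: max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0]


def toy_loss(config, budget):
    """Hypothetical stand-in for training a model with the given budget.

    The loss is lowest at learning_rate = 0.1 and shrinks as the budget grows.
    """
    return abs(math.log10(config["learning_rate"]) + 1.0) + 1.0 / budget


# Hypothetical portfolio: configurations assumed to have done well on earlier datasets.
PORTFOLIO = [{"learning_rate": lr} for lr in (0.3, 0.1, 0.03)]


def portfolio_successive_halving(n_random=6, seed=0):
    """Run one bracket over the portfolio plus n_random randomly sampled configurations."""
    rng = random.Random(seed)
    sampled = [{"learning_rate": 10 ** rng.uniform(-4, 0)} for _ in range(n_random)]
    return successive_halving(PORTFOLIO + sampled, toy_loss)


best = portfolio_successive_halving()
print(best)
```

Placing the portfolio first means that well-tested configurations are evaluated at the smallest budget alongside the random samples, so a strong prior configuration can survive to the final round without any meta-feature computation.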

Original abstract

A main driver behind the digitization of industry and society is the belief that data-driven model building and decision making can contribute to higher degrees of automation and more informed decisions. Building such models from data often involves the application of some form of machine learning. Thus, there is an ever growing demand in work force with the necessary skill set to do so. This demand has given rise to a new research topic concerned with fitting machine learning models fully automatically - AutoML. This paper gives an overview of the state of the art in AutoML with a focus on practical applicability in a business context, and provides recent benchmark results on the most important AutoML algorithms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a survey that organizes existing AutoML tools around business use and adds benchmark numbers, but the numbers come with no methods or selection details.


The paper's core is an overview of AutoML methods aimed at practical business settings, plus some benchmark results on prominent algorithms. It collects work on Bayesian optimization, evolutionary approaches, and related techniques, which gives a reader a starting map of what is available without digging through dozens of papers. That organizing function is the main thing it does, and it is straightforward enough for that purpose. The business focus is a reasonable framing even if it does not lead to new technical claims.

The benchmarks are presented as recent results on the most important algorithms, but the text supplies no description of the datasets, metrics, run settings, or error analysis. There is also no account of how the algorithms were selected or why they cover the main paradigms. That is the weak point: without an inclusion protocol or an argument for representativeness, the claim that these are the key methods for business contexts cannot be checked. The paper does not introduce new mechanisms or resolve open questions; it documents what already exists.

A reader looking for orientation on current tools might get some value, but anyone needing reproducible numbers or a systematic review will have to look elsewhere. I would bring this to a reading group only if the group is specifically scanning applied tool surveys. I would not cite it in my own work. It is coherent on its own terms as a survey, so it deserves peer review once the benchmark section is expanded with methods and selection criteria.

Referee Report

2 major / 0 minor

Summary. The paper provides an overview of the state of the art in Automated Machine Learning (AutoML), with emphasis on practical applicability in business contexts, and presents recent benchmark results on the most important AutoML algorithms.

Significance. If the benchmark results are reproducible and the selected algorithms and tasks are defensible as representative, the work could help practitioners identify suitable AutoML tools; however, the current lack of methodological detail and selection criteria reduces its value as a reliable reference.

major comments (2)

[Abstract] The claim of providing 'recent benchmark results on the most important AutoML algorithms' is load-bearing for the paper's contribution, yet the manuscript supplies no description of the experimental methodology, chosen datasets, performance metrics, statistical tests, or error analysis, rendering the results unverifiable.
The central claim that the paper covers 'the most important AutoML algorithms' and benchmarks relevant to business use cases requires a documented selection protocol; no inclusion/exclusion criteria, systematic literature search description, or argument for coverage of dominant paradigms (Bayesian optimization, evolutionary methods, meta-learning, NAS) or business constraints (imbalance, missing data, interpretability) is supplied.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback highlighting the need for greater methodological transparency. We address each major comment below and will revise the manuscript to strengthen these aspects.


Referee: [Abstract] The claim of providing 'recent benchmark results on the most important AutoML algorithms' is load-bearing for the paper's contribution, yet the manuscript supplies no description of the experimental methodology, chosen datasets, performance metrics, statistical tests, or error analysis, rendering the results unverifiable.

Authors: We agree that the abstract's emphasis on benchmark results requires supporting methodological detail to ensure verifiability. In the revision we will insert a new subsection (likely Section 4 or equivalent) that explicitly describes the experimental methodology, the chosen datasets and their characteristics, the performance metrics employed, the statistical tests used for comparisons, and any error or sensitivity analysis conducted.

revision: yes

Referee: The central claim that the paper covers 'the most important AutoML algorithms' and benchmarks relevant to business use cases requires a documented selection protocol; no inclusion/exclusion criteria, systematic literature search description, or argument for coverage of dominant paradigms (Bayesian optimization, evolutionary methods, meta-learning, NAS) or business constraints (imbalance, missing data, interpretability) is supplied.

Authors: We concur that a documented selection protocol is needed to substantiate coverage of the most important algorithms and business-relevant constraints. The revised manuscript will add a dedicated subsection outlining the literature search strategy, explicit inclusion/exclusion criteria, and a rationale showing how the selected methods represent the dominant paradigms (Bayesian optimization, evolutionary methods, meta-learning, NAS) while addressing practical business issues such as class imbalance, missing data, and interpretability requirements.

revision: yes

Circularity Check

0 steps flagged

Survey paper with no internal derivations exhibits no circularity


This manuscript is a literature survey and benchmark report on AutoML methods drawn from external sources. It contains no mathematical derivations, predictions, or fitted parameters that could reduce to quantities defined within the paper itself. The selection of algorithms and tasks, while potentially open to critique on representativeness, does not constitute circularity under the defined criteria, as no load-bearing claim reduces by construction to self-defined inputs. The paper is self-contained against external benchmarks and literature.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper; the abstract introduces no new free parameters, axioms, or invented entities.



Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · 3 internal anchors

[1] M. Braschler, K. Stockinger, and T. Stadelmann (Eds.), Applied Data Science: Lessons Learned for the Data-Driven Business. Springer International Publishing, 2019
[2] B. B. Meier, I. Elezi, M. Amirian, O. Dürr, and T. Stadelmann, “Learning neural models for end-to-end clustering,” in IAPR Workshop on Artificial Neural Networks in Pattern Recognition, pp. 126–138, Springer, 2018
[3] F. Hutter, L. Kotthoff, and J. Vanschoren, “Automatic machine learning: methods, systems, challenges,” Challenges in Machine Learning, 2019
[4] C. Thornton, F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Auto-weka: Combined selection and hyperparameter optimization of classification algorithms,” in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 847–855, ACM, 2013
[5] G. A. Susto, A. Schirru, S. Pampuri, S. McLoone, and A. Beghi, “Machine learning for predictive maintenance: A multiple classifier approach,” IEEE Transactions on Industrial Informatics, vol. 11, no. 3, pp. 812–820, 2015
[6] H. Li, D. Parikh, Q. He, B. Qian, Z. Li, D. Fang, and A. Hampapur, “Improving rail network velocity: A machine learning approach to predictive maintenance,” Transportation Research Part C: Emerging Technologies, vol. 45, pp. 17–26, 2014
[7] E. Figueiredo, G. Park, C. R. Farrar, K. Worden, and J. Figueiras, “Machine learning algorithms for damage detection under operational and environmental variability,” Structural Health Monitoring, vol. 10, no. 6, pp. 559–572, 2011
[8] E. Stühler, S. Braune, F. Lionetto, Y. Heer, P. Kassraian-Fard, E. Jules, C. Westermann, A. Bergmann, P. van Hvell, and N. S. Group, “Framework for personalized prediction of treatment response in relapsing remitting multiple sclerosis,” BMC medical research methodology, submitted
[9] M. Handzic, F. Tjandrawibawa, and J. Yeo, “How neural networks can help loan officers to make better informed application decisions,” Informing Science, vol. 6, pp. 97–109, 2003
[10] S. Viaene, G. Dedene, and R. A. Derrig, “Auto claim fraud detection using bayesian learning neural networks,” Expert Systems with Applications, vol. 29, no. 3, pp. 653–666, 2005
[11] J. M. Pérez, J. Muguerza, O. Arbelaitz, I. Gurrutxaga, and J. I. Martín, “Consolidated tree classifier learning in a car insurance fraud detection domain with class imbalance,” in International Conference on Pattern Recognition and Image Analysis, pp. 381–389, Springer, 2005
[12] G. Tsoumakas, “A survey of machine learning techniques for food sales prediction,” Artificial Intelligence Review, pp. 1–7, 2018
[13] L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A novel bandit-based approach to hyperparameter optimization,” The Journal of Machine Learning Research, vol. 18, no. 1, pp. 6765–6816, 2017
[14] J. Duan, Z. Zeng, A. Oprea, and S. Vasudevan, “Automated generation and selection of interpretable features for enterprise security,” in 2018 IEEE International Conference on Big Data (Big Data), pp. 1258–1265, IEEE, 2018
[15] M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, B. Shillingford, and N. De Freitas, “Learning to learn by gradient descent by gradient descent,” in Advances in Neural Information Processing Systems, pp. 3981–3989, 2016
[16] B. Zoph and Q. V. Le, “Neural architecture search with reinforcement learning,” in Proceedings of International Conference on Learning Representations (ICLR), 2017
[17] M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum, and F. Hutter, “Efficient and robust automated machine learning,” in Advances in Neural Information Processing Systems, pp. 2962–2970, 2015
[18] R. Gaudel and M. Sebag, “Feature selection as a one-player game,” in International Conference on Machine Learning, pp. 359–366, 2010
[19] G. Katz, E. C. R. Shin, and D. Song, “Explorekit: Automatic feature generation and selection,” in Data Mining (ICDM), 2016 IEEE 16th International Conference on, pp. 979–984, IEEE, 2016
[20] F. Nargesian, H. Samulowitz, U. Khurana, E. B. Khalil, and D. Turaga, “Learning feature engineering for classification,” in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI, vol. 17, pp. 2529–2535, 2017
[21] A. Kaul, S. Maheshwary, and V. Pudi, “Autolearn - automated feature generation and selection,” in Data Mining (ICDM), 2017 IEEE International Conference on, pp. 217–226, IEEE, 2017
[22] N. Meinshausen and P. Bühlmann, “Stability selection,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 72, no. 4, pp. 417–473, 2010
[23] B. Pfahringer, H. Bensusan, and C. G. Giraud-Carrier, “Meta-learning by landmarking various learning algorithms,” in ICML, pp. 743–750, 2000
[24] A. Klein, S. Falkner, J. T. Springenberg, and F. Hutter, “Learning curve prediction with bayesian neural networks,” 2016
[25] K. Eggensperger, M. Lindauer, and F. Hutter, “Neural networks for predicting algorithm runtime distributions,” in IJCAI, pp. 1442–1448, 2018
[26] P. B. Brazdil and C. Soares, “A comparison of ranking methods for classification algorithm selection,” in European conference on machine learning, pp. 63–75, Springer, 2000
[27] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997
[28] Y. Chen, M. W. Hoffman, S. G. Colmenarejo, M. Denil, T. P. Lillicrap, M. Botvinick, and N. de Freitas, “Learning to learn without gradient descent by gradient descent,” in Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 748–756, JMLR.org, 2017
[29] C. Cortes and V. Vapnik, “Support-vector networks,” Machine learning, vol. 20, no. 3, pp. 273–297, 1995
[30] T. Elsken, J.-H. Metzen, and F. Hutter, “Simple and efficient architecture search for convolutional neural networks,” in Proceedings of International Conference on Learning Representations (ICLR), 2018
[31] E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. V. Le, and A. Kurakin, “Large-scale evolution of image classifiers,” in Proceedings of the 34th International Conference on Machine Learning (D. Precup and Y. W. Teh, eds.), vol. 70 of Proceedings of Machine Learning Research, (International Convention Centre, Sydney, Australia), pp. 29...
[32] Y. He, J. Lin, Z. Liu, H. Wang, L.-J. Li, and S. Han, “Amc: Automl for model compression and acceleration on mobile devices,” in Proceedings of the European Conference on Computer Vision (ECCV), pp. 784–800, 2018
[33] I. Guyon, L. Sun-Hosoya, M. Boullé, H. Escalante, S. Escalera, Z. Liu, D. Jajetic, B. Ray, M. Saeed, M. Sebag, et al., “Analysis of the automl challenge series 2015-2018,” 2017
[34] E. Brochu, V. M. Cora, and N. De Freitas, “A tutorial on bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning,” arXiv preprint arXiv:1012.2599, 2010
[35] F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Sequential model-based optimization for general algorithm configuration,” in International Conference on Learning and Intelligent Optimization, pp. 507–523, Springer, 2011
[36] M. Feurer, J. T. Springenberg, and F. Hutter, “Using meta-learning to initialize bayesian optimization of hyperparameters,” in Proceedings of the 2014 International Conference on Meta-learning and Algorithm Selection-Volume 1201, pp. 3–10, Citeseer, 2014
[37] K. Jamieson and A. Talwalkar, “Non-stochastic best arm identification and hyperparameter optimization,” in Artificial Intelligence and Statistics, pp. 240–248, 2016
[38] M. Jaderberg, V. Dalibard, S. Osindero, W. M. Czarnecki, J. Donahue, A. Razavi, O. Vinyals, T. Green, I. Dunning, K. Simonyan, et al., “Population based training of neural networks,” arXiv preprint arXiv:1711.09846, 2017
[39] D. Maclaurin, D. Duvenaud, and R. Adams, “Gradient-based hyperparameter optimization through reversible learning,” in International Conference on Machine Learning, pp. 2113–2122, 2015
[40] W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone, Genetic programming: an introduction, vol. 1. Morgan Kaufmann San Francisco, 1998
[41] B. Schölkopf, “The kernel trick for distances,” in Advances in neural information processing systems, pp. 301–307, 2001
[42] T. Swearingen, W. Drevo, B. Cyphers, A. Cuesta-Infante, A. Ross, and K. Veeramachaneni, “Atm: A distributed, collaborative, scalable system for automated machine learning,” in IEEE International Conference on Big Data, 2017
[43] M. Feurer, K. Eggensperger, S. Falkner, M. Lindauer, and F. Hutter, “Practical automated machine learning for the automl challenge 2018,” in International Workshop on Automatic Machine Learning at ICML, 2018
[44] T. Stadelmann, M. Amirian, I. Arabaci, M. Arnold, G. F. Duivesteijn, I. Elezi, M. Geiger, S. Lörwald, B. B. Meier, K. Rombach, et al., “Deep learning in the wild,” in IAPR Workshop on Artificial Neural Networks in Pattern Recognition, pp. 17–38, Springer, 2018
[45] R. S. Olson, R. J. Urbanowicz, P. C. Andrews, N. A. Lavender, J. H. Moore, et al., “Automating biomedical data science through tree-based pipeline optimization,” in European Conference on the Applications of Evolutionary Computation, pp. 123–137, Springer, 2016
[46] J. Vanschoren, J. N. van Rijn, B. Bischl, and L. Torgo, “Openml: Networked science in machine learning,” SIGKDD Explorations, vol. 15, no. 2, pp. 49–60, 2013
[47] K. Li and J. Malik, “Learning to optimize,” arXiv preprint arXiv:1606.01885, 2016

[1] [1]

Braschler, K

M. Braschler, K. Stockinger, and T. Stadelmann (Eds.), Applied Data Science—Lessons Learned for the Data-Driven Business . Springer International Publishing, 2019

work page 2019

[2] [2]

Learning neural models for end-to-end clustering,

B. B. Meier, I. Elezi, M. Amirian, O. D ¨urr, and T. Stadelmann, “Learning neural models for end-to-end clustering,” in IAPR Workshop on Artiﬁcial Neural Networks in Pattern Recognition , pp. 126–138, Springer, 2018

work page 2018

[3] [3]

Automatic machine learn- ing: methods, systems, challenges,

F. Hutter, L. Kotthoff, and J. Vanschoren, “Automatic machine learn- ing: methods, systems, challenges,” Challenges in Machine Learning , 2019

work page 2019

[4] [4]

Auto- weka: Combined selection and hyperparameter optimization of clas- siﬁcation algorithms,

C. Thornton, F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Auto- weka: Combined selection and hyperparameter optimization of clas- siﬁcation algorithms,” in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining , pp. 847–855, ACM, 2013

work page 2013

[5] [5]

Machine learning for predictive maintenance: A multiple classiﬁer approach,

G. A. Susto, A. Schirru, S. Pampuri, S. McLoone, and A. Beghi, “Machine learning for predictive maintenance: A multiple classiﬁer approach,” IEEE Transactions on Industrial Informatics, vol. 11, no. 3, pp. 812–820, 2015

work page 2015

[6] [6]

Improving rail network velocity: A machine learning approach to predictive maintenance,

H. Li, D. Parikh, Q. He, B. Qian, Z. Li, D. Fang, and A. Hampapur, “Improving rail network velocity: A machine learning approach to predictive maintenance,” Transportation Research Part C: Emerging Technologies, vol. 45, pp. 17–26, 2014

work page 2014

[7] [7]

Machine learning algorithms for damage detection under operational and environmental variability,

E. Figueiredo, G. Park, C. R. Farrar, K. Worden, and J. Figueiras, “Machine learning algorithms for damage detection under operational and environmental variability,” Structural Health Monitoring , vol. 10, no. 6, pp. 559–572, 2011

work page 2011

[8] [8]

Frame- work for personalized prediction of treatment response in relapsing remitting multiple sclerosis,

E. St ¨uhler, S. Braune, F. Lionetto, Y . Heer, P. Kassraian-Fard, E. Jules, C. Westermann, A. Bergmann, P. van Hvell, and N. S. Group, “Frame- work for personalized prediction of treatment response in relapsing remitting multiple sclerosis,” BMC medical research methodology , submitted

work page

[9] [9]

How neural networks can help loan ofﬁcers to make better informed application decisions,

M. Handzic, F. Tjandrawibawa, and J. Yeo, “How neural networks can help loan ofﬁcers to make better informed application decisions,” Informing Science, vol. 6, pp. 97–109, 2003

work page 2003

[10] [10]

Auto claim fraud detec- tion using bayesian learning neural networks,

S. Viaene, G. Dedene, and R. A. Derrig, “Auto claim fraud detec- tion using bayesian learning neural networks,” Expert Systems with Applications, vol. 29, no. 3, pp. 653–666, 2005

work page 2005

[11] [11]

Consolidated tree classiﬁer learning in a car insurance fraud detection domain with class imbalance,

J. M. P ´erez, J. Muguerza, O. Arbelaitz, I. Gurrutxaga, and J. I. Mart ´ın, “Consolidated tree classiﬁer learning in a car insurance fraud detection domain with class imbalance,” in International Conference on Pattern Recognition and Image Analysis , pp. 381–389, Springer, 2005

work page 2005

[12] [12]

A survey of machine learning techniques for food sales prediction,

G. Tsoumakas, “A survey of machine learning techniques for food sales prediction,” Artiﬁcial Intelligence Review , pp. 1–7, 2018

work page 2018

[13] [13]

Hyperband: A novel bandit-based approach to hyperparameter opti- mization,

L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A novel bandit-based approach to hyperparameter opti- mization,” The Journal of Machine Learning Research , vol. 18, no. 1, pp. 6765–6816, 2017

work page 2017

[14] [14]

Automated generation and selection of interpretable features for enterprise security,

J. Duan, Z. Zeng, A. Oprea, and S. Vasudevan, “Automated generation and selection of interpretable features for enterprise security,” in 2018 IEEE International Conference on Big Data (Big Data) , pp. 1258– 1265, IEEE, 2018

work page 2018

[15] [15]

Learning to learn by gradient descent by gradient descent,

M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, B. Shillingford, and N. De Freitas, “Learning to learn by gradient descent by gradient descent,” in Advances in Neural Information Processing Systems , pp. 3981–3989, 2016

work page 2016

[16] [16]

Neural architecture search with reinforcement learning,

B. Zoph and Q. V . Le, “Neural architecture search with reinforcement learning,” in Proceedings of International Conference on Learning Representations (ICLR), 2017

work page 2017

[17] [17]

Efﬁcient and robust automated machine learning,

M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum, and F. Hutter, “Efﬁcient and robust automated machine learning,” in Advances in Neural Information Processing Systems , pp. 2962–2970, 2015

work page 2015

[18] [18]

Feature selection as a one-player game,

R. Gaudel and M. Sebag, “Feature selection as a one-player game,” in International Conference on Machine Learning , pp. 359–366, 2010

work page 2010

[19] [19]

Explorekit: Automatic feature generation and selection,

G. Katz, E. C. R. Shin, and D. Song, “Explorekit: Automatic feature generation and selection,” in Data Mining (ICDM), 2016 IEEE 16th International Conference on , pp. 979–984, IEEE, 2016

work page 2016

[20] [20]

Learning feature engineering for classiﬁcation,

F. Nargesian, H. Samulowitz, U. Khurana, E. B. Khalil, and D. Turaga, “Learning feature engineering for classiﬁcation,” in Proceedings of the Twenty-Sixth International Joint Conference on Artiﬁcial Intelligence, IJCAI, vol. 17, pp. 2529–2535, 2017

work page 2017

[21] [21]

Autolearnautomated feature generation and selection,

A. Kaul, S. Maheshwary, and V . Pudi, “Autolearnautomated feature generation and selection,” in Data Mining (ICDM), 2017 IEEE Inter- national Conference on , pp. 217–226, IEEE, 2017

work page 2017

[22] [22]

Stability selection,

N. Meinshausen and P. B ¨uhlmann, “Stability selection,” Journal of the Royal Statistical Society: Series B (Statistical Methodology) , vol. 72, no. 4, pp. 417–473, 2010

work page 2010

[23] [23]

Meta-learning by landmarking various learning algorithms.,

B. Pfahringer, H. Bensusan, and C. G. Giraud-Carrier, “Meta-learning by landmarking various learning algorithms.,” in ICML, pp. 743–750, 2000

work page 2000

[24] [24]

Learning curve prediction with bayesian neural networks,

A. Klein, S. Falkner, J. T. Springenberg, and F. Hutter, “Learning curve prediction with bayesian neural networks,” 2016

work page 2016

[25] [25]

Neural networks for predicting algorithm runtime distributions.,

K. Eggensperger, M. Lindauer, and F. Hutter, “Neural networks for predicting algorithm runtime distributions.,” in IJCAI, pp. 1442–1448, 2018

work page 2018

[26] P. B. Brazdil and C. Soares, “A comparison of ranking methods for classification algorithm selection,” in European Conference on Machine Learning, pp. 63–75, Springer, 2000.

[27] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[28] Y. Chen, M. W. Hoffman, S. G. Colmenarejo, M. Denil, T. P. Lillicrap, M. Botvinick, and N. de Freitas, “Learning to learn without gradient descent by gradient descent,” in Proceedings of the 34th International Conference on Machine Learning - Volume 70, pp. 748–756, JMLR.org, 2017.

[29] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[30] T. Elsken, J.-H. Metzen, and F. Hutter, “Simple and efficient architecture search for convolutional neural networks,” in Proceedings of International Conference on Learning Representations (ICLR), 2018.

[31] E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. V. Le, and A. Kurakin, “Large-scale evolution of image classifiers,” in Proceedings of the 34th International Conference on Machine Learning (D. Precup and Y. W. Teh, eds.), vol. 70 of Proceedings of Machine Learning Research, (International Convention Centre, Sydney, Australia), pp. 29...

[32] Y. He, J. Lin, Z. Liu, H. Wang, L.-J. Li, and S. Han, “AMC: AutoML for model compression and acceleration on mobile devices,” in Proceedings of the European Conference on Computer Vision (ECCV), pp. 784–800, 2018.

[33] I. Guyon, L. Sun-Hosoya, M. Boullé, H. Escalante, S. Escalera, Z. Liu, D. Jajetic, B. Ray, M. Saeed, M. Sebag, et al., “Analysis of the AutoML challenge series 2015-2018,” 2017.

[34] E. Brochu, V. M. Cora, and N. De Freitas, “A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning,” arXiv preprint arXiv:1012.2599, 2010.

[35] F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Sequential model-based optimization for general algorithm configuration,” in International Conference on Learning and Intelligent Optimization, pp. 507–523, Springer, 2011.

[36] M. Feurer, J. T. Springenberg, and F. Hutter, “Using meta-learning to initialize Bayesian optimization of hyperparameters,” in Proceedings of the 2014 International Conference on Meta-learning and Algorithm Selection - Volume 1201, pp. 3–10, Citeseer, 2014.

[37] K. Jamieson and A. Talwalkar, “Non-stochastic best arm identification and hyperparameter optimization,” in Artificial Intelligence and Statistics, pp. 240–248, 2016.

[38] M. Jaderberg, V. Dalibard, S. Osindero, W. M. Czarnecki, J. Donahue, A. Razavi, O. Vinyals, T. Green, I. Dunning, K. Simonyan, et al., “Population based training of neural networks,” arXiv preprint arXiv:1711.09846, 2017.

[39] D. Maclaurin, D. Duvenaud, and R. Adams, “Gradient-based hyperparameter optimization through reversible learning,” in International Conference on Machine Learning, pp. 2113–2122, 2015.

[40] W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone, Genetic programming: an introduction, vol. 1. Morgan Kaufmann San Francisco, 1998.

[41] B. Schölkopf, “The kernel trick for distances,” in Advances in Neural Information Processing Systems, pp. 301–307, 2001.

[42] T. Swearingen, W. Drevo, B. Cyphers, A. Cuesta-Infante, A. Ross, and K. Veeramachaneni, “ATM: A distributed, collaborative, scalable system for automated machine learning,” in IEEE International Conference on Big Data, 2017.

[43] M. Feurer, K. Eggensperger, S. Falkner, M. Lindauer, and F. Hutter, “Practical automated machine learning for the AutoML challenge 2018,” in International Workshop on Automatic Machine Learning at ICML, 2018.

[44] T. Stadelmann, M. Amirian, I. Arabaci, M. Arnold, G. F. Duivesteijn, I. Elezi, M. Geiger, S. Lörwald, B. B. Meier, K. Rombach, et al., “Deep learning in the wild,” in IAPR Workshop on Artificial Neural Networks in Pattern Recognition, pp. 17–38, Springer, 2018.

[45] R. S. Olson, R. J. Urbanowicz, P. C. Andrews, N. A. Lavender, J. H. Moore, et al., “Automating biomedical data science through tree-based pipeline optimization,” in European Conference on the Applications of Evolutionary Computation, pp. 123–137, Springer, 2016.

[46] J. Vanschoren, J. N. van Rijn, B. Bischl, and L. Torgo, “OpenML: Networked science in machine learning,” SIGKDD Explorations, vol. 15, no. 2, pp. 49–60, 2013.

[47] K. Li and J. Malik, “Learning to optimize,” arXiv preprint arXiv:1606.01885, 2016.