Abstract: Developing well-performing machine learning pipelines requires a lot of expertise, time and manual tuning. AutoML automates the development process to make machine learning more accessible and efficient. In this workshop we will cover how to move from manually constructing and tuning machine learning pipelines to using efficient hyperparameter optimization algorithms and full AutoML using the popular Auto-sklearn library. After this tutorial you will be able to use Auto-sklearn to build machine learning pipelines for tabular datasets and analyze them.
Part 1: Why is finding the right pipeline hard and how to search efficiently?
You will learn about the vast design space of machine learning pipelines, including seemingly subtle design choices that have a huge impact. As a first step, we will use simple hyperparameter optimization methods and show when and how they break down as problems become more complicated.
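As a rough illustration of how simple methods break down, the sketch below compares grid search and random search on a toy problem. It assumes a hypothetical `validation_error` function in which only the first hyperparameter matters; this objective, the grid resolution, and the function names are illustrative choices and not part of the tutorial material.

```python
import itertools
import random


def validation_error(config):
    """Hypothetical stand-in for training a pipeline and measuring its error.

    Assumption for illustration: only the first hyperparameter matters.
    """
    return (config[0] - 0.37) ** 2


def grid_search(n_dims, points_per_axis):
    # Evenly spaced values in [0, 1] on every axis, then all combinations.
    axis = [i / (points_per_axis - 1) for i in range(points_per_axis)]
    configs = list(itertools.product(axis, repeat=n_dims))
    return min(configs, key=validation_error), len(configs)


def random_search(n_dims, budget, seed=0):
    # Draw every hyperparameter uniformly at random for each configuration.
    rng = random.Random(seed)
    configs = [tuple(rng.random() for _ in range(n_dims)) for _ in range(budget)]
    return min(configs, key=validation_error), len(configs)


# The grid grows exponentially with the number of hyperparameters.
for n_dims in (1, 2, 4):
    _, n_evaluations = grid_search(n_dims, points_per_axis=5)
    print(f"{n_dims} hyperparameters -> {n_evaluations} grid evaluations")

# Same budget for both methods in four dimensions.
grid_best, budget = grid_search(4, points_per_axis=5)
random_best, _ = random_search(4, budget=budget)
print("grid error:  ", validation_error(grid_best))
print("random error:", validation_error(random_best))
```

With the same number of evaluations, the grid only ever tries five distinct values of the hyperparameter that matters, while random search tries a new value in every evaluation.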
Part 2: AutoML methods
Next, we will discuss advanced hyperparameter optimization methods such as Bayesian optimization and Hyperband and use them to optimize larger numbers of hyperparameters for machine learning models with longer runtimes. Based on this knowledge we will introduce Auto-sklearn, one of the leading open source AutoML libraries, and leverage it to simplify the machine learning workflow.
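To give a flavor of how such methods save time on models with long runtimes, here is a minimal sketch of successive halving, the subroutine that Hyperband runs repeatedly with different trade-offs between the number of configurations and the budget per configuration. The `evaluate` function, its `final_loss` and `slowness` fields, and the choice of `eta=3` are hypothetical and chosen for illustration only; this is not the Auto-sklearn API.

```python
import random


def evaluate(config, budget):
    """Hypothetical stand-in: validation loss after training for `budget` epochs."""
    return config["final_loss"] + config["slowness"] / budget


def successive_halving(configs, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations and multiply the budget by eta."""
    budget = min_budget
    survivors = list(configs)
    total_cost = 0  # total number of epochs spent across all evaluations
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        total_cost += budget * len(survivors)
        survivors = ranked[: max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0], total_cost


rng = random.Random(0)
candidates = [
    {"final_loss": rng.random(), "slowness": rng.random()} for _ in range(27)
]
best, cost = successive_halving(candidates)
print("selected configuration:", best)
print("epochs spent:", cost)
print("epochs if every candidate ran for 27 epochs:", 27 * len(candidates))
```

Most of the budget goes to configurations that still look promising after a cheap evaluation; the price is that a configuration which starts slowly but ends well may be discarded early.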
Part 3: Advanced use cases of Auto-sklearn
In the last session, we will demonstrate advanced use cases of Auto-sklearn, such as inspecting the final model, obtaining model-independent feature importance measures, restricting Auto-sklearn to interpretable models, and extending Auto-sklearn with additional components from third-party libraries.
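One common model-independent importance measure is permutation importance: shuffle one feature column and record how much the prediction error grows. The sketch below is a plain-Python illustration under the assumption of a hypothetical fitted `model` that uses only the first feature; the data and function names are invented for this example and may differ from what the session uses.

```python
import random


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in squared error when each feature column is shuffled."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only feature j replaced by shuffled values.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [3.0 * row[0] for row in X]


def model(row):
    # Hypothetical fitted model that relies on feature 0 and ignores feature 1.
    return 3.0 * row[0]


scores = permutation_importance(model, X, y)
print("importance of feature 0:", scores[0])
print("importance of feature 1:", scores[1])
```

Because the procedure only needs predictions, it works for any pipeline, regardless of which model was selected.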
Prerequisites: Basic knowledge of machine learning and model selection strategies. To follow the code examples, knowledge of Python and scikit-learn is recommended. This tutorial is a practical complement to the talk by Prof. Dr. Frank Hutter in the main track; attending that talk is a plus.
Bio: Katharina Eggensperger is a doctoral candidate at the Machine Learning Lab at the University of Freiburg, Germany. Her research focuses on empirical performance modeling, automated machine learning and hyperparameter optimization. She was an invited speaker at the BayesOpt workshop at NeurIPS 2016 and co-organized the AutoML workshop in 2019 and 2020. Furthermore, she was part of the winning team of the first and second AutoML challenges and the BBO challenge at NeurIPS 2020.