
Abstract: This tutorial is aimed at data scientists and machine learning engineers who work on machine learning and deep learning models. Given a task, the goal is to find a model that performs well on it. This usually involves tuning the model, either by changing its hyperparameters or by modifying its architecture. Traditionally, this tuning was done by hand. With the advent of Automated Machine Learning (AutoML), much of this search can now be automated. In this tutorial, we will provide an overview of Hyperparameter Optimization (HPO) and Neural Architecture Search (NAS).
For HPO, we will introduce some common algorithms (Bayesian optimization, Hyperband) and tools (Ray, Hyperopt) that are used. For NAS, we will briefly touch upon some search algorithms and go into detail about evolutionary architecture search and building a surrogate model.
This tutorial also includes code snippets written in Python using the Ray and PyTorch frameworks. The code will be uploaded to GitHub.
Session Outline:
Part 1: Neural Architecture Search:
We begin with an overview of Neural Architecture Search (NAS). This includes a quick primer on search spaces and the various flavours of NAS algorithms. We will elaborate on the specifics of evolutionary NAS and describe how to accelerate NAS using a surrogate model and early stopping. We will then address how to add constraints, such as latency and model size, to the search.
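To make the idea of a constrained evolutionary search concrete, here is a minimal sketch in plain Python. It is not the tutorial's code: the search space, the `fitness` function, and the `model_size` proxy are all hypothetical stand-ins. In a real run, `fitness` would be a validation score obtained by training the candidate (or predicted by a surrogate model), and the constraint would come from a measured latency or parameter count.

```python
import random

# Hypothetical toy search space: each architecture is one choice per field.
SEARCH_SPACE = {
    "depth": [2, 4, 6, 8],
    "width": [16, 32, 64, 128],
    "kernel": [3, 5, 7],
}


def sample_architecture(rng):
    """Draw one architecture uniformly at random from the search space."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def mutate(arch, rng):
    """Return a copy of arch with one randomly chosen field resampled."""
    child = dict(arch)
    name = rng.choice(list(SEARCH_SPACE))
    child[name] = rng.choice(SEARCH_SPACE[name])
    return child


def model_size(arch):
    """Made-up proxy for parameter count; not a real measurement."""
    return arch["depth"] * arch["width"] * arch["kernel"] ** 2


def fitness(arch):
    """Made-up stand-in for validation accuracy (or a surrogate's prediction)."""
    return 0.5 * arch["depth"] + 0.05 * arch["width"] - 0.3 * abs(arch["kernel"] - 5)


def is_feasible(arch, max_size):
    """Constraint check: reject architectures above the size budget."""
    return model_size(arch) <= max_size


def evolutionary_search(max_size=10_000, population_size=10,
                        generations=50, tournament_size=3, seed=0):
    rng = random.Random(seed)

    # Initialise the population with feasible random architectures only.
    population = []
    while len(population) < population_size:
        arch = sample_architecture(rng)
        if is_feasible(arch, max_size):
            population.append(arch)

    for _ in range(generations):
        # Tournament selection: the best of a small random subset is the parent.
        parent = max(rng.sample(population, tournament_size), key=fitness)
        child = mutate(parent, rng)
        if not is_feasible(child, max_size):
            continue  # discard children that violate the constraint
        # The child replaces the weakest member only if it is better.
        weakest = min(range(len(population)), key=lambda i: fitness(population[i]))
        if fitness(child) > fitness(population[weakest]):
            population[weakest] = child

    return max(population, key=fitness)


best = evolutionary_search()
print(best, model_size(best))
```

Rejecting infeasible children is only one way to handle constraints; another common option is to fold the constraint into the fitness as a penalty term.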
Part 2: Hyperparameter Optimization:
In this section, we will provide an introduction to Hyperparameter Optimization (HPO), Bayesian Optimization and Hyperband. We will demonstrate how to design a search space and apply these algorithms for HPO using Ray to run on a distributed cluster.
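As a rough illustration of the idea behind Hyperband, the sketch below implements successive halving, the subroutine that Hyperband runs repeatedly with different trade-offs between the number of configurations and the starting budget. It is deliberately written in plain Python rather than Ray so that it stays self-contained. The search space, the `evaluate` function, and all parameter names are assumptions made for illustration; in practice `evaluate` would train a model for `budget` epochs and return its validation loss.

```python
import math
import random


def sample_config(rng):
    """Hypothetical search space: log-uniform learning rate, categorical batch size."""
    return {
        "lr": 10 ** rng.uniform(-4, -1),
        "batch_size": rng.choice([32, 64, 128]),
    }


def evaluate(config, budget):
    """Toy loss (lower is better): best near lr = 1e-2, improves with budget."""
    distance = abs(math.log10(config["lr"]) + 2)
    return distance + 1.0 / budget


def successive_halving(n_configs=27, min_budget=1, eta=3, seed=0):
    """Keep the top 1/eta of the configurations each round, with eta times the budget."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(n_configs)]
    budget = min_budget
    rounds = []  # (number of configurations evaluated, budget per configuration)

    while len(configs) > 1:
        rounds.append((len(configs), budget))
        ranked = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = ranked[: max(1, len(configs) // eta)]  # early-stop the rest
        budget *= eta

    return configs[0], rounds


best, rounds = successive_halving()
print(best)
print(rounds)
```

With the default arguments, the rounds evaluate 27, 9, and 3 configurations at budgets 1, 3, and 9, so most of the compute goes to the configurations that looked promising early.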
Background Knowledge:
Attendees are expected to know the basics of machine learning and deep learning.
Bio: Tejaswini Pedapati works at IBM Research. Her research is focused on interpretability and automating deep learning. To that end, she was involved in developing tools and algorithms to provide these capabilities for IBM products. She has a master’s degree from Columbia University.

Tejaswini Pedapati
Title: Research Engineer | IBM TJ Watson
