
Abstract: Scikit-learn is a machine learning library for Python that has become a valuable tool for many data science practitioners. This training will cover some advanced topics in using scikit-learn, such as how to perform out-of-core learning and how to speed up parameter search. We'll also cover how to build your own models or feature extraction methods that are compatible with scikit-learn, which is important in the many domains that rely on specialized feature extraction. Finally, we will see how to customize scikit-learn even further, using custom methods for cross-validation or model evaluation.
This workshop assumes familiarity with Jupyter notebooks and the basics of pandas, matplotlib, and NumPy. It also assumes experience using scikit-learn and familiarity with its API.
Bio: Andreas Mueller received his MS degree in Mathematics (Dipl.-Math.) in 2008 from the Department of Mathematics at the University of Bonn. In 2013, he completed his PhD thesis at the Institute for Computer Science at the University of Bonn. After working as a machine learning scientist at the Amazon Development Center Germany in Berlin for a year, he joined the Center for Data Science at New York University at the end of 2014. In his current position as assistant research engineer at the Center for Data Science, he works on open source tools for machine learning and data science. He has been one of the core contributors to scikit-learn, a machine learning toolkit widely used in industry and academia, for several years, and has authored and contributed to a number of open source projects related to machine learning.

Andreas Mueller, PhD
Title: Author, Research Scientist, Core Contributor of scikit-learn | Columbia Data Science Institute
