Engineering a Performant Machine Learning Pipeline: From Dask to Kubeflow

Abstract: 

The lifecycle of any machine learning model, classical or deep, consists of (a) pre-processing, transforming, and augmenting the data, (b) training the model with different hyper-parameter values and learning rates, and (c) computing results on new data or test sets. Whether you are using transfer learning or a from-scratch model, this process requires a large amount of computation, management of your experimental process, and quick perusal of the results of your experiments. In this workshop, we will learn how to combine off-the-shelf clustering software such as Kubernetes and Dask with learning systems such as TensorFlow, PyTorch, and scikit-learn, on cloud infrastructure such as AWS, Google Cloud, or Azure, to construct a machine-learning system for your data science team. We'll start with an understanding of Kubernetes, move on to analysis pipelines in scikit-learn and Dask, and finally arrive at Kubeflow. Participants should install minikube on their laptops (https://kubernetes.io/docs/tasks/tools/install-minikube/) and create a Google Cloud account.
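As a rough illustration of the three lifecycle stages named above, here is a minimal sketch using only the Python standard library. It is not taken from the workshop materials: the synthetic data, the function names (`make_data`, `fit_scaler`, `train`, `run_experiment`), and the learning-rate grid are all assumptions made for the example. In practice a library such as scikit-learn would fill each role, and a cluster would run the hyper-parameter trials in parallel.

```python
import random
import statistics

# Hypothetical hyper-parameter grid for the sketch.
LEARNING_RATES = [0.001, 0.01, 0.1]


def make_data(n, seed):
    """Synthetic 1-D regression data (invented for illustration): y = 3x + 2 + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


def fit_scaler(xs):
    """Stage (a): learn the pre-processing statistics from the training data only."""
    return statistics.fmean(xs), statistics.pstdev(xs)


def transform(xs, scaler):
    """Stage (a): standardize features with previously fitted statistics."""
    mean, std = scaler
    return [(x - mean) / std for x in xs]


def train(xs, ys, learning_rate, epochs=200):
    """Stage (b): fit y ~ w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(model, xs, ys):
    """Stage (c): mean squared error of a fitted model on a data set."""
    w, b = model
    return statistics.fmean((w * x + b - y) ** 2 for x, y in zip(xs, ys))


def run_experiment(seed=0):
    """Run all three stages and return (best learning rate, held-out test MSE)."""
    xs, ys = make_data(500, seed)
    train_x, val_x, test_x = xs[:300], xs[300:400], xs[400:]
    train_y, val_y, test_y = ys[:300], ys[300:400], ys[400:]

    # (a) Fit the scaler on the training split, then apply it to every split.
    scaler = fit_scaler(train_x)
    train_x = transform(train_x, scaler)
    val_x = transform(val_x, scaler)
    test_x = transform(test_x, scaler)

    # (b) One training run per hyper-parameter value; keep the best on validation.
    trials = {}
    for lr in LEARNING_RATES:
        model = train(train_x, train_y, lr)
        trials[lr] = (mse(model, val_x, val_y), model)
    best_lr = min(trials, key=lambda lr: trials[lr][0])

    # (c) Report the selected model's error on the untouched test split.
    return best_lr, mse(trials[best_lr][1], test_x, test_y)


if __name__ == "__main__":
    best_lr, test_mse = run_experiment()
    print(f"best learning rate: {best_lr}, test MSE: {test_mse:.3f}")
```

Each trial in stage (b) is independent of the others, which is why this kind of sweep maps naturally onto a cluster scheduler.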

Bio: 

Rahul Dave is a lecturer in Bayesian Statistics and Machine Learning at Harvard University, and consults on the same topics at LxPrior. He holds a Ph.D. in Computational Astrophysics from the University of Pennsylvania, and has programmed device drivers for telescopes, bespoke databases for astrophysical data, and machine learning systems in various fields. His new startup, univ.ai, provides corporate training and consulting to help students and companies upgrade the skills and understanding of both their developers and managers for this new AI-driven world.
