Scaling Machine Learning with Dask


In this talk, attendees will get an introduction to Dask, a distributed computing framework in the PyData ecosystem.

The first half of the talk will describe the current state of the project and its ecosystem, including distributed data collections, cloud deployment options, distributed machine learning projects, and workflow orchestration.

The second half of the talk will be a live demo showing the programming model for machine learning on Dask. Dask's potential for speeding up machine learning workflows will be demonstrated with an intermediate-level tutorial on training XGBoost and LightGBM models with Dask.


James Lamb is a software engineer at Saturn Cloud, where he works on a managed data science platform built on Dask and Kubernetes. Before Saturn Cloud, James worked on industrial internet of things (IIoT) problems as a data scientist at AWS and Chicago-based Uptake. He is a core maintainer of LightGBM and has contributed to other open source data science and data engineering projects such as XGBoost and Prefect. James holds master's degrees in Applied Economics (Marquette University) and Data Science (University of California, Berkeley).

Open Data Science




