Data Science and Machine Learning At Scale


In this tutorial, you'll learn everything you wanted to know about scaling your data science work to larger datasets and larger models, while staying in the comfort of the PyData ecosystem (numpy, pandas, scikit-learn, Jupyter notebooks).

Session Outline
* How to reason about when you need to scale your data and machine learning work and when not to;
* How to leverage distributed computation on your local workstation (such as your laptop) to analyze larger datasets and build larger, more complex models;
* How to harness the power of clusters to support larger-than-memory computation, all from the comfort of your own laptop;
* How to do all of this while writing code similar to the numpy, pandas, and/or scikit-learn code you already write.


Pavithra is a software engineer and technical writer with over a year of experience in FOSS. She enjoys working at the intersection of technology and education, especially on outreach.

Open Data Science
One Broadway
Cambridge, MA 02142
