Abstract: In this tutorial, you'll learn how to scale your data science work to larger datasets and larger models while staying in the comfort of the PyData ecosystem (numpy, pandas, scikit-learn, Jupyter notebooks). The tutorial covers:
* How to reason about when you need to scale your data and machine learning work and when you don't;
* How to leverage distributed computation on your local workstation (such as your laptop) to analyze larger datasets and build larger, more complex models;
* How to harness the power of clusters to support larger-than-memory computation, all from the comfort of your own laptop;
* How to do all of this while writing code similar to the numpy, pandas, and/or scikit-learn code you already write.
Bio: Pavithra is a software engineer and technical writer with over a year of experience in FOSS. She enjoys working at the intersection of technology and education, especially on outreach.