Jupyter Notebooks for Teams: Best Practices for Quality, Reproducibility, and Collaboration

Abstract: 

Jupyter notebooks are a key tool for many data science teams. They allow for rapid prototyping, development, and sharing of results with both technical and non-technical audiences. As a data science team grows, both in headcount and in the volume of work it produces, Jupyter notebooks can become difficult to manage and keep clean. This talk describes several best practices for working with Jupyter notebooks on a data science team. It will cover:

- Writing and organizing code within a notebook for maximum reproducibility
- How to effectively manage notebooks with version control, view legible diffs, and perform code reviews
- Ways to implement quality checks via linting, pre-commit hooks, and integration tests
- Quick and simple ways to share content with non-technical audiences

We will showcase many of these best practices with notebooks that the data science team at Saturn Cloud uses every day.
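
As a rough illustration of the version-control point above (not taken from the talk itself), one common way to keep notebook diffs legible is to clear cell outputs and execution counts before committing. The sketch below assumes the notebook follows the nbformat 4 JSON layout; the function name `strip_outputs` and the sample notebook are hypothetical and exist only for this example.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 style notebook dict with the
    outputs and execution counts of every code cell cleared."""
    # Deep-copy via a JSON round trip so the caller's dict is untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical in-memory notebook used only to demonstrate the function.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "metadata": {},
                    "data": {"text/plain": ["2"]},
                }
            ],
        },
    ],
}

cleaned = strip_outputs(example)
```

A function like this could be called from a pre-commit hook so that only source changes appear in a diff; whether the team in the talk does it this way is not stated in the abstract.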

Bio: 

Aaron Richter is a software developer turned data engineer and data scientist. He has pioneered the development and implementation of large-scale data science infrastructure in both business and research environments. Inevitably, he spent a lot of time finding efficient ways to clean data, run pipelines, and tune models. Aaron is currently a Senior Data Scientist at Saturn Cloud, where he works to make data scientists faster and happier. He holds a PhD in machine learning from Florida Atlantic University.

Open Data Science