Building a Production-level Data Pipeline Using Kedro

Abstract: 

The workshop is aimed at data scientists and data engineers who are interested in building production-ready data pipelines.

We will go into:
- The challenges associated with creating ML models that are deployable
- Software engineering principles that should be applied to ML code to make it easier to deploy in the production environment
- How you can use an open-source Python library called [Kedro](https://github.com/quantumblacklabs/kedro) to enhance your exploratory data analysis workflow as well as its transition to production-ready code.

Kedro is an open-source development workflow framework that implements software engineering best practices for data pipelines, with an eye towards productionising machine learning models.

Session Outline
Module 1: The emergence of MLOps and production-level data and ML pipelines
- Learn about the trends driving interest in production-level data science code
- Get exposure to software engineering principles that data engineers and data scientists should consider applying to their code to make it easier to deploy into the production environment
- You will need a basic understanding of data science; this module is geared toward beginners

Module 2: Overview of Kedro
- Learn what Kedro is by going through its basic functionality: the project template, configuration, the data catalog and pipelines
- I'll show how it fits into the workflow for creating robust and reproducible data pipelines
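
To give a feel for the ideas in this module, here is a minimal sketch in plain Python. It does not use Kedro's API: the `Node` class, the `run_pipeline` function and the dataset names are hypothetical, written only to illustrate how a pipeline can be described as functions with named inputs and outputs that read from and write to a catalog of datasets.

```python
from typing import Any, Callable, Dict, List, NamedTuple


class Node(NamedTuple):
    """Hypothetical stand-in for a pipeline node (not Kedro's class)."""
    func: Callable[..., Any]
    inputs: List[str]   # names of datasets the function reads
    output: str         # name of the dataset the function writes


def run_pipeline(nodes: List[Node], catalog: Dict[str, Any]) -> Dict[str, Any]:
    """Run each node once its inputs exist, whatever order nodes are listed in."""
    data = dict(catalog)  # copy so the caller's catalog is not mutated
    pending = list(nodes)
    while pending:
        ready = [n for n in pending if all(name in data for name in n.inputs)]
        if not ready:
            missing = {name for n in pending for name in n.inputs if name not in data}
            raise ValueError(f"Cannot resolve inputs: {sorted(missing)}")
        for n in ready:
            data[n.output] = n.func(*[data[name] for name in n.inputs])
            pending.remove(n)
    return data


def drop_missing(rows):
    """Keep only rows that have a price."""
    return [r for r in rows if r["price"] is not None]


def mean_price(rows):
    """Average the price column."""
    return sum(r["price"] for r in rows) / len(rows)


# The "catalog" here is just a dict of in-memory datasets (an assumption
# made for this sketch; a real catalog would describe files or tables).
catalog = {"raw_rows": [{"price": 10.0}, {"price": None}, {"price": 30.0}]}

# Nodes are deliberately listed out of order; dependencies decide run order.
nodes = [
    Node(mean_price, ["clean_rows"], "average_price"),
    Node(drop_missing, ["raw_rows"], "clean_rows"),
]

result = run_pipeline(nodes, catalog)
print(result["average_price"])  # 20.0
```

Keeping each step a plain function with named inputs and outputs is what makes the steps easy to test in isolation and to reorder or reuse, which is the property this module focuses on.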

Module 3: Short demo of building a data pipeline with Kedro
- A short demo of how to create a new Kedro project, then build and visualize a data pipeline using an example dataset.

Background Knowledge
Basic knowledge of Python and some familiarity with Python data science tools (e.g. pandas, Jupyter Notebook) is recommended.

Bio: 

Kiyo is a software engineer at QuantumBlack, an advanced analytics firm operating at the intersection of strategy, technology and design to improve performance outcomes for organizations. Kiyo is one of the core contributors and maintainers of Kedro, a Python library that implements software engineering best practices for data and ML pipelines.

Kiyo holds an MSc in Computing Science from Imperial College London and an MA in Economics from the University of Edinburgh.

Open Data Science

 

 

 
