Prepare Data Science/ML Pipelines with Ease and Speed, Following Best Practices

Abstract: 

Existing ETL and MLOps tools claim to solve orchestration problems, but none of them does it the right way. In this hands-on workshop, we’ll walk through a sample ML data pipeline that represents a typical data science use case: extracting data from multiple sources (a database and a data warehouse), transforming it, inspecting it, and cleaning it. Then we’ll make sure the data meets quality standards and start training the model. For each of these phases, we’ll discuss testing (unit and integration tests). As a pre-pipeline step, we’ll cover optional data preparation flows and strategies to accelerate the whole process: setting quality gates, testing data, and using some of the labeling services available today.
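The phases described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the workshop's material and not Ploomber's API: the table name `samples`, the function names, and the `min_rows` threshold are all hypothetical, an in-memory SQLite database stands in for the DB/DWH sources, and a least-squares line fit stands in for model training.

```python
import sqlite3
from statistics import mean


def make_demo_source():
    """Create an in-memory SQLite table standing in for a DB/DWH source (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (x REAL, y REAL)")
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?)",
        [(1, 3), (2, 5), (None, 7), (3, 7), (4, 9)],  # one row has a missing x
    )
    return conn


def extract(conn):
    """Extract: pull the raw rows from the source table."""
    return conn.execute("SELECT x, y FROM samples").fetchall()


def clean(rows):
    """Clean: drop rows that contain missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def quality_gate(rows, min_rows=3):
    """Quality gate: fail fast when the data does not meet a minimum standard."""
    if len(rows) < min_rows:
        raise ValueError(f"expected at least {min_rows} rows, got {len(rows)}")
    return rows


def train(rows):
    """Train: fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    variance = sum((x - x_bar) ** 2 for x in xs)
    if variance == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in rows) / variance
    return slope, y_bar - slope * x_bar


def run_pipeline(conn):
    """Chain the phases: extract, clean, quality gate, train."""
    return train(quality_gate(clean(extract(conn))))


if __name__ == "__main__":
    slope, intercept = run_pipeline(make_demo_source())
    print(f"slope={slope}, intercept={intercept}")
```

Keeping each phase as a small function with explicit inputs and outputs is what makes the unit tests mentioned above straightforward: `clean` and `quality_gate` can be tested on hand-written rows, while `run_pipeline` against the demo source serves as a small integration test.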

Session Outline

The learning outcomes are:
1. You’ll be able to build a pipeline in 45 minutes.
2. You’ll know what steps you need to incorporate so your pipeline follows best practices.
3. You’ll understand the background for creating Ploomber and when to use it.
4. You’ll have a basic understanding of common pitfalls when working on data workflows.

Bio: 

Ido Michael co-founded Ploomber to help data scientists build faster. He previously worked at AWS, leading data engineering and data science teams. During those customer engagements, he built hundreds of data pipelines, both on his own and together with his team. He came to NY for his MS at Columbia University. He focused on building Ploomber after repeatedly finding that projects dedicated about 30% of their time just to refactoring the dev work (prototype) into a production pipeline.

Open Data Science
One Broadway
Cambridge, MA 02142
info@odsc.com
