Build an ML Pipeline with Airflow and Kubernetes


Airflow is a leading open-source workflow orchestrator. Integrated with Kubernetes through the KubernetesPodOperator, it runs each pipeline task in its own pod, so every step can use its own container image and dependencies, which makes pipelines extremely customizable. This is what we use to preprocess data and train ML models for PowerOP, Dataswati's SaaS for optimizing food industry production lines.

Session Outline
Part 1: Installation
In this part, you will install MicroK8s on your computer to simulate a Kubernetes cluster. You will then use a Helm chart to deploy Airflow on MicroK8s and retrieve the code repository.

Part 2: Machine Learning
You will discover some simple data processing and machine learning code and how it can be deployed in an Airflow DAG using the KubernetesPodOperator.
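
As a rough illustration of what such a DAG can look like, here is a minimal sketch, not the workshop's actual code. The image name, namespace, DAG id, and the `pipeline.<step>` module layout are all hypothetical, and the import path assumes a recent version of the Kubernetes provider package (older releases expose the operator under a different module). The import guard is only there so the snippet can be read and run on a machine without Airflow installed.

```python
from datetime import datetime

# Hypothetical values, chosen only for illustration.
IMAGE = "my-registry/ml-pipeline:latest"
NAMESPACE = "airflow"


def pod_task_kwargs(step: str) -> dict:
    """Build the keyword arguments for one KubernetesPodOperator task."""
    return {
        "task_id": step,
        # Pod names cannot contain underscores, so swap them for dashes.
        "name": step.replace("_", "-"),
        "namespace": NAMESPACE,
        "image": IMAGE,
        # Assumes the image ships a `pipeline` package with one module per step.
        "cmds": ["python", "-m", f"pipeline.{step}"],
        "get_logs": True,
    }


try:
    from airflow import DAG
    from airflow.providers.cncf.kubernetes.operators.pod import (
        KubernetesPodOperator,
    )
except ImportError:
    # Airflow or its Kubernetes provider is not installed: skip the DAG.
    DAG = None

if DAG is not None:
    with DAG(
        dag_id="ml_pipeline",  # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule=None,  # manual triggers only; requires Airflow 2.4 or later
        catchup=False,
    ) as dag:
        preprocess = KubernetesPodOperator(**pod_task_kwargs("preprocess"))
        train = KubernetesPodOperator(**pod_task_kwargs("train"))
        # Train only after preprocessing has finished.
        preprocess >> train
```

Each task launches its own pod from the given image, so the preprocessing and training steps could just as well use different images with different dependencies.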

Part 3: Play with Airflow
You will learn how to trigger a DAG and pass it parameters, how to schedule it to run regularly, and how to navigate the Airflow UI.

Background Knowledge
Python, basic machine learning, software installation from the command line.


Luis spent his education and early career in transport (Civil Engineering diploma, MSc Transport from Imperial College London, French Civil Aviation Authority), then moved into data science and machine learning in 2014, with Kaggle and Coursera as teachers. He spent about three years as a Data Scientist at Quantmetry, a data science and AI consultancy based in Paris, where he helped several large companies across multiple sectors develop solutions involving natural language processing, geospatial data, and time series. He recently joined Dataswati to lead the Data Science team. Dataswati develops a SaaS called PowerOP that offers agro-industrial companies easy integration and visualization of their production data, as well as explainable and actionable AI services such as quality analysis, smart alerting, and recipe or settings recommendation.

Open Data Science




One Broadway
Cambridge, MA 02142
