
Abstract: This session is designed for data practitioners who want to stay in control of, and confident in, their projects even after they are deployed to production.
We will explore two methods from the O'Reilly book "Fundamentals of Data Observability" that can be easily adopted to ensure the reliability of data pipelines throughout the whole process, from ingestion to analytics. The main outcome will be tips and tricks for automating these best practices and standardizing them across teams, people, and even organizations.
Session Outline:
In this talk, we will cover the following learning objectives:
- Identifying the characteristics that make a pipeline or any data application "data observable"
- Understanding how data practitioners can activate these characteristics with agents and collectors
- Recognizing the benefits for both data practitioners and final users
Background Knowledge:
Python, Scala, R, engineering best practices, data engineering/science/analytics experience
Bio: Andy Petrella is the CPO and founder of Kensu, a data observability solution that helps data teams trust what they deliver and create more value from data.
Andy is an entrepreneur with a background in data mining, data engineering, and data science. He is known in the data community as an early evangelist of Apache Spark and as the creator of the Spark Notebook.
Since 2015, Andy has been an O'Reilly instructor and author. His work includes the first O'Reilly book about data observability, "Fundamentals of Data Observability".