Building an Open Source Streaming Analytics Solution with Kafka and Druid

Abstract: 

The maturation of open source technologies has made it easier than ever for companies to derive insights from vast quantities of data. In this talk, we will cover how data analytics stacks have evolved from data warehouses, to data lakes, and to the more modern streaming analytics stack. We will also discuss building such a stack using Apache Kafka and Apache Druid.

Analytics pipelines running purely on Hadoop can suffer from hours of data lag. Initial attempts to solve this problem often lead to inflexible solutions, where the queries must be known ahead of time, or fragile solutions where the integrity of the data cannot be assured. Combining Hadoop with Kafka and Druid can guarantee system availability, maintain data integrity, and support fast and flexible queries.

In the described system, Kafka provides a fast message bus and is the delivery point for machine-generated event streams. Kafka Streams can be used to manipulate the data before it is loaded into Druid. Druid provides flexible, highly available, low-latency queries.
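As a rough illustration of the flow described above, the following Python sketch shows the two hand-offs: a cleaning step standing in for the stream-processing stage, and a Kafka ingestion supervisor spec of the kind Druid accepts. It is a minimal sketch under stated assumptions, not the speakers' implementation. The topic name, datasource name, event fields, and both function names are hypothetical, and the cleaning step is plain Python rather than an actual Kafka Streams job.

```python
import json
from typing import Optional


def clean_event(raw: dict) -> Optional[dict]:
    """Stand-in for the stream-processing step (hypothetical event fields).

    Drops events missing required fields and normalizes the rest.
    """
    if "timestamp" not in raw or "page" not in raw:
        return None  # malformed event: filter it out before it reaches Druid
    return {
        "timestamp": raw["timestamp"],
        "page": str(raw["page"]).lower(),
        "country": raw.get("country", "unknown"),
        "latency_ms": int(raw.get("latency_ms", 0)),
    }


def kafka_supervisor_spec(datasource: str, topic: str, bootstrap_servers: str) -> dict:
    """Build a Druid Kafka supervisor spec as a plain dict.

    The datasource, topic, dimensions, and metrics are assumptions chosen
    to match clean_event() above.
    """
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": ["page", "country"]},
                "metricsSpec": [
                    {"type": "count", "name": "count"},
                    {"type": "longSum", "name": "latency_ms_sum", "fieldName": "latency_ms"},
                ],
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "MINUTE",
                    "rollup": True,
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


if __name__ == "__main__":
    event = clean_event({"timestamp": "2020-01-01T00:00:00Z", "page": "Home"})
    print(json.dumps(event))
    spec = kafka_supervisor_spec("clickstream", "clean-events", "localhost:9092")
    print(json.dumps(spec, indent=2))
```

In a running cluster, a spec like this would typically be submitted as JSON with an HTTP POST to the Overlord's `/druid/indexer/v1/supervisor` endpoint, after which Druid consumes the topic continuously and makes the events queryable.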

This talk is based on our real-world experience building such a stack for many use cases across many industries.

Bio: 

Fangjin is a co-author of the open source Druid project and a co-founder of Imply, a San Francisco-based technology company. Fangjin previously held senior engineering positions at Metamarkets and Cisco. He holds a BASc in Electrical Engineering and an MASc in Computer Engineering from the University of Waterloo, Canada.

Open Data Science