Interactive real time dashboards on data streams using Kafka, Druid, and Superset

Wednesday, April 18
2:50 PM - 3:30 PM
Room IV

A smooth experience with analytics dashboards depends on two key requirements: quick response times and fresh data. Organizations building fast, interactive BI dashboards over streaming data often struggle to select a suitable serving layer.

Cluster computing frameworks such as Hadoop and Spark work well for storing and processing large volumes of data, but they are not optimized for making that data available for queries in real time. Their long query latencies make them suboptimal choices for powering interactive dashboards and BI use cases.

This talk presents an open source, real-time data analytics stack built on Apache Kafka, Druid, and Superset. The stack combines Kafka's low-latency streaming and processing capabilities with Druid, which enables immediate exploration of the ingested data streams and answers queries over them with low latency. Superset provides visualization and dashboarding that integrates well with Druid. We will discuss why this architecture is well suited to interactive applications over streaming data, present an end-to-end demo of the complete stack, cover its key features, and review performance characteristics from real-world use cases.
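
As a rough illustration of how the pieces of such a stack connect (not material from the talk itself), the sketch below builds two JSON payloads in Python: a Druid supervisor spec that tells Druid to ingest from a Kafka topic, and a Druid SQL query of the kind a dashboard chart might issue. The field names follow the Kafka ingestion format documented for recent Druid releases and may differ in older versions. The topic, datasource, broker address, and column names are hypothetical. In a running cluster the spec would typically be POSTed to `/druid/indexer/v1/supervisor` and the query to `/druid/v2/sql`; the sketch only constructs the payloads and makes no network calls.

```python
import json


def kafka_supervisor_spec(topic, datasource, brokers, dimensions):
    """Build a Druid Kafka supervisor spec as a plain dict.

    Assumes JSON events that carry an ISO-8601 "timestamp" field
    (a hypothetical event layout chosen for illustration).
    """
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                # Pre-aggregate events at ingestion time and keep a row count.
                "metricsSpec": [{"type": "count", "name": "count"}],
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "MINUTE",
                    "rollup": True,
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": brokers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


def dashboard_query(datasource, dimension, limit=10):
    """Build a Druid SQL payload: top values of one dimension over the last hour."""
    sql = (
        f'SELECT "{dimension}", SUM("count") AS events '
        f'FROM "{datasource}" '
        "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR "
        f'GROUP BY "{dimension}" ORDER BY events DESC LIMIT {limit}'
    )
    return {"query": sql}


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the payloads.
    spec = kafka_supervisor_spec("clicks", "clicks_ds", "localhost:9092", ["country"])
    print(json.dumps(spec, indent=2))
    print(json.dumps(dashboard_query("clicks_ds", "country"), indent=2))
```

Superset would sit on top of the same datasource and generate queries like the second payload when a chart refreshes, which is why Druid's query latency directly shapes how responsive the dashboard feels.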

SPEAKERS

Nishant Bangarwa
Software engineer
Hortonworks
Nishant is a Druid PMC member and a Software Engineer at Hortonworks, where he is part of the Business Intelligence team. Before that he was part of the Metamarkets backend team, where he was responsible for analytics infrastructure, including real-time analytics in Druid. He holds a B.Tech in Computer Science from the National Institute of Technology, Kurukshetra, India.