Foundations of streaming SQL: stream & table theory

Thursday, June 21
11:30 AM - 12:10 PM
Meeting Room 230A

What does it mean to execute streaming queries in SQL? What is the relationship of streaming queries to classic relational queries? Are streams and tables the same thing? And how can all of this work in a programmatic framework like Apache Beam? The presentation answers these questions and more as it walks you through key concepts underpinning data processing in general.

The presentation explores the relationship between the Beam model (as described in the paper “The Dataflow Model” and the “Streaming 101” and “Streaming 102” blog posts) and stream and table theory (as popularized by Martin Kleppmann and Jay Kreps, among others).

It turns out that stream and table theory does an illuminating job of describing the low-level concepts that underlie the Beam model.

The presentation explains what is required to provide robust stream processing support in SQL and discusses the concrete efforts that have been made in this area by the Apache Beam, Calcite, and Flink communities, as well as new ideas yet to come. You’ll leave with a much better understanding of the key concepts underpinning data processing—regardless of whether that data processing is batch or streaming or SQL or programmatic—as well as a concrete notion of what robust stream processing in SQL looks like.


Anton Kedin
Software Engineer
Anton Kedin is a software engineer at Google in Seattle, where he works on the Cloud Dataflow SDK, focusing on streaming SQL support for Apache Beam.