A Birds of a Feather (BoF) session is an informal discussion group. DataWorks will sponsor several BoF groups, hosted by Apache committers, architects, tech leads, and engineers. Attendees group together based on a shared interest and carry out discussions without any pre-planned agenda. Each group has hosts who moderate the discussion.

Come join the discussion and share your experiences, challenges, future interests, and requirements for key Apache and other open source projects, and discuss what’s on the roadmap and future design options.

Date: Wed, June 20th
Time: 5:40 PM
Location: San Jose Convention Center
Room: Check the agenda or the DataWorks Summit mobile app

Apache Spark, Apache Zeppelin & Data Science

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs to allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets. Come learn and discuss Spark, Apache Zeppelin, Data Science, Deep Learning innovations and future directions.


Hosts: Robert Hryniewicz, Sriram Srinivasan, Saumitra Buragohain

Location: Grand Ballroom 220A

Apache Hive & Apache Druid

Apache Hive is the de facto standard for SQL queries in Hadoop. With the next phase of SQL in Hadoop, the Apache community has greatly improved Hive’s speed (LLAP), scale, and SQL semantics. Come learn and discuss what is new in Hive 3.0.

Apache Druid is an open source, column-oriented, distributed data store designed for OLAP queries on event data. Druid supports horizontally scalable, interactive queries on real-time streams. Druid has rich client libraries and integrates with tools like Pivot and Apache Superset. Come learn about the latest developments in Druid and Hive/Druid integration.


Hosts: Alan Gates, Vihang Karajgaonkar, Jesus Camacho Rodriguez, Nishant Bangarwa

Location: Grand Ballroom 220B

Apache HBase & Apache Phoenix

Apache HBase is the NoSQL store that runs on Apache Hadoop. Apache Phoenix provides a SQL skin on top of HBase.

Come learn and discuss HBase 2.0 along with the latest developments in Phoenix 5.0.


Hosts: Josh Elser, Artem Ervits

Location: Grand Ballroom 220C

IoT, Streaming & Data Flow

Apache NiFi, Apache Kafka, Apache Storm, Apache Spark Streaming, and many more provide the foundation for real-time data processing in IoT. Come learn and discuss the latest streaming & data flow innovations and future directions.


Hosts: Aldrin Piri, Davor Bonaci, Venkatesh Ramanathan, Jeremy Dyer

Location: Meeting Room 230A

Apache Hadoop – YARN

Apache Hadoop keeps evolving to meet community demands around distributed computing and storage. Apache Hadoop 3.0 has just been released and was quickly followed by 3.1, with key enhancements to YARN and HDFS.

Apache Hadoop YARN is the architectural center of Hadoop that allows multiple data processing engines to handle data stored in a single platform, unlocking an entirely new approach to analytics. Come learn and discuss the latest YARN innovations and future directions.


Hosts: Vinod Kumar Vavilapalli, Wangda Tan

Location: Meeting Room 211A/B/C/D

Apache Hadoop – HDFS

Apache Hadoop keeps evolving to meet community demands around distributed computing and storage. Apache Hadoop 3.0 has just been released and was quickly followed by 3.1, with key enhancements to YARN and HDFS.

Apache Hadoop HDFS is a distributed Java-based file system for storing large volumes of data. Come learn and discuss the latest HDFS and Ozone innovations and future directions.


Hosts: Anu Engineer, Jitendra Pandey, Chris Douglas

Location: Executive Ballroom 210A/E

Cybersecurity and Apache Metron

Apache Metron is a new top-level Apache project that provides an open source big data cybersecurity analytics platform. It supports real-time ingest and analytics to discover information security threats and build out a high-value security data lake. Apache Metron helps security operations teams be more efficient by reducing the amount of “DIY” big data and data science tooling necessary to detect threats in real time.

Come learn and discuss the latest Metron innovations and future directions.


Hosts: Simon Elliston Ball, Casey Stella, Carolyn Duby

Location: Executive Ballroom 210D/H

Cloud & Operations

Apache Ambari and Cloudbreak provide the foundation for installing, configuring, and managing Hadoop and Streaming platforms on premises and in the cloud. Come learn about the latest innovations and discuss Hadoop & Streaming platform operations and future directions.


Hosts: Paul Codding, Jayush Luniya, Jeff Sposetti, David Lyle, Aaron Wiebe, Jon Dybik

Location: Executive Ballroom 210C/G

Security and Governance

Apache Knox and Apache Ranger provide Hadoop security while Apache Atlas provides a Hadoop metadata store and enterprise compliance. Come learn and discuss security & governance innovations and future directions.


Hosts: Srikanth Venkat, Don Bosco Durai

Location: Executive Ballroom 210B/F