A Birds of a Feather (BoF) session is an informal discussion group. DataWorks will sponsor several BoF meeting groups, hosted by Apache Committers, architects, tech leads, and engineers. Attendees group together based on a shared interest and carry out discussions without any pre-planned agenda. Hosts will moderate each group's discussion.
Come join the discussion and share your experiences, challenges, future interests, and requirements on key Apache and other open source projects, and discuss what’s on the roadmap and future design options.
Date: Friday, November 9th
Room: Check agenda or check the DataWorks Summit Mobile App
Come learn and discuss the latest innovations and future direction in Apache Spark, Apache Zeppelin, and other ecosystem tools for Data Engineering and Data Science.
Hosts: Robert Hryniewicz
Apache Hadoop keeps evolving to meet community demands around distributed computing and storage. Apache Hadoop has just released 3.1.x, quickly followed by 3.2.0, with key enhancements to YARN and HDFS.
Apache Hadoop YARN is the architectural center of Hadoop that allows multiple data processing engines to handle data stored in a single platform, unlocking an entirely new approach to analytics. Come learn and discuss the latest YARN innovations and future directions.
Apache Hadoop keeps evolving to meet community demands around distributed computing and storage. Apache Hadoop has just released 3.0, quickly followed by 3.1, with key enhancements to YARN and HDFS.
Apache Hive & Apache Druid
Apache Hive is the de facto standard for SQL queries in Hadoop. With the next phase of SQL in Hadoop, the Apache community has greatly improved Hive’s speed (LLAP), scale and SQL semantics. Come learn and discuss what is new in Hive 3.0.
Apache Druid is an open source, column-oriented, distributed data store designed for OLAP queries on event data. Druid supports interactive, horizontally scalable queries on real-time streams. Druid has rich client libraries and integrates with tools like Pivot and Apache Superset. Come learn about the latest developments in Druid and Hive/Druid integration.
Host(s): Alan Gates
Cloud & Operations
Apache Ambari and Cloudbreak provide the foundation for installing, configuring, and managing Hadoop and streaming platforms on-premises and in the cloud. Come learn about the latest innovations and discuss Hadoop and streaming platform operations and future directions.
Apache NiFi, Apache Kafka, Apache Storm, Apache Spark Streaming, and many other real-time data processing tools provide the foundation for data processing in IoT. Come learn and discuss the latest streaming and data flow innovations and future directions.