A Birds of a Feather (BoF) session is an informal discussion group. DataWorks will sponsor several BoF meeting groups, hosted by Apache committers, architects, tech leads, and engineers. Attendees group together based on a shared interest and carry out discussions without any pre-planned agenda. Each group has hosts who moderate the discussion.
Come join the discussion, share your experiences, challenges, future interests, and requirements on key Apache and other open source projects, and discuss what’s on the roadmap and future design options.
Date: Wednesday, May 22
Room: Check agenda or check the DataWorks Summit Mobile App
Come learn and discuss the latest innovations and future direction in Apache Spark, Apache Zeppelin, and other ecosystem tools for Data Engineering and Data Science.
Apache Hadoop keeps evolving to meet community demands around distributed computing and storage. Apache Hadoop recently released 3.0, quickly followed by 3.1, with key enhancements to YARN and HDFS.
Apache Hadoop YARN is the architectural center of Hadoop that allows multiple data processing engines to handle data stored in a single platform, unlocking an entirely new approach to analytics. Come learn and discuss the latest YARN innovations and future directions.
Apache Phoenix enables OLTP and operational analytics in Hadoop for low-latency applications by combining the best of both worlds:
Apache Hive & Apache Druid
Apache Hive is the de facto standard for SQL queries in Hadoop. With the next phase of SQL in Hadoop, the Apache community has greatly improved Hive’s speed (LLAP), scale and SQL semantics. Come learn and discuss what is new in Hive 3.0.
Apache Druid is an open source, column-oriented, distributed data store designed for OLAP queries on event data. Druid supports interactive, horizontally scalable queries on real-time streams. Druid has rich client libraries and integrates with tools like Pivot and Apache Superset. Come learn about the latest developments in Druid and Hive/Druid integration.
Cloud & Operations
Apache Ambari and Cloudbreak provide the foundation for installing, configuring, and managing Hadoop and Streaming platforms on-premises and in the cloud. Come learn about the latest innovations and discuss Hadoop & Streaming platform operations and future directions.
Real-time data processing tools such as Apache NiFi, Apache Kafka, Apache Storm, Apache Spark Streaming, and many more provide the foundation for data processing in IoT. Come learn and discuss the latest streaming & data flow innovations and future directions.
Apache Knox and Apache Ranger provide security across the big data ecosystem, Apache Atlas provides an open source framework for metadata and enterprise governance, and Data Steward Studio provides an open source based stewardship experience for users. Come learn, discuss, and share your experience and insights on innovations in security & governance in the open source communities that can help in the age of regulations such as GDPR, CCPA, and various national privacy acts. The discussion will also cover how such approaches can address compliance, industry regulations, and standards across industries, both now and in the future.