Breathing New Life into Apache Oozie with Apache Ambari Workflow Manager

Wednesday, June 20
4:40 PM - 5:30 PM
Executive Ballroom 210A/E

Running scheduled, long-running, or repetitive workflows on Hadoop clusters, especially secure clusters, is the domain of Apache Oozie. Oozie, however, suffers from XML-based job configuration and a dated UI, which together make for poor usability. Apache Ambari, in its quest to make cluster management easier, has branched out into offering views for user services. This talk covers the Ambari Workflow Manager view, which provides a GUI to author and visualize Oozie jobs.

As an example of Workflow Manager in use, we will demonstrate Oozie jobs for log management and HBase compactions, showing how easy Oozie can now be and what the future holds for Oozie and Workflow Manager.

Apache Oozie is the long-time incumbent for workflow scheduling in big data processing. It is known to be hard to use, and its dated interface is not aesthetically pleasing. However, for secure Hadoop clusters, Oozie is the most readily available, obvious, and full-featured solution.

Apache Ambari is a deployment and configuration management tool used to deploy Hadoop clusters. Ambari Workflow Manager is a new Ambari view that helps address the usability and UI shortcomings of Apache Oozie.

In this talk, we’re going to leverage the stable foundation of Apache Oozie and the clarity of Workflow Manager to demonstrate how one can build powerful batch workflows on top of Apache Hadoop. We’re also going to cover the future roadmap and vision for both Apache Oozie and Workflow Manager. We will finish with a live demo of Workflow Manager in action.

Artem Ervits
Solutions Engineer
Artem Ervits is a Solutions Engineer at Hortonworks. Hortonworks is a leading big data software company based in Santa Clara, California. The company develops and supports Apache Hadoop for the distributed processing of large data sets across computer clusters. Artem is an organizer of the NYC Future of Data Meetup and a contributor to Apache Oozie. He works with the Workflow Manager and Oozie product management and engineering teams to shape the future direction of Workflow Manager and Oozie. You may reach him with questions on Oozie, HBase, Phoenix, Pig, and Hive.
Clay Baenziger
Hadoop Infrastructure
Clay Baenziger is an architect on the Hadoop Infrastructure Team at Bloomberg. Clay comes from a diverse background in systems infrastructure and analytics, ranging from operating systems engineering to financial portfolio analytics. He has been involved in the Hadoop ecosystem for nine years and gives numerous talks each year on Bloomberg's community contributions.