Apache Hadoop YARN: state of the union

Wednesday, April 18
11:00 AM - 11:40 AM

Apache Hadoop YARN is the modern distributed operating system for big data applications. It morphed the Hadoop compute layer into a common resource management platform that can host a wide variety of applications. Many organizations leverage YARN to build their applications on top of Hadoop without having to repeatedly solve resource management, isolation, and multi-tenancy issues themselves.

In this talk, we’ll start with the current status of Apache Hadoop YARN: how it is used today in deployments large and small. We’ll then move on to the exciting present and future of YARN: features that are further strengthening YARN as the first-class resource management platform for data centers running enterprise Hadoop.

We’ll discuss the current status and future promise of features and initiatives such as powerful container placement; global scheduling; machine learning and deep learning workloads through GPU and FPGA support; extreme scale with YARN federation; containerized apps on YARN; native support for long-running services (alongside applications) without any changes; seamless application upgrades; powerful scheduling features such as application priorities and intra-queue preemption across applications; and operational enhancements including insights through Timeline Service V2, a new web UI, and better queue management.

Wangda Tan
Engineering Manager, YARN
Wangda Tan is a Project Management Committee (PMC) member of Apache Hadoop and engineering manager of the YARN team at Hortonworks. His main areas of work are Hadoop YARN GPU isolation and the resource scheduler, and he has contributed to features such as node labeling, resource preemption, and container resizing. Before joining Hortonworks, he worked at Pivotal on integrating OpenMPI and GraphLab with Hadoop YARN. Before that, he worked at Alibaba Cloud Computing, where he helped create a large-scale machine learning, matrix, and statistics computation platform using MapReduce and MPI.
Billie Rinaldi
Principal Software Engineer I
Billie Rinaldi is a Principal Software Engineer I at Hortonworks, currently prototyping new features related to long-running services and containers in Apache Hadoop YARN. Prior to August 2012, Billie engaged in big data science and research at the National Security Agency, where she provided early leadership for Apache Accumulo. Billie is a member of the Apache Software Foundation and a committer for Apache Hadoop and a number of other Apache projects in the Hadoop ecosystem. She holds a Ph.D. in applied mathematics from Rensselaer Polytechnic Institute.