Apache Hadoop YARN 3.x in Alibaba

Thursday, June 21
10:20 AM - 11:00 AM
Meeting Room 211A/B/C/D

Alibaba has built its data infrastructure on Apache Hadoop YARN since 2013, and it now manages more than 10k nodes. At Alibaba, Hadoop YARN serves various systems such as search, advertising, and recommendation. It runs not just batch jobs but also streaming, machine learning, OLAP, and even online services that directly impact Alibaba’s user experience. To extend YARN’s ability to support such complex scenarios, we have developed and leveraged many YARN 3.x improvements. In this talk, you will learn what these improvements are and how they helped solve difficult problems in large production clusters.

These improvements include:
1. Highly improved performance with Capacity Scheduler’s async scheduling framework
2. Better placement decisions with node attributes and placement constraints
3. Better resource utilization with opportunistic containers
4. More balanced resource utilization with a load balancer
5. Scheduling and isolation of generic resource types to manage new resources such as GPU and FPGA

In the presentation, we will also describe how we built the entire ecosystem on top of YARN and how we keep evolving YARN’s capabilities to tackle the challenges brought by Alibaba’s continuously growing data and business.

SPEAKERS

Weiwei Yang
Staff Software Engineer
Hortonworks
Weiwei is a Staff Engineer at Hortonworks and an Apache Hadoop committer and PMC member. He has been working on Hadoop for over 8 years and has contributed to both HDFS and YARN. His work mainly includes storage features such as the Ozone metadata store and garbage collection, and scheduling features such as YARN placement constraints, async scheduling, and CSI adoption. Before Hortonworks, he worked in Alibaba’s data infrastructure team, where he gained experience evolving a big data platform at a scale of 10k+ nodes. Prior to that, he worked at IBM for several years as one of the early members of the BigInsights project.
Ren Chunde
Engineering Manager
Alibaba
Chunde Ren is an engineering manager leading the development of Apache YARN at Alibaba.