Apache Hadoop YARN 3.x in Alibaba

Thursday, June 21
10:20 AM - 11:00 AM
Meeting Room 211A/B/C/D

Alibaba has built its data infrastructure on Apache Hadoop YARN since 2013, and YARN now manages more than 10k nodes. At Alibaba, Hadoop YARN serves various systems such as search, advertising, and recommendation. It runs not just batch jobs but also streaming, machine learning, OLAP, and even online services that directly impact Alibaba’s user experience. To extend YARN’s ability to support such complex scenarios, we have developed and leveraged many YARN 3.x improvements. In this talk, you will learn what these improvements are and how they helped solve difficult problems in large production clusters.

These improvements include:
1. Greatly improved performance with Capacity Scheduler’s async scheduling framework
2. Better placement decisions with node attributes and placement constraints
3. Better resource utilization with opportunistic containers
4. Balanced resource utilization with a newly introduced load balancer
5. Generic resource type scheduling and isolation to manage new resources such as GPU and FPGA

In the presentation, we will also describe how we built the entire ecosystem on top of YARN and how we keep evolving YARN’s capabilities to tackle the challenges brought by Alibaba’s continuously growing data and business.

SPEAKERS

Weiwei Yang
Staff Software Engineer
Alibaba
Weiwei Yang is a staff software engineer at Alibaba, where he focuses on evolving the data infrastructure to serve large-scale data processing. He has worked on the Hadoop ecosystem since 2010. He is passionate about open source contributions and is an active Hadoop committer. He works on both the HDFS and YARN projects, contributing various improvements to make them a better fit for internet-scale use cases. Before that, he worked at IBM for more than 6 years and was one of the startup members of the BigInsights product. He received his master's degree from Peking University and his bachelor's degree from Wuhan University, China.
Ren Chunde
Engineering Manager
Alibaba
Chunde Ren is an engineering manager leading the development of Apache YARN at Alibaba.