Deep learning on YARN: running distributed TensorFlow, MXNet, Caffe, and XGBoost on Hadoop clusters

Wednesday, April 18
2:00 PM - 2:40 PM
Room II

Deep learning is useful for enterprise tasks such as speech recognition, image classification, AI chatbots, and machine translation, to name a few.

To train deep learning/machine learning models, frameworks such as TensorFlow, MXNet, Caffe, and XGBoost can be used, sometimes together to solve different problems.

To make distributed deep learning/machine learning applications easy to launch, manage, and monitor, we introduced YARN native services in Apache Hadoop 3.x, along with other improvements such as first-class GPU support, container DNS support, and scheduling improvements. These improvements make running distributed deep learning/machine learning applications on YARN as simple as running them locally, which lets machine learning engineers focus on algorithms instead of worrying about the underlying infrastructure. With these improvements, YARN can also better manage a shared cluster that runs deep learning/machine learning workloads alongside other services and ETL jobs.

In this session, we will take a closer look at these improvements and show, with demos, how to run these applications on YARN. Attendees can start trying these applications on YARN after the talk.

Presentation Video


Wangda Tan
Staff Software Engineer
Wangda Tan is a Project Management Committee (PMC) member of Apache Hadoop and a Staff Software Engineer at Hortonworks. His main areas of work are Hadoop YARN GPU isolation and the resource scheduler, and he has participated in features such as node labeling, resource preemption, and container resizing. Before joining Hortonworks, he worked at Pivotal on integrating OpenMPI/GraphLab with Hadoop YARN. Before that, he worked at Alibaba Cloud Computing, where he participated in creating a large-scale machine learning, matrix, and statistics computation platform using MapReduce and MPI.
Sunil Govindan
Staff Engineer
Sunil Govindan has been contributing to the Apache Hadoop project since 2013 in various roles: Hadoop contributor, Hadoop committer, and member of the Project Management Committee (PMC). He works as a Staff Software Engineer on the YARN team at Hortonworks. His main contributions are YARN scheduling improvements such as intra-queue resource preemption, multiple resource types support in YARN with resource profiles, and absolute resource configuration support in queues. He also drove efforts with the community to improve the YARN UI for a better user experience. Before Hortonworks, he worked at Juniper on a custom resource scheduler. Prior to that, he was at Huawei, where he worked on platform and middleware distributed systems, including the Hadoop platform. He loves reading books, is an ardent music lover, and is passionate about go-green efforts.