Graphene – Microsoft SCOPE on Tez

Tuesday, June 19
2:00 PM - 2:40 PM
Executive Ballroom 210A/E

Microsoft has embraced OSS by placing a big bet on Apache YARN to govern the resources of our computing clusters, working with the community to add many new capabilities to YARN. We now plan to undertake a similar journey and build the next generation of our job execution engine on top of Apache Tez. We will be building a common platform for executing batch, interactive, ML, and streaming queries at exabyte scale for Cosmos, Microsoft's big data system. This requires us to push the limits of the Tez API: supporting new graph models, changing an executing DAG by dynamically adding new vertices, scheduling for interactive and streaming workloads, squeezing all the computing power out of the cluster by integrating Tez with opportunistic containers in YARN, and scaling a DAG across tens of thousands of machines. We have started on this journey and want to share our progress and lessons learned, seek help from the community in adding these new capabilities, and push Apache Tez to new levels.

SPEAKERS

Hitesh Sharma
Principal Software Engineering Manager
Microsoft
Engineering manager in the Big Data team at Microsoft.
Anupam
Senior Software Engineer
Microsoft
Software engineer in the Big Data team at Microsoft.