LLAP: great on premises and great in the cloud

Thursday, April 19
2:00 PM - 2:40 PM
Europe

LLAP (low latency analytical processing) is a major architectural advancement for reducing Hive query latency, and it has delivered significant improvements in self-managed Hadoop cluster deployments. Did you know that the benefits of LLAP also carry over to cloud-provider-hosted IaaS offerings?

This talk presents a performance study of several common query patterns executing in Amazon Elastic MapReduce (EMR). We ran multiple trials of the same query patterns in Hive (without LLAP), Spark, and Hive (with LLAP). For our usage patterns, Hive with LLAP provided the best overall query performance. As a result, we needed fewer EC2 instances in each EMR cluster to hit our performance goals, which gave us a bottom-line improvement in AWS billing. Attendees will learn how to configure LLAP in EMR clusters and see the full results of the performance analysis. We will also share observations about scalability trends in relation to EMR cluster sizing, as well as potential benchmarking pitfalls that may apply to your own performance analysis efforts.

SPEAKERS

Chris Nauroth
Senior Staff Software Architect
The Walt Disney Company
Chris Nauroth is an Apache Software Foundation member who has worked as a committer and PMC member for Hadoop, ZooKeeper, and Yetus. He is a software architect at Disney working on shared services for large-scale consumer messaging and content management. His responsibilities include evaluation and technology selection for big data solutions.