Ozone: scaling HDFS to trillions of objects

Wednesday, June 20
11:00 AM - 11:40 AM
Meeting Room 211A/B/C/D

Ozone is an object store for Hadoop. Ozone solves the small-file problem of HDFS, allowing users to store trillions of files in Ozone and access them as if they were on HDFS. Ozone plugs into existing Hadoop deployments seamlessly, and applications like Hive, LLAP, and Spark work without any modifications. This talk looks at the architecture, reliability, and performance of Ozone.

In this talk, we will also explore the Hadoop distributed storage layer, the block storage layer that makes this scaling possible, and how we plan to use it to scale HDFS itself.

We will demonstrate how to install an Ozone cluster; how to create volumes, buckets, and keys; and how to run Hive and Spark against HDFS and Ozone file systems using federation, so that users don't have to worry about where the data is stored. In other words, this talk includes a full user primer on Ozone.
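As a rough sketch of the kind of workflow the demo covers, the commands below show creating a volume, a bucket, and a key with the Ozone shell. The names are illustrative, and the exact CLI syntax varies by Ozone release; this follows the `ozone sh` form used in recent versions.

```shell
# Illustrative only: volume/bucket/key names are placeholders,
# and these commands assume a running Ozone cluster.

# Create a volume, then a bucket inside it
ozone sh volume create /vol1
ozone sh bucket create /vol1/bucket1

# Upload a local file as a key, then list and read it back
ozone sh key put /vol1/bucket1/hello.txt ./hello.txt
ozone sh key list /vol1/bucket1
ozone sh key get /vol1/bucket1/hello.txt ./hello-copy.txt
```

From Hive or Spark, the same data can then be addressed through the Ozone file system URI scheme (e.g. `o3fs://bucket1.vol1/hello.txt` in releases that ship the `o3fs` connector), which is what lets existing Hadoop applications run unmodified.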

SPEAKERS

Anu Engineer
Software Engineer
Hortonworks
Anu Engineer was part of the original Windows Azure team and the principal author of the VMware Certificate Authority. He is an Apache Hadoop committer and PMC member, works on HDFS, and is one of the contributors to Ozone.
Xiaoyu Yao
Software Engineer
Hortonworks
Xiaoyu Yao is an Apache Hadoop committer and PMC member working mainly on Hadoop HDFS and Ozone.