Ozone and HDFS’s Evolution

Wednesday, February 6
2:50 PM - 3:30 PM
Room 101/102

HDFS has several strengths: horizontally scale its IO bandwidth and scale its storage to petabytes of storage. Further, it provides very low latency metadata operations and scales to over 60K concurrent clients. Hadoop 3.0 recently added Erasure Coding. One of HDFS’s limitations is scaling a number of files and blocks in the system. We describe a radical change to Hadoop’s storage infrastructure with the upcoming Ozone technology. It allows Hadoop to scale to tens of billions of files and blocks and, in the future, to every larger number of smaller objects. Ozone fundamentally separates the namespace layer and the block layer allowing new namespace layers to be added in the future. Further, the use of RAFT protocol has allowed the storage layer to be self-consistent. We show how this technology helps a Hadoop user and also what it means for evolving HDFS in the future. We will also cover the technical details of Ozone.


Sanjay Radia
Chief Architect, Founder
Sanjay is founder and chief architect at Hortonworks, and an Apache Hadoop committer and member of the Apache Hadoop PMC. Prior to co-founding Hortonworks, Sanjay was the chief architect of core-Hadoop at Yahoo and part of the team that created Hadoop. In Hadoop he has contributed to several areas including HDFS, MapReduce schedulers, Yarn's design, high availability, compatibility, etc. He has also held senior engineering positions at Sun Microsystems and INRIA, where he developed software for distributed systems and grid/utility computing infrastructures. Sanjay has a PhD in Computer Science from the University of Waterloo in Canada.