Lessons learned processing 70 billion data points a day using the hybrid cloud

Tuesday, June 19
11:50 AM - 12:30 PM
Executive Ballroom 210D/H

NetApp receives 70 billion data points of telemetry each day from its customers' storage systems. This telemetry contains configuration information, performance counters, and logs. All of this data is processed on multiple Hadoop clusters and feeds a machine learning pipeline and a data serving infrastructure that produce insights for customers via an application called Active IQ. We describe the evolution of our Hadoop infrastructure from a traditional on-premises architecture to the hybrid cloud, and the lessons learned along the way.
We’ll discuss the insights we are able to produce for our customers and the techniques used to produce them. Finally, we describe the data management challenges of our multi-petabyte Hadoop data lake. We solved these problems by building a unified data lake on-premises and using the NetApp Data Fabric to seamlessly connect to public clouds for data science and machine learning compute resources.
Architecting a truly hybrid cloud implementation freed our data scientists to use any software on any cloud, kept customer log data safe on NetApp Private Storage in Equinix, let us innovate and release new code faster, and gave us the flexibility to use multiple public clouds at the same time while the data stays on NetApp in Equinix.

SPEAKERS

Pranoop Erasani
Senior Technical Director, ONTAP
NetApp
Pranoop Erasani is a Senior Technical Director for the world’s No. 1 storage operating system, NetApp® ONTAP®. NetApp is widely recognized as the market leader in storage software solutions. Since joining NetApp, Pranoop has led technology innovations in NAS protocols, scalable filesystems, analytics, and caching for the flagship NetApp ONTAP operating system. He is passionate about clustering and distributed systems and is a strong advocate of leveraging NAS for analytics storage. He acts as a key technical advisor to technical marketing and product management on the design and development of the technologies required to make ONTAP a successful data management platform for Hadoop, NoSQL, and machine learning. Prior to NetApp, Pranoop worked at Sun Microsystems on the Solaris Clustering product. Pranoop holds a master’s degree in computer science from the University of Minnesota, Minneapolis.
Shankar Pasupathy
Technical Director, Data Science and Engineering
NetApp
Shankar Pasupathy is NetApp’s Technical Director for Active IQ, a telemetry system that provides data-driven insights for NetApp’s customers. He leads both the data science and data engineering teams responsible for processing and deriving insights from 70 billion data points a day. Shankar also drives the company’s strategy for analytics and machine learning in the hybrid cloud. Previously, Shankar was a senior manager and principal architect in NetApp’s Advanced Technology Group. He has published more than 20 research papers, is a co-author on 40 patents, and has won several NetApp innovation awards. He has a master’s degree in computer science from the University of Wisconsin-Madison and dual degrees in math and computer science from BITS, India.