Enabling a hardware accelerated deep learning data science experience for Apache Spark and Hadoop

Tuesday, June 19
11:50 AM - 12:30 PM
Executive Ballroom 210C/G

Deep learning techniques are finding significant commercial success across a wide variety of industries. Large unstructured data sets such as images, video, speech, and text are well suited to deep learning but place heavy demands on computing resources. New hardware architectures, including GPUs, faster interconnects (e.g., NVLink), and RDMA-capable network interfaces from Mellanox, all available on OpenPOWER and IBM POWER systems, are enabling practical speedups for deep learning. Data scientists can intuitively incorporate deep learning capabilities on accelerated hardware using open source components such as Jupyter and Zeppelin notebooks, RStudio, Spark, Python, Docker, and Kubernetes with IBM PowerAI. Jupyter and Apache Zeppelin integrate well with Apache Spark and Hadoop through the Apache Livy project. This session will show some deep learning build and deploy steps using TensorFlow and Caffe in Docker containers running in a hardware-accelerated private cloud container service. It will also cover system architectures and best practices for deployments on accelerated hardware.

SPEAKERS

Indrajit Poddar
Senior Technical Staff Member, IBM Cognitive Systems
IBM
Indrajit Poddar is a Senior Technical Staff Member and Master Inventor in IBM Systems. He currently works on cloud-enabled and hardware-accelerated machine learning and deep learning software. He has 18 years of industry experience in distributed computing and holds an MS in CS from Penn State University and a B.Tech in CSE from the Indian Institute of Technology, Kharagpur.