Designing data pipelines for analytics and machine learning in industrial settings

Thursday, June 21
12:20 PM - 1:00 PM
Meeting Room 230A

Machine learning has made it possible for technologists to do amazing things with data. Its arrival coincides with the evolution of networked manufacturing systems driven by IoT. In this presentation we’ll examine the rise of IoT and ML from a practitioner's perspective to better understand how applications of AI can be built in industrial settings. We'll walk through a case study that combines multiple IoT and ML technologies to monitor and optimize an industrial HVAC (heating and cooling) system. Through this example you'll see how the following components can be put into action:
1. A StreamSets data pipeline that sources from MQTT and persists to OpenTSDB
2. A TensorFlow model that predicts anomalies in streaming sensor data
3. A Spark application that derives new event streams for real-time alerts
4. A Grafana dashboard that displays factory sensors and alerts in an interactive view
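
As a rough illustration of the first component, the sketch below converts one MQTT-style sensor message into the JSON shape that OpenTSDB's HTTP `/api/put` endpoint accepts (metric, timestamp, value, tags). The topic layout `factory/<line>/<device>/<measurement>`, the payload fields `ts` and `value`, the `hvac.` metric prefix, and the function name `to_opentsdb_point` are assumptions made for illustration, not details from the talk. The case study itself builds this step with StreamSets rather than hand-written code.

```python
import json


def to_opentsdb_point(topic: str, payload: bytes) -> dict:
    """Map one MQTT-style message to an OpenTSDB /api/put data point.

    Assumes (hypothetically) topics shaped like
    factory/<line>/<device>/<measurement> and JSON payloads carrying
    "ts" (epoch seconds) and "value" fields.
    """
    _, line, device, measurement = topic.split("/")
    reading = json.loads(payload)
    return {
        "metric": f"hvac.{measurement}",      # hypothetical metric naming
        "timestamp": int(reading["ts"]),
        "value": float(reading["value"]),
        "tags": {"line": line, "device": device},
    }


# Example message, made up for this sketch.
point = to_opentsdb_point(
    "factory/line1/chiller7/temperature",
    b'{"ts": 1529583600, "value": 21.5}',
)
print(json.dumps(point, sort_keys=True))
```

Carrying the device and line as tags rather than folding them into the metric name keeps one metric per measurement type, which tends to make dashboard queries across devices simpler.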

By walking through this solution step by step, you'll learn how to build the fundamental capabilities needed to handle endless streams of IoT data and derive ML insights from that data:
1. How to transport IoT data through scalable publish/subscribe event streams
2. How to process data streams with transformations and filters
3. How to persist data streams with the timeliness required for interactive dashboards
4. How to collect labeled datasets for training machine learning models
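
The second and fourth capabilities can be sketched in plain Python: a generator drops malformed readings, applies a transformation (Fahrenheit to Celsius), and attaches a label so the rows could later serve as training data. The field names, the 30 °C cutoff, and the threshold-based labeling rule are hypothetical stand-ins chosen for this sketch. The case study uses Spark and a TensorFlow model for these steps, and neither is reproduced here.

```python
from typing import Iterable, Iterator

ANOMALY_THRESHOLD_C = 30.0  # hypothetical cutoff, for illustration only


def process(readings: Iterable[dict]) -> Iterator[dict]:
    """Filter, transform, and label a stream of sensor readings."""
    for r in readings:
        temp_f = r.get("temp_f")
        # Filter: skip readings with a missing or non-numeric temperature.
        if not isinstance(temp_f, (int, float)):
            continue
        # Transform: convert Fahrenheit to Celsius.
        temp_c = round((temp_f - 32) * 5 / 9, 2)
        # Label: 1 if above the assumed cutoff, else 0.
        yield {
            "device": r["device"],
            "temp_c": temp_c,
            "label": int(temp_c > ANOMALY_THRESHOLD_C),
        }


# Example stream, made up for this sketch.
stream = [
    {"device": "chiller7", "temp_f": 68.0},
    {"device": "chiller7", "temp_f": None},   # malformed, filtered out
    {"device": "chiller7", "temp_f": 95.0},
]
rows = list(process(stream))
for row in rows:
    print(row)
```

Because `process` is a generator, it handles one reading at a time and never needs the whole stream in memory, which is the property that matters for unbounded sensor data.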

At the end of this presentation you will have learned how a variety of tools can be used together to build ML-enhanced applications and data products for instrumented manufacturing systems.


Ian Downard
Sr. Developer Evangelist
Ian Downard is a senior developer evangelist and open source ambassador at MapR. He frequently publishes screencasts and technical articles relating to machine learning and data science on the MapR blog and on his personal blog. He enjoys connecting with people at meetups and leads the Java User Group in Portland, Oregon.
William Ochandarena
Senior Director of Product Management
Will Ochandarena is Senior Director of Product Management at MapR, responsible for Cloud and IoT product strategy. He spends time with customers across several industries, including manufacturing, retail, and energy, helping them use MapR’s converged data fabric at the edge to solve new and interesting business problems. He also writes blog posts.