Open source computer vision with TensorFlow, Apache MiniFi, Apache NiFi, OpenCV, Apache Tika, and Python

Tuesday, June 19
4:00 PM - 4:40 PM
Grand Ballroom 220A

To process images from IoT devices such as Raspberry Pis, NVIDIA Jetson TX1s, NanoPi Duos, and others equipped with attached cameras or external USB webcams, we use Python to interface with the camera via OpenCV and PiCamera. From there we run image processing at the edge on these IoT devices, using OpenCV and TensorFlow to determine attributes and image analytics. Apache MiniFi coordinates running these Python scripts and decides when to send, and what to send, from that analysis and the image to a remote Apache NiFi server for additional processing. The talk also covers custom NiFi processors for attribute cleaning, Apache Tika, TensorFlow, and camera ingest.
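As a rough illustration of the edge step, here is a minimal sketch of the kind of Python script MiniFi might run on a device: grab one frame with OpenCV and build a JSON metadata record to travel with the image. The function names (`capture_frame`, `build_record`), the record fields, and the file path are hypothetical and not taken from the talk's source code. The labels are assumed to come from a TensorFlow model that is not shown here.

```python
import json
import socket
import time
import uuid


def capture_frame(path, camera_index=0):
    """Grab one frame from an attached camera and save it to `path`.

    OpenCV is imported lazily so the metadata helper below can still be
    used on a machine that has no camera or no cv2 package installed.
    """
    import cv2  # requires the opencv-python package

    camera = cv2.VideoCapture(camera_index)
    try:
        ok, frame = camera.read()
        if not ok:
            raise RuntimeError("could not read a frame from the camera")
        cv2.imwrite(path, frame)
        height, width = frame.shape[:2]
        return width, height
    finally:
        camera.release()


def build_record(image_path, width, height, labels):
    """Assemble the JSON-serializable metadata that accompanies the image.

    `labels` is a list of (label, score) pairs; the field names here are
    illustrative assumptions, not a schema from the talk.
    """
    return {
        "uuid": str(uuid.uuid4()),
        "host": socket.gethostname(),
        "ts": int(time.time()),
        "image": image_path,
        "width": width,
        "height": height,
        "labels": [
            {"label": name, "score": round(float(score), 4)}
            for name, score in labels
        ],
    }
```

On a device, a script like this would call `capture_frame("/tmp/frame.jpg")`, pass the returned width and height to `build_record` along with the model's labels, and print `json.dumps(record)` to standard output so the calling agent can pick it up.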

At the Apache NiFi cluster, the flow routes the images to one processing path and the JSON-encoded metadata to another. The JSON data (with its schema referenced from a central Schema Registry) is routed using record processing and SQL.
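The routing described above is configured in NiFi with processors, not written in Python, but the logic can be sketched in a few lines. The example below is only an analogy: `route` mimics splitting on MIME type, and `query_records` uses an in-memory SQLite table as a stand-in for running SQL over records. The table name, column names, and threshold are assumptions made for illustration.

```python
import sqlite3


def route(flowfile_attributes):
    """Pick a downstream path from the MIME type attribute."""
    mime = flowfile_attributes.get("mime.type", "")
    if mime.startswith("image/"):
        return "images"
    if mime == "application/json":
        return "metadata"
    return "unmatched"


def query_records(records, min_score):
    """Filter metadata records with SQL.

    SQLite is used here purely as a stand-in; NiFi evaluates SQL over
    records with its own engine, and the columns below are hypothetical.
    """
    db = sqlite3.connect(":memory:")
    try:
        db.execute("CREATE TABLE flowfile (host TEXT, label TEXT, score REAL)")
        db.executemany(
            "INSERT INTO flowfile VALUES (:host, :label, :score)", records
        )
        return db.execute(
            "SELECT host, label, score FROM flowfile "
            "WHERE score >= ? ORDER BY score DESC",
            (min_score,),
        ).fetchall()
    finally:
        db.close()
```

Calling `route({"mime.type": "image/jpeg"})` selects the image path, while a JSON payload goes to the metadata path, where a query such as `query_records(records, 0.5)` keeps only the confident classifications.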

My talk at DataWorks Summit Sydney was listed in the top 7.

Presentation Video


Timothy Spann
Field Engineer, Data in Motion
Tim Spann was a Senior Solutions Architect at AirisData working with Apache Spark and Machine Learning. Previously he was a Senior Software Engineer at SecurityScorecard, helping to build a reactive platform for monitoring real-time 3rd party vendor security risk in Java and Scala. Before that he was a Senior Field Engineer for Pivotal focusing on CloudFoundry, HAWQ and Big Data. He is an avid blogger and the Big Data Zone Leader for DZone. He runs the very successful Future of Data Princeton meetup with over 1192 members. He is currently a Senior Solutions Engineer at Cloudera in the Princeton, New Jersey area. You can find all the source and material behind his talks at his GitHub and Community blog.
Nagaraj Jay is a Systems Architect at Hortonworks. His main interests are in Spark ML, AI, Flink, geospatial data replication and data integration.