Hortonworks is sponsoring a quick, hands-on introduction to key Apache projects. Come and listen to a short technical introduction and then get hands-on with your personal machine, ask questions, and leave with a working environment to continue your journey.

Apache Spark Crash Course

Introduction: This workshop will provide a hands-on introduction to Apache Spark and Apache Zeppelin in the cloud.

Format: A short introductory lecture on Apache Spark covering core modules (SQL, Streaming, MLlib, GraphX), followed by a demo, a lab, and a Q&A session.

Objective: To provide a quick and short hands-on introduction to Apache Spark. This lab will use the following Spark and Apache Hadoop components: Spark, Spark SQL, Apache Hadoop HDFS, Apache Hadoop YARN, Apache ORC, Apache Ambari, and Apache Zeppelin. You will learn how to move data into HDFS using Spark APIs, create Apache Hive tables, explore the data with Spark and Spark SQL, transform the data, and issue several SQL queries.

Lab pre-requisites: Registrants must bring a laptop with a Chrome or Firefox web browser installed (with proxies disabled, i.e., the browser must present the venue IP address to access cloud resources).

At this Crash Course, everyone will be assigned a cluster to try several workloads in the cloud.

Check out our short video on Apache Spark Basics.


Speakers: Robert Hryniewicz

Location: Meeting Room 212C/D

Data Science Crash Course

Introduction: This workshop will provide a hands-on introduction to basic Machine Learning techniques with Apache Spark ML in the cloud.

Format: A short introductory lecture on a selection of important supervised and unsupervised Machine Learning techniques, followed by a demo, lab exercises, and a Q&A session.

Objective: To provide a quick and short hands-on introduction to Machine Learning with the Spark Machine Learning library (MLlib). In the lab, you will use the following components: Apache Zeppelin (a “Modern Data Science Toolbox”) and Apache Spark. You will learn how to analyze the data, structure the data, train Machine Learning models, and apply them to answer real-world questions.

Pre-requisites: Registrants must bring a laptop with a Chrome or Firefox web browser installed (with proxies disabled, i.e., the browser must present the venue IP address to access cloud resources).

At this Crash Course, everyone will be assigned a cluster to try several workloads in the cloud.

Check out our short video on Basic Machine Learning Algorithms.


Speakers: Robert Hryniewicz

Location: Meeting Room 212C/D

Apache NiFi Crash Course

Introduction: This workshop will provide a hands-on introduction to simple event data processing and data flow processing, using a Sandbox running on students’ personal machines.

Format: A short introductory lecture on Apache NiFi and the concepts used in the lab, followed by a demo, lab time to work through the exercises and ask questions, and a Q&A session.

Objective: To provide a quick and short hands-on introduction to Apache NiFi. In the lab, you will install and use Apache NiFi to collect, conduct, and curate data-in-motion and data-at-rest. You will learn how to connect to and consume streaming sensor data, filter and transform the data, and persist it to multiple data stores.

Pre-requisites: Registrants must bring a laptop with the latest VirtualBox installed; an image of the Hortonworks DataFlow (HDF) Sandbox will be provided.


Speakers: Andy LoPresto

Location: Meeting Room 212C/D

Hadoop Security, Governance and GDPR Crash Course

Introduction: This workshop will provide an overview of GDPR provisions along with relevant use cases.

Format: A short introductory lecture on GDPR. We will then focus on consent, profiling, and the right to be forgotten (data erasure), and on how companies can establish processes for acquiring consent, automated data processing, and data discovery and classification using technologies such as Apache Atlas, Apache Ranger, and Apache Hive.

Objective: To provide a quick, hands-on introduction to GDPR concepts. In the lab, you will practice these concepts using Apache Hadoop, Atlas, Ranger, and Hive to process and classify data.

Pre-requisites: Registrants must bring a laptop with the latest VirtualBox installed; an image of the Hortonworks Data Platform (HDP) Sandbox will be provided.


Speakers: Ali Bajwa, Srikanth Venkat

Location: Meeting Room 212A/B

CyberSecurity with Apache Metron Crash Course

Introduction: This workshop will provide a hands-on introduction to CyberSecurity powered by Apache Metron.

Format: A short introductory lecture on Apache Metron use cases and architecture, followed by a demo, lab exercises, and a Q&A session.

Objective: To provide a quick and short hands-on introduction to Apache Metron. In the lab, you will learn how to ingest and parse new security telemetry sources, add context by enriching the events, track and visualize the results, and identify outlier behavior (i.e., detect bad actors).

Pre-requisites: Registrants must bring a laptop with an SSH client (e.g., PuTTY) and an up-to-date web browser (e.g., Chrome) installed, which will be used to connect to a cloud instance of Apache Metron.


Speakers: Ward Bekker, Dave Russell, Carolyn Duby

Location: Meeting Room 212A/B

Deep Learning Crash Course

Introduction: This workshop will provide a hands-on introduction to basic Deep Learning techniques with TensorFlow, Apache MXNet, and Keras.

Format: A short introductory lecture on select key applications of Deep Learning, Neural Net structures required to address these applications, and other relevant background information to successfully train and deploy a Deep Learning model. The lecture portion will be followed by lab exercises and a Q&A session.

Objective: To provide a quick and short hands-on introduction to Deep Learning with TensorFlow, MXNet, and Keras in a notebook style environment (Apache Zeppelin or Jupyter). You will learn how to choose the right Neural Net for the desired application, train, test, and deploy a Deep Learning model.
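As a minimal illustration of that choose/train/apply loop, here is a hedged Keras sketch on a tiny made-up dataset; the lab's networks, data, and deployment steps will be richer:

```python
import numpy as np
from tensorflow import keras

# Hypothetical toy dataset: separate two clusters of 2-D points (a stand-in for lab data).
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]], dtype="float32")
y = np.array([0, 0, 1, 1], dtype="float32")

# A small fully connected net; choosing the right architecture for the task is a lab topic.
model = keras.Sequential([
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Train the model, then apply it to data.
model.fit(X, y, epochs=100, verbose=0)
preds = model.predict(X, verbose=0)
```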

Pre-requisites: Registrants must bring a laptop with a Chrome or Firefox web browser installed (with proxies disabled, i.e., the browser must present the venue IP address to access cloud resources). These labs will be done in the cloud.

At this Crash Course, everyone will be assigned a cluster to try several workloads using TensorFlow, Apache MXNet, and Keras in Zeppelin or Jupyter notebooks hosted in the cloud.

Check out our short video on Basic Machine Learning Algorithms.


Speakers: Robert Hryniewicz, Timothy Spann

Location: Meeting Room 212C/D

Data in the Cloud Crash Course #2

Introduction

This workshop is a hands-on session to quickly deploy Hadoop and Streaming on AWS / Azure / Google Cloud.

Cloudbreak simplifies the deployment of Hadoop in cloud environments. It enables the enterprise to quickly run big data workloads in the cloud while optimizing the use of cloud resources.

Format

A short introductory lecture about Cloudbreak, followed by a walkthrough and a lab leveraging Hadoop and Streaming in the cloud with Cloudbreak.

Objective

To provide a quick and short hands-on introduction to Hadoop in the cloud, and to review the key benefits of cluster deployment automation.

This lab will use Cloudbreak to quickly and effortlessly stand up Hadoop and Streaming clusters in a cloud provider of your choice. The lab demonstrates Ambari blueprints, which are declarative definitions of your Hadoop or Streaming clusters. You will see how to dynamically change these blueprints and use external databases and external authentication sources, in essence providing shared authentication, authorization, and auditing across ephemeral and long-running clusters. Beyond custom blueprints, the lab also shows how Cloudbreak provides easy-to-use custom scripts, called recipes, that can be executed before or after Ambari starts, or after cluster installation.
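For orientation, an Ambari blueprint is a JSON document along roughly these lines. The sketch below builds one in Python purely for illustration; the blueprint name, stack version, and component lists are hypothetical, not the lab's actual blueprint:

```python
import json

# Hedged sketch of an Ambari blueprint: a declarative definition of a cluster.
# All names and versions here are illustrative assumptions.
blueprint = {
    "Blueprints": {
        "blueprint_name": "minimal-hadoop",
        "stack_name": "HDP",
        "stack_version": "2.6",
    },
    # Host groups map cluster roles (and how many hosts run them) to components.
    "host_groups": [
        {
            "name": "master",
            "cardinality": "1",
            "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}],
        },
        {
            "name": "worker",
            "cardinality": "3",
            "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
        },
    ],
}

# Cloudbreak and Ambari consume the blueprint as JSON text.
blueprint_json = json.dumps(blueprint, indent=2)
```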

Pre-requisites

Registrants must bring a laptop for the lab. These labs will be done in the cloud. Please follow the steps below to set up an AWS or Azure account before the session starts.


Speakers: Michael Young, Purnima Kuchikulla

Location: Meeting Room 212A/B

Data in the Cloud Crash Course #1

Introduction

This workshop is a hands-on session to quickly deploy Hadoop and Streaming on AWS / Azure / Google Cloud.

Cloudbreak simplifies the deployment of Hadoop in cloud environments. It enables the enterprise to quickly run big data workloads in the cloud while optimizing the use of cloud resources.

Format

A short introductory lecture about Cloudbreak, followed by a walkthrough and a lab leveraging Hadoop and Streaming in the cloud with Cloudbreak.

Objective

To provide a quick and short hands-on introduction to Hadoop in the cloud, and to review the key benefits of cluster deployment automation.

This lab will use Cloudbreak to quickly and effortlessly stand up Hadoop and Streaming clusters in a cloud provider of your choice. The lab demonstrates Ambari blueprints, which are declarative definitions of your Hadoop or Streaming clusters. You will see how to dynamically change these blueprints and use external databases and external authentication sources, in essence providing shared authentication, authorization, and auditing across ephemeral and long-running clusters. Beyond custom blueprints, the lab also shows how Cloudbreak provides easy-to-use custom scripts, called recipes, that can be executed before or after Ambari starts, or after cluster installation.

Pre-requisites

Registrants must bring a laptop for the lab. These labs will be done in the cloud. Please follow the steps below to set up an AWS or Azure account before the session starts.


Speakers: Michael Young, Purnima Kuchikulla

Location: Meeting Room 212C/D