Security event logging and monitoring techniques for incident response in Hadoop

Tuesday, June 19
4:50 PM - 5:30 PM
Executive Ballroom 210A/E

Many compliance and regulatory frameworks, such as the HIPAA Security Rule for electronic protected health information (ePHI) or the rules governing controlled unclassified information (CUI) on defense projects, require logging and monitoring of all access to sensitive data such as patient records or CUI. The challenge for Hadoop clusters is that they are complex ecosystems running a variety of services across multiple nodes. For example, services such as Knox, Ranger, Atlas, Oozie, and Hive differ greatly from one another: each has its own format and schema for tracking data access, and each is likely to run on different nodes within the Hortonworks Data Platform (HDP) cluster. Collecting the right set of logs and events from each data or security service is therefore the first challenge. Furthermore, security monitoring, threat detection, and breach analysis typically rely on correlating these security events. This means centralizing logs and events so that they can be normalized, indexed, aggregated, and correlated for analysis in a security information and event management (SIEM) tool such as Sumo Logic, Splunk, or an equivalent.
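To make the normalize-then-correlate idea concrete, here is a minimal Python sketch. It is not the implementation from the talk. The two input formats (a JSON record and a "timestamp key=value" line), the field names, and the service labels are all hypothetical stand-ins and do not reproduce the real audit formats of Knox, Ranger, Atlas, Oozie, or Hive. The sketch maps both formats onto one common schema and then flags users who were denied access in more than one service.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone


def normalize_json_event(line, service):
    """Map a hypothetical JSON audit record onto the common schema."""
    rec = json.loads(line)
    return {
        "time": datetime.fromtimestamp(rec["epoch_ms"] / 1000, tz=timezone.utc),
        "service": service,
        "user": rec["user"],
        "resource": rec["resource"],
        "allowed": rec["result"] == 1,  # assumption: 1 means access granted
    }


def normalize_kv_event(line, service):
    """Map a hypothetical 'timestamp key=value ...' line onto the common schema."""
    stamp, _, rest = line.partition(" ")
    fields = dict(pair.split("=", 1) for pair in rest.split())
    return {
        "time": datetime.fromisoformat(stamp),
        "service": service,
        "user": fields["user"],
        "resource": fields["resource"],
        "allowed": fields["allowed"] == "true",
    }


def denied_across_services(events):
    """Return users with denied events in more than one service."""
    services_by_user = defaultdict(set)
    for event in events:
        if not event["allowed"]:
            services_by_user[event["user"]].add(event["service"])
    return {
        user: sorted(services)
        for user, services in services_by_user.items()
        if len(services) > 1
    }


# Illustrative sample records; not real audit output from any Hadoop service.
json_line = '{"epoch_ms": 1529427000000, "user": "alice", "resource": "/data/phi", "result": 0}'
kv_line = "2018-06-19T16:51:00+00:00 user=alice resource=db.patients allowed=false"

events = [
    normalize_json_event(json_line, "service_a"),
    normalize_kv_event(kv_line, "service_b"),
]
print(denied_across_services(events))  # {'alice': ['service_a', 'service_b']}
```

In practice a SIEM performs this normalization and correlation at scale; the sketch only shows why a shared schema is needed before events from different services can be compared.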

This presentation shares techniques and lessons learned from a real-world Hadoop implementation at Johns Hopkins. All data shown will be sanitized. The focus is on the strategies and techniques used to collect audit and access log events from key Hadoop services and forward them to a central server for monitoring, analysis, and response to suspected breaches or incidents. Automation techniques, such as Ansible scripts that install agents or forwarders uniformly and efficiently across the cluster nodes, will also be highlighted where appropriate.

Conrad Fernandes
Cloud and Cyber Engineering Lead
Johns Hopkins Applied Physics Laboratory
Conrad Fernandes is a longtime cybersecurity engineer and architect who has worked extensively with US defense agencies and the DoD since the early 2000s, when he was at Booz Allen Hamilton. Conrad currently serves as a senior cybersecurity engineer at the Johns Hopkins Applied Physics Laboratory (APL), where he leads security and governance practices for emerging cloud technologies, including commercial cloud and US GovCloud (e.g., Amazon Web Services) and Hadoop-based data science platforms from Cloudera and Hortonworks. Conrad recently presented strategies for "Incident Response and Spillage Handling in AWS" at Amazon's 2018 re:Invent conference. Additionally, Conrad has been researching and implementing security and audit logging and monitoring strategies on data science platforms at Johns Hopkins Medical Institute (JHMI) that use various emerging security services found within Hortonworks Data Platform (HDP). Conrad also enjoys sharing security best practices and lessons learned from these experiences with the larger cloud and big data community.