Enterprises are replicating data on Hadoop to improve high availability, support disaster recovery, or simply move data into an ephemeral cluster in the cloud. However, replicating data in a location-agnostic and secure way is challenging with existing solutions. Enterprises want data to be encapsulated and copied seamlessly between private on-premises storage and public cloud environments, enabling hybrid data mobility that puts the right workload in the right environment for the right use case.
Data Lifecycle Manager (DLM) is a new enterprise offering built on open source. DLM is a stepping stone for established enterprises to leverage intelligent replication, backup, and tiering toward secure, compliant, governed Business Continuity Planning at enterprise scale. Secure movement of data at scale with precise access policies lets enterprises reduce capital costs and comply with data governance requirements while democratizing data to grow the business globally.
This session will provide an overview of data replication use cases and current challenges, and introduce how DLM addresses them. We will discuss the hybrid replication challenges of data proliferation with varied volume, velocity, and variety, and how Apache Hive, Apache Ranger, and Apache Atlas protect data, metadata, and security at scale, independent of where the data is collected, curated, moved, or stored within the confines of an enterprise.