PayPal is a data-driven organization that processes billions of identity, payment, risk, user, and web behavioral events per day. The analytics and data platform powers these data needs and use cases and supports PayPal’s growth. Some use cases require data to be available in near real time in order to make timely decisions and improve the customer experience. The architecture, data processing, and pipelines for handling such cases differ from those of other paradigms.
This talk will cover PayPal’s journey into real-time analytics: how PayPal processes real-time data at scale using Apache Kafka, Spark Streaming, and Akka Streams; what limitations and challenges real-time data pipelines involve; and how signals from real-time analytics are used for quick feedback and decision making.