Saurabh is a Systems Architect with strong expertise in the Hadoop ecosystem and rich field experience. He helps enterprises large and small solve their business problems strategically, functionally, and at scale by leveraging Big Data technologies. He brings hands-on experience building, coding, and directing successful information technology initiatives.
Saurabh has over 14 years of IT experience and has served in key positions such as Lead Big Data Solution Architect, Performance Architect, and Technology Architect across multiple large and complex enterprise programs.
He has extensive knowledge of Big Data/NoSQL technologies, including Hadoop, YARN, Spark, HBase, Hive, Pig, Storm, Kafka, and NiFi, and has been working in this space for the last 6+ years.
Saurabh has architected and designed Big Data platforms and applications spanning thousands of nodes, tens of petabytes of data, and complex ETL workflow requirements.
Saurabh has provided solutions for GDPR requirements and just-in-time analytics, leveraging co-located datasets at scale to deliver insight and pattern detection. His work includes building data pipelines that produce results in minutes or hours across petabytes of data, and discovering new ways to co-locate, integrate, and leverage disparate datasets using Lambda and HTAP Big Data architectures and IoT applications.
Arun is a Distributed Systems Architect from Hortonworks who has authored, implemented, and optimized Big Data pipelines for many Fortune 500 firms, including Apple, Microsoft, SAP, ADP, Fidelity, Expedia, Monsanto, and T-Mobile.
Enterprise Big Data pipelines typically solve analytics and warehousing challenges and comprise streaming, batch, and hybrid processing paradigms. Comprehending the nuances of these petabyte-throughput pipelines and analyzing their data-flow efficiencies requires an in-depth architectural understanding of the frameworks and components used within pipeline applications. Arun’s unique ability to visualize pipeline dataflows horizontally (data partitioning, shuffling, and combining at the distributed framework level) and vertically (application request, cache, page cache, and disk layers) helps him design, untangle, unclog, and optimize pipelines very efficiently.
Distributed datasets, MapReduce-based directed acyclic graph solutions, distributed stream processing, column-oriented storage, column-family stores, dynamic partitioned rings, machine learning algorithms, and TFD-based Spark/Hadoop ETL workflows are some of the topics within Arun’s areas of interest and expertise.
Prior to Hortonworks, Arun spent about a decade with Monsanto, working on biotech research on Hadoop as well as on various custom in-house distributed parallel processing and storage platforms. Other previous experience includes stints at ADP, Caliper, and Yahoo, and research work for NASA. Arun’s expertise includes SOA, enterprise security, distributed computing and caching, and scalable read-behind/write-behind caching architectures. Arun holds a master’s degree in computer science from the University of Alabama and a bachelor’s degree in engineering from Sri Venkateswara College of Engineering, India.