Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration with Common Data Framework (Model Robot)

Wednesday, June 20
11:50 AM - 12:30 PM
Grand Ballroom 220C

Freddie Mac and KPMG will share an innovative solution that accelerates data model (ERM) development and data integration on a highly distributed, in-memory computing platform. The framework's machine learning component, written in PySpark, runs against evolving semi-structured and structured data sets to learn and automate the mapping of data from various sources to a target schema. As a result, it significantly reduces manual analysis, design, and development effort and speeds up data integration across a variety of complex, high-volume datasets.
The solution leverages several components of the Hadoop data platform. Sqoop imports the data into the platform, and PySpark processes it. The application also includes a PySpark ML model that runs as a continuous Spark job, processing the ingested semi-structured data and intelligently mapping it into the proper Hive tables. Oozie schedules all of these steps.
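To make the idea of learned schema mapping concrete, here is a minimal sketch in plain Python. It is not the PySpark ML model described in this session, whose design is not given here. As a stand-in, it uses a nearest-centroid classifier over character trigrams of column names. The training pairs, target column names, and the 0.3 threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def trigrams(name):
    """Count character trigrams of a lowercased, alphanumeric-only column name."""
    s = "#" + "".join(ch for ch in name.lower() if ch.isalnum()) + "#"
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fit(labeled):
    """Sum trigram counts per target column from (source_name, target_column) pairs."""
    centroids = defaultdict(Counter)
    for source_name, target in labeled:
        centroids[target].update(trigrams(source_name))
    return dict(centroids)


def predict(centroids, source_name, threshold=0.3):
    """Return (best target, score), or (None, score) when the score is below threshold."""
    vec = trigrams(source_name)
    target, score = max(
        ((t, cosine(vec, c)) for t, c in centroids.items()), key=lambda pair: pair[1]
    )
    return (target, score) if score >= threshold else (None, score)


# Hypothetical, hand-labeled examples: source column name -> target schema column.
TRAINING = [
    ("loan_id", "loan_identifier"),
    ("LoanNumber", "loan_identifier"),
    ("loan_no", "loan_identifier"),
    ("orig_upb", "original_balance"),
    ("original_unpaid_balance", "original_balance"),
    ("orig_bal", "original_balance"),
    ("borrower_fico", "credit_score"),
    ("credit_scr", "credit_score"),
    ("fico_score", "credit_score"),
]

model = fit(TRAINING)

# Map the columns of a newly arrived (hypothetical) source file.
mapping = {col: predict(model, col)[0] for col in ["LOAN_NUM", "orig_balance", "FICO", "zip"]}
print(mapping)
```

Columns that resemble nothing in the training data, such as `zip` here, come back as `None` so they can be routed to manual review instead of being forced into a table. In a pipeline like the one described above, the same fit and predict steps would presumably run inside a Spark job on the ingested data, with the resulting mapping deciding which Hive table and column each field lands in.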


Kevin Martelli
Managing Director
Kevin Martelli is a Managing Director at KPMG, where he is the U.S. Technology Lead in KPMG’s Lighthouse – Data & Analytics Solution Center. In this capacity, he is responsible for the technology platforms and applications used to deliver data and analytics solutions that address client needs in the Artificial Intelligence (AI), Big Data, and Blockchain space. He also supports and leads the build-out of technology solutions and analytical data science use cases at client sites and oversees the Big Data software engineering team.
Balaji Wooputur
Risk Analytics Director
Freddie Mac
Balaji Wooputur is a Director in Data Analytics at Freddie Mac, leading business data solutions on the Big Data platform for Single Family Risk. He has extensive leadership experience in data solutions, Diginomics, and advanced analytics, building and establishing business data solutions across diverse industries such as finance and the federal sector, as well as various verticals. Currently, he is responsible for developing and managing business data solutions on Big Data, AI, Diginomics, and advanced analytics using machine learning for Single Family Credit Reporting Analytics (CAR).