Model Factory at ING Bank

Wednesday, March 20
11:00 AM - 11:40 AM
Room 120-121

At ING Bank, machine learning models are key to engaging our customers in relevant ways, empowering them to stay a step ahead in life and in business. To make model building faster, compliant, validated and accessible to roles other than data scientists (such as data analysts or customer journey experts), we have structured the process for easy creation of propensity models.

In this talk, I will present this structure, focusing on pipelining data science models in Apache Spark. In particular, I will show how we use Apache Sqoop and Apache Ranger to comply with GDPR, build a data science workflow on top of Python and Jupyter, extend the Spark ML libraries in PySpark to create custom standardizers and cross-validators, and evaluate models with an in-house monitoring tool built on top of Elasticsearch.

Finally, I will describe how analysts and customer journey experts engage with the result sets of the models we create, and how we refine our dashboards (in IBM Cognos) accordingly.


Dor Kedem
Lead Data Scientist
ING Bank
Dor has over a decade of experience developing big data products for the security, financial markets and banking industries. His research on metric learning and cost-sensitive learning has earned him publications at NIPS and AISTATS and a monetary prize in ChaLearn competitions. As a senior data scientist at ING Bank, he is involved in multiple projects modelling consumer and market behavior, optimizing business and IT processes, and contributing to the data science way of working, rapid exploration and continuous delivery processes.