Machine Learning Models in Production

Wednesday, April 18
2:00 PM - 2:40 PM
Convention Hall I - C

Data scientists and machine learning practitioners nowadays seem to be churning out models by the dozen, and they continuously experiment to find ways to improve accuracy. They also use a variety of ML and DL frameworks and languages, and a typical organization may find that this results in a heterogeneous, complicated collection of assets that require different types of runtimes and resources, and sometimes even specialized compute, to operate efficiently.

But what does it mean for an enterprise to actually take these models to "production"? How does an organization scale out inference engines and make them available to real-time applications without significant latency? Batch (offline) inference and instant, online scoring call for different techniques. Data needs to be accessed from various sources, and it must be possible to cleanse and transform that data before any prediction is made. In many cases, there may be no substitute for customized data handling with scripting either.

Enterprises also require built-in auditing, authorization and approval processes, while still supporting a "continuous delivery" paradigm in which a data scientist can deliver insights faster. Not all models are created equal, nor are the consumers of a model, so enterprises require both metering and allocation of compute resources to meet SLAs.

In this session, we will take a look at how machine learning is operationalized in IBM Data Science Experience (DSX), a Kubernetes-based offering for the private cloud that is optimized for the Hortonworks Hadoop Data Platform. DSX essentially brings typical software engineering practices to data science, organizing the dev->test->production flow for machine learning assets in much the same way as typical software deployments. We will also see what it means to deploy models and custom scorers, monitor their accuracy and even roll them back, as well as how API-based techniques enable consuming business processes and applications to remain relatively stable amidst all the chaos.


Piotr Mierzejewski
Program Director, Development, IBM DSX Local
Piotr began his career at IBM in 2009 at IBM’s Centre for Advanced Studies as a Prototype Developer concentrating on graph theory and linked data applications. After two years in research, Piotr joined the DB2 SQL compiler development team to lead improvements in automation and problem detection in the DB2 engine. In 2014 Piotr transitioned to manage the DB2 Build team, where his team was able to redesign and innovate, drastically improving the DB2 organization's operations. In 2016 Piotr moved to lead the IBM Watson Data Platform Security, Integration and DevOps teams. Most recently Piotr was promoted to Program Director of IBM Data Science Experience, a new data science and machine learning platform for private clouds.