Being able to teach machines with examples is a powerful capability, but it hinges on the availability of vast amounts of data. The data not only needs to exist, it has to be in a form that allows relationships between input features and outputs to be uncovered. Labeling each input example fulfills this requirement, but most supervised machine learning opportunities do not come nicely packaged with labeled data.
In classical approaches to this problem, engineered heuristics select the "best" instances of data to label, reducing labeling cost; the model then learns from this smaller labeled dataset. Recent advancements have extended these approaches to deep learning, enabling models to be built with limited labeled data.
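The selection loop described above can be sketched with a simple uncertainty-sampling heuristic. This is a minimal illustration, not the talk's specific method; the dataset, model, and query size are all stand-ins, and it assumes scikit-learn is available:

```python
# Sketch of an active-learning loop with a least-confidence heuristic.
# Dataset, model choice, and batch size are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X), size=10, replace=False))  # small seed set
unlabeled = [i for i in range(len(X)) if i not in labeled]

for _ in range(5):
    # Train on the currently labeled pool only
    model = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[unlabeled])
    # Least-confidence heuristic: query points the model is most unsure about
    uncertainty = 1 - proba.max(axis=1)
    query = [unlabeled[i] for i in np.argsort(uncertainty)[-10:]]
    labeled += query  # simulate obtaining labels for the queried points
    unlabeled = [i for i in unlabeled if i not in query]

print(len(labeled))  # 60 labeled examples after 5 rounds of 10 queries
```

In practice the "simulate obtaining labels" step is where a human annotator enters the loop; the heuristic's job is to make each of those annotations count.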
In this talk, we explore algorithmic approaches that drive this capability, and provide practical guidance for translating it into production.