Abstract: Most of the recent advances in deep learning come at a high price. The costs of developing and training these models are twofold: computing power and training data. Computational resources are becoming increasingly affordable thanks to the spread of cloud computing services. Gathering and, especially, manually labeling data, on the other hand, cannot scale in the same way. A common scenario is one in which unlabeled data is cheap but the labeling budget is severely limited. Practice shows that not all data is created equal: the choice of which data to label first has a profound effect on the performance of the resulting model. The task of determining which data samples would be most "informative" once labeled is known as active learning.
In part 1 of my talk, I will present a theoretical overview of the ideas behind active learning when applied to an image classification problem.
In part 2, we will see how these ideas can be implemented using PyTorch.
Part 1 (the theory) only requires knowledge of standard supervised machine learning concepts, e.g. multi-class image classification. To get the most out of part 2 (the practical tutorial), proficiency with the PyTorch deep learning framework is recommended.
Bio: Olga is a deep learning R&D engineer at Scaleway, the second-largest French cloud provider. She received her PhD in theoretical physics from Johns Hopkins University in 2013, followed by postdoctoral appointments at the Max Planck Institute in Dresden and the École Normale Supérieure in Paris. At the latter, she looked into the possible applications of artificial intelligence to quantum systems, among other things.
Olga’s current interests focus on semi-supervised and active machine learning. On the community side, she enjoys blogging about the latest advancements in AI, both in and out of working hours. Some of her writing can be found at medium.com/@olgapetrova_92798