Abstract: Wherever you are in your team’s machine learning journey, it’s helpful to think about evolving towards large-scale production. A key ingredient of this journey is your data labeling and annotation framework. In this talk we focus on how to build your data labeling pipeline to be enterprise-grade. We will describe the considerations and insights that go into making your data pipeline a mindful part of your development pipeline.
Proactively planning a data process can generate progressively better results during development, but it requires some thought and stakeholder buy-in. The financial and overhead costs of data labeling need to be considered. Through collaboration with peers, managers, and machine-learning experts, annotators refine their skills, mastering tasks traditionally beyond the expertise of crowdsourcing. Finally, in a collaborative framework, labelers and machine-learning experts negotiate and create meaning through an iterative feedback process as they identify new concepts and nuances in the data.
A pipeline designed for human judgement and incremental training on edge cases can provide that last mile of acceptability needed to roll out a machine learning solution in production. We will describe successful examples of this approach.
Bio: Jai Natarajan is VP, Marketing and Strategic Business Development, at iMerit. iMerit has over 2500 data experts who label and enrich data at scale to help customers get better results from their algorithms. It does so while empowering women and youngsters in underprivileged communities to join the digital economy. iMerit works with leaders across sectors like Autonomous Vehicles, AgTech, Medical Imaging, e-Commerce and Financial Services, with diverse types of image, video and text data.
Jai’s background is in Computer Graphics and Education. Before joining iMerit, he founded Emmy-winning animation studio Xentrix, and has previously worked at Lucasfilm and Sony. He also serves on the board of Anudip Foundation, a livelihood development non-profit that trains thousands of marginalized youngsters in digital skills.