An Annotation Pipeline for Domain-Specific Dialogue Systems


Task-oriented dialogue systems require large volumes of data with both breadth and depth to perform effectively. Although many in the industry have shied away from gathering human-labeled data as unsupervised methods in NLP grow increasingly powerful, high-quality labeled data remains essential to boosting the performance of dialogue systems, particularly in specialized domains like medicine and finance, and for complex linguistic phenomena like long-distance anaphora and dialogue states.

Obtaining high-quality labeled data to train and validate dialogue models for specialized contexts faces two perennial challenges: quality and scalability. This talk presents a human annotation workflow at iMerit that addresses both concerns while also yielding additional insights to further enrich and refine dialogue systems. The talk describes how iMerit’s iterative training, annotation, and evaluation process achieves quality and scale at low cost through the collaboration of subject-matter experts and skilled labelers.


Dr. Teresa O’Neill is a Solutions Architect at iMerit specializing in language annotation services. Before joining iMerit, she worked for a decade in academia as an educator and researcher. At iMerit, she leverages her experience as a linguist with both theoretical and applied specializations to build custom human-in-the-loop annotation pipelines for customers with NLP/NLU and e-commerce use cases.

Open Data Science
One Broadway
Cambridge, MA 02142
