
Abstract: Natural Language Processing is one of the fastest-growing fields in Machine Learning. Understanding language has long been a challenge for machines, but recent developments have shown the gap closing rapidly. Transformers have become a buzzword for every NLP scientist and engineer, but how do they work under the hood? We will take you on an NLP journey, starting from Long Short-Term Memory (LSTM) networks and ending at Transformers, filling every gap along the way. We will work on the Grammatical Error Correction dataset and explore both the theoretical and practical aspects of this journey.
Session Outline
Our workshop is designed in a semi-hands-on format. You will learn about:
1. Fundamentals of Long Short-Term Memory (LSTM) networks
2. Sequence-to-sequence models using LSTMs
3. Sequence-to-sequence models using LSTMs with Attention
4. Get rid of LSTMs - enter Transformers!
The above building blocks will be supported by hands-on exercises using the Grammatical Error Correction dataset. This dataset contains grammatically incorrect sentences paired with their corrected forms. Grammatical Error Correction (GEC) is the task of correcting different kinds of errors in text, such as spelling, punctuation, grammatical, and word choice errors. It assists non-native English speakers in writing more polished formal, informal, and student content.
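To make the task concrete, below is a minimal sketch of GEC framed as a sequence-to-sequence problem, using the Hugging Face pipeline API with a pretrained checkpoint. The checkpoint name and its "grammar:" prompt prefix are assumptions for illustration only; they are not the models built during the workshop.

```python
from transformers import pipeline

# A pretrained encoder-decoder (seq2seq) model served via the
# text2text-generation pipeline. The checkpoint name below is an
# assumed example, not the workshop's own model.
corrector = pipeline(
    "text2text-generation",
    model="vennify/t5-base-grammar-correction",  # assumed checkpoint
)

# GEC maps an ungrammatical source sentence to its corrected form.
source = "She go to school every days."
result = corrector("grammar: " + source, max_length=64)

print(result[0]["generated_text"])
# Expected output (roughly): "She goes to school every day."
```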
We will highlight the pros and cons of each model and evaluate its performance against a gold-standard test set. By the end of the workshop, you will be well equipped to use transformer-based models not only for seq2seq tasks but also for sequence classification.
Background Knowledge
Machine Learning, Natural Language Processing, Python
Bio: Eram is a Lead Data Scientist at Tokopedia, an Indonesian e-commerce giant that accounts for 1% of Indonesia’s GDP. With over 7 years of experience in Machine Learning, specializing in Natural Language Processing, her work has focused on developing real-world AI at scale. Her passion for NLP has led her to become a content creator and a mentor for junior data scientists. Her motto is to give back to the NLP community by instilling self-motivated learning.