Abstract: Much data is sequential – think speech, text, DNA, stock prices, financial transactions, and customer action histories. Modern methods for modelling sequence data are often deep learning-based, built on either recurrent neural networks (RNNs) or the attention-based Transformer. Tremendous research progress has recently been made in sequence modelling, particularly in application to NLP problems. However, the inner workings of these sequence models can be difficult to dissect and intuitively understand.
This presentation/tutorial will start from the basics and gradually build on concepts to impart an understanding of the inner mechanics of sequence models – why do we need specialised architectures for sequences at all, when standard feed-forward networks are available? How do RNNs actually handle sequential information, and why do LSTM units help retain information over longer time spans? How can Transformers model sequences so well without any recurrence or convolutions? And whatever happened to Markov chains?
Specific use cases for sequence modelling will be discussed – including sentiment analysis (prediction of the emotional valence of a piece of text), machine translation (automatic translation between different languages), and the functional modelling of DNA and protein sequences (predicting which regions of biological sequences are functional and what that function could be).
The goals of this presentation are to provide an overview of popular sequence-based problems, impart an intuition for how the most commonly used sequence models work under the hood, and show that quite similar architectures are used to solve sequence-based problems across many domains.
Bio: Dr. Natasha Latysheva is a machine learning engineer in the NLP group at Welocalize, a leading language services provider. Her work focuses on machine translation and natural language processing. She previously worked as a data scientist in a video game studio, and before that completed a PhD in computational biology at Cambridge.