Abstract: Modern Natural Language Processing (NLP) is not about creating big datasets and training models from scratch. The most innovative applications of NLP these days leverage large, pre-trained foundation models along with fine-tuning, prompt engineering, and/or human feedback.
Leaving this workshop, you will understand each of these topics, and you will have gained the practical, hands-on expertise to start integrating modern NLP in your domain. Participants will fine-tune and prompt-engineer state-of-the-art models like BART and XLM-RoBERTa, and they will peer behind the curtain of world-shaking technologies like ChatGPT to understand their utility and architectures.
Lesson 1: Introduction to Modern NLP, Pre-training
Familiarize yourself with the workflows and terminology of modern NLP, set against the backdrop of traditional approaches to processing natural language. At the end of this lesson, you will be able to confidently explain the recent explosion of NLP applications and the main technological advances in large language models (LLMs) that are driving them.
Lesson 2: Fine-tuning
Understand how fine-tuning an NLP model differs from training from scratch, and practice fine-tuning a question-answering model. At the end of this lesson, you will be able to apply a fine-tuning workflow to customize an LLM to fit your domain or task.
Lesson 3: Prompt engineering
Learn about the workflows of prompt engineering and zero-shot prediction, and practice prompt engineering to get the best generated text for your task. At the end of this lesson, you will be able to construct prompts that guide an LLM to generate useful output.
Lesson 4: Human feedback
Jump into the deep end to learn about the advances behind the wildly popular ChatGPT. At the end of this lesson, you will be able to clearly explain the methods used to create ChatGPT (and similar models), and you will have an intuition for how human feedback and human labeling might be integrated into your own modern NLP workflows.
Prerequisite knowledge: a foundational understanding of common supervised machine learning workflows (pre-processing, training, evaluation, inference);
Tools and languages: basic usage of Python and Jupyter notebooks;
Equipment: a laptop with Internet access and a common browser.
Bio: Daniel Whitenack (aka Data Dan) is a Ph.D.-trained data scientist working with SIL International on NLP and speech technology for local languages in emerging markets. He has more than ten years of experience developing and deploying machine learning systems at scale. Daniel co-hosts the Practical AI podcast, has spoken at conferences around the world (Applied Machine Learning Days, O’Reilly AI, QCon AI, GopherCon, KubeCon, and more), and occasionally teaches data science/analytics at Purdue University.