Abstract: Large language models have been touted as ushering in a new technological revolution, thanks to the impressive capabilities demonstrated by the likes of GPT-4. However, the journey from prototype to production remains daunting: several failure modes plague today's state-of-the-art LLMs, hindering seamless integration into the existing tech ecosystem. In this workshop, we will explore the landscape of open-source LLMs and provide a playbook on how to use them effectively to build production-ready applications. We will show how to choose the LLM that best fits your task from the plethora of available options, and demonstrate several fine-tuning techniques that let you adapt an LLM to your domain of interest. We will also discuss techniques for dealing with reasoning limitations, hallucinations, and bias and fairness issues, which are critical to ensuring your applications are helpful and harmless when deployed in the real world. The workshop is hands-on and composed of four modules, targeting data scientists, machine learning experts, software engineers, and product managers. Prior exposure to LLMs is not necessary. By the end of the workshop, participants will have gained insights into how to make the most of open-source LLMs and learned the steps needed to bridge the gap between prototype and production-ready applications.
Module 1: How to choose the best LLM for your task
In this module, we will explore the open-source LLM landscape, outline the relevant criteria for choosing an LLM for a given task, and showcase ways to evaluate LLMs. We will also provide practical tools to explore the pre-training datasets used to train these LLMs, as well as their tokenization and vocabulary, which have a significant impact on downstream performance.
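To make the tokenization point concrete, here is a minimal sketch of one evaluation signal you can compute yourself: "fertility", the average number of tokens a tokenizer produces per word. The two toy tokenizers below are illustrative stand-ins, not the module's actual tooling; in practice you would compare real subword vocabularies on text from your domain.

```python
# Illustrative sketch: tokenizer "fertility" (average tokens per word) is a
# simple proxy for how efficiently a tokenizer represents a domain's text.
# Lower fertility generally means shorter sequences and cheaper inference.

def whitespace_tokenize(text):
    """Toy word-level tokenizer (stand-in for a real subword tokenizer)."""
    return text.split()

def char_tokenize(text):
    """Toy character-level tokenizer (ignores spaces)."""
    return [c for c in text if not c.isspace()]

def fertility(tokenize, corpus):
    """Average number of tokens produced per whitespace-delimited word."""
    n_tokens = sum(len(tokenize(line)) for line in corpus)
    n_words = sum(len(line.split()) for line in corpus)
    return n_tokens / n_words

corpus = ["large language models need careful evaluation"]
print(fertility(whitespace_tokenize, corpus))  # 1.0
print(fertility(char_tokenize, corpus))        # many tokens per word
```

The same measurement, run with a real tokenizer over a sample of your domain's text, quickly reveals when a model's vocabulary is a poor fit (e.g., for code, non-English languages, or specialized jargon).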
Module 2: Effective ways to fine-tune LLMs
We will explore ways to effectively adapt LLMs to your domain of interest. We will demonstrate the power of continued pre-training, and showcase different fine-tuning techniques including parameter-efficient fine-tuning techniques like LoRA.
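To illustrate why LoRA is parameter-efficient, the NumPy sketch below shows the core idea: keep the pretrained weight matrix frozen and train only two small low-rank factors. The dimensions and scaling are illustrative assumptions, not tied to any particular model.

```python
import numpy as np

# Minimal LoRA sketch (assumed shapes, not a real model): instead of
# updating a frozen weight W (d_out x d_in), train two low-rank factors
# B (d_out x r) and A (r x d_in); the adapted forward pass is
# h = (W + (alpha / r) * B @ A) @ x.

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 512, 512, 8, 16

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable low-rank factor
B = np.zeros((d_out, r))                 # zero init: adapter starts as a no-op

x = rng.normal(size=(d_in,))
h = (W + (alpha / r) * B @ A) @ x        # equals W @ x at initialization

full_params = d_out * d_in               # parameters in a full update
lora_params = d_out * r + r * d_in       # parameters LoRA actually trains
print(lora_params / full_params)         # ~3% of the full update
```

Because only A and B receive gradients, the optimizer state and checkpoints shrink accordingly; libraries such as Hugging Face PEFT wrap this same idea around real transformer layers.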
Module 3: Combating failure modes of LLMs
While hallucination, bias and fairness issues, and reasoning limitations cannot be ‘solved’ yet, we can work towards minimizing them. This module will provide tips and tricks to address these limitations.
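As one example of the kind of mitigation covered here, self-consistency reduces reasoning errors by sampling several answers and keeping the majority vote. The sketch below stubs out the model calls with a hypothetical list of sampled answers; in practice each entry would come from a separate sampled generation.

```python
from collections import Counter

# Self-consistency sketch: sample multiple completions for the same prompt
# and return the most frequent final answer. `sampled_answers` is a
# hypothetical stand-in for five sampled LLM outputs.

def majority_vote(answers):
    """Return the most frequent answer among sampled completions."""
    counts = Counter(answers)
    return counts.most_common(1)[0][0]

sampled_answers = ["42", "41", "42", "42", "40"]
print(majority_vote(sampled_answers))  # "42"
```

The same pattern extends to other checks, such as voting among differently-phrased prompts, at the cost of extra inference calls.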
Module 4: Design Considerations for LLM Applications
LLM applications, given these existing limitations, need to be carefully designed to shield the user from harmful, unpredictable, and suboptimal behavior. This module will provide design tips, including how to combine multiple LLMs in a workflow: dividing a task into subtasks and distributing them among different LLMs, with each LLM handling the subtask it is best suited to solve.
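The divide-and-distribute pattern can be sketched as follows. Each "LLM" here is a stub function standing in for a hypothetical model call; the function names, routing logic, and outputs are illustrative assumptions, not a prescribed architecture.

```python
# Hedged sketch of decomposing a task across multiple LLMs. In practice,
# each stub below would be a call to a different model chosen for the
# subtask it handles best (e.g., a summarization-tuned model and a
# small, cheap classifier).

def summarizer_llm(text: str) -> str:
    # stand-in for a model fine-tuned for summarization
    return text.split(".")[0] + "."

def classifier_llm(text: str) -> str:
    # stand-in for a smaller model used for sentiment classification
    return "positive" if "great" in text.lower() else "neutral"

def pipeline(document: str) -> dict:
    """Decompose the task: summarize first, then classify the summary."""
    summary = summarizer_llm(document)
    sentiment = classifier_llm(summary)
    return {"summary": summary, "sentiment": sentiment}

result = pipeline("The product launch went great. Sales doubled in a week.")
print(result)
```

Beyond cost, a benefit of this design is isolation: a failure in one stage (a hallucinated summary, an off-topic classification) can be validated or retried independently before it reaches the user.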
Learn how to choose the right LLM for your task of interest
Learn how to adapt open-source LLMs to your target domain, data, and task, and fine-tune them.
Learn how to minimize issues stemming from hallucinations and other LLM limitations.
Learn how to effectively design LLM applications and build a pipeline encompassing multiple LLMs that are used collaboratively to solve a task.
Open-source tools/libraries used: Hugging Face, PyTorch.
Prerequisites: Python programming skills, basic NLP knowledge
Bio: Suhas Pai is an NLP researcher and co-founder/CTO at Bedrock AI, a Toronto-based startup. At Bedrock AI, he works on text ranking, representation learning, and productionizing LLMs. He is also currently writing a book, Designing Large Language Model Applications, with O'Reilly Media. Suhas has been active in the ML community, serving as Chair of the Toronto Machine Learning Summit (TMLS) conference since 2021 and as NLP lead at Aggregate Intellect (AISC). He was also co-lead of the Privacy working group at BigScience, as part of the BLOOM project.