LLM Best Practices: Training, Fine-Tuning and Cutting-Edge Tricks from Research

Abstract: 

Large Language Models (LLMs) are still relatively new compared to "traditional ML" techniques, and they come with many new ideas and best practices that differ from those for training conventional ML models. Fine-tuning models can unlock use cases specific to your domain, and AI agents can unlock previously impossible ideas.

In this workshop, you will learn tips and tricks for creating and fine-tuning LLMs, and you will implement cutting-edge ideas for building these systems drawn from the best research papers.

We will start by learning the foundations of what makes an LLM, quickly move into fine-tuning our own GPT, and finally implement some of the cutting-edge tricks for building these models.

There is a lot of noise and signal in this domain right now, so we will focus on understanding the ideas that have been tried and tested.

The workshop will also cover case studies of ideas that have worked in practice, and we will dive deep into the art and science of working with LLMs.

Session Outline:

Module 0: Introduction and History

In this module, you will grasp the history of LLMs and the current state of the art (SOTA). We will quickly dive into our first hands-on exercise, comparing LLMs to (small) language models to truly understand their power.

- History of LLMs
- What makes a Large Language Model
- Understanding the current SOTA
- Hands-on: LLMs vs. LMs for different use cases

Module 1: Training LLMs and Best Practices

Having learned how these models work, we will now understand their promise and how to make them.
There are many paths to building effectively with LLMs; we will learn the best tricks of both prompt engineering and fine-tuning.
We will then apply this knowledge to fine-tune our first GPT. This is the most powerful aspect of the landscape: for under $30, we will learn how to create an LLM tailored to your use case.

- Challenges of training from scratch
- Current Open Source Landscape
- How to evaluate LLMs
- Prompting vs. Fine-Tuning
- Best practices from practice
- Training your own GPT

Module 2: Cutting-edge tricks to improve your LLM

Having created our own custom LLM, we will now learn the top tricks from research and how to use them to further improve our model. Over the past few months, we have been observing breakthrough models and ideas every day. In this module, you will understand the most useful of these.

- Top tricks from LLM research
- Implementing these ideas in practice
- Understanding agents and the AI agent landscape
- Creating LLM Apps and AI Agent Apps

This session is primarily aimed at data scientists, ML practitioners, and enthusiasts from various industries who are eager to harness the power of LLMs in their respective domains. Whether you hail from healthcare, finance, e-commerce, or any other sector, you will walk away with actionable insights and techniques that can be immediately applied to real-world challenges.

Learning Outcomes:

- Grasp the foundational concepts and distinctions between LLMs and Traditional ML.
- Gain hands-on experience in fine-tuning GPT-based models for specific use cases.
- Understand and implement proven strategies from leading research papers in the domain of LLMs.
- Discern between well-established techniques and experimental ideas, aiding in informed decision-making during model deployment.
- Analyze real-life case studies to understand the potential pitfalls and success stories in the practical application of LLMs.

By the end of this session, attendees will have a solid foundation in the realm of Large Language Models, coupled with a clear roadmap on how to implement, fine-tune, and optimize these models for their unique industry needs.

Background Knowledge:

An understanding of deep learning models, concepts, and implementations, plus a working knowledge of Python. Ideally, you should have trained ML models and built systems around them.

In my opinion, passion trumps everything, so if you are passionate, please ignore the above and join anyway!

Bio: 

Sanyam Bhutani is a Sr. Data Scientist and Kaggle Grandmaster at H2O, where he drinks chai and makes content for the community. When not drinking chai, he is to be found hiking the Himalayas, often with LLM research papers. For the past 6 months, he has been writing about Generative AI every day on the internet. Before that, he was recognised for his #1 Kaggle Podcast, Chai Time Data Science, and he is also widely known on the internet for “maximising compute per cubic inch of an ATX case” by fixing 12 GPUs into his home office.

Open Data Science

 

 

 

