AI Development Lifecycle: Learnings of What Changed with LLMs

Abstract: 

When we compare building LLM-based models and pipelines with more traditional machine/deep learning, two observations stand out:

- Building proofs of concept has become incredibly easy
- Evaluation is much more challenging

As a result, the evaluation step is often neglected, leading to pointless iterations and a lack of knowledge about the product's true performance. This is one of the main obstacles to moving into production, especially when the results must be highly accurate.

In this talk, we will explore the lessons learned from building products that are typical use cases of these technologies, such as financial document analysis automation and a RAG (retrieval-augmented generation) tool for a medical company.

We will focus mostly on the essential steps of the development workflow that are often overlooked: dataset collection, evaluation, and monitoring. New tools for monitoring and for manual or automatic evaluation (LLM as a judge) have been released to ease the implementation of these best practices in the context of LLMs and to help product experts assist the technical team in building these products.

Session outcomes/learning:

- The importance of a good data methodology to succeed in building a performant model/pipeline based on LLMs (e.g. RAG)
- The methods and tools to carefully evaluate and monitor an LLM

Open source tools used:
- Langfuse, promptfoo

Bio: 

Noé is an Engineering Manager (for Data Science projects) at Sicara, where he has worked on a wide range of projects, mostly related to vector databases, computer vision, prediction with structured data and, more recently, LLMs. He is currently leading GenAI development at the company. You can find all his talks and articles here: https://www.sicara.fr/en/noe-achache

Open Data Science

 

 

 

