Sebastian Gehrmann, PhD

Head of NLP, Office of the CTO at Bloomberg

    Sebastian Gehrmann is the Head of NLP in the Office of the CTO at Bloomberg, where he contributes to and guides the strategy for the development of language technology across the company. His research interests range from natural language generation to model evaluation. He has worked on large language models including BloombergGPT, BLOOM, PaLM, and PaLM 2. Before joining Bloomberg, Sebastian was a senior researcher at Google. He holds a Ph.D. in computer science from Harvard University.

    All Sessions by Sebastian Gehrmann, PhD

    Day 3 04/25/2024
    12:10 pm - 12:40 pm

    Model Evaluation in LLM-enhanced Products

    LLMs

    Evaluation in machine learning (ML) product development is a rich topic with a long history. However, large language models (LLMs) represent a significant deviation from the known path and introduce many unknowns. Since the same LLM can be flexibly applied in a wide range of contexts, both with and without additional tuning, its evaluation must reflect this increased scope. Moreover, since LLMs output natural language instead of discrete classes, we must shift our evaluation focus from classic metrics like accuracy and F1 scores to complex concepts like usefulness, attribution, factuality, and safety. Given this new paradigm, how can we build on long-standing best practices of evaluation, learn from academic research, and build solid evaluation pipelines for LLMs? Furthermore, we must consider the important role that humans play in model evaluations and determine what can be automated, and whether it should be. In this talk, I will discuss these questions alongside common pitfalls, opportunities, and best practices related to including large language models as an additional ingredient in product development.

    Open Data Science
    One Broadway
    Cambridge, MA 02142
