Carolyn Rosé, PhD

Professor and Program Director of the Master of Computational Data Science program at Carnegie Mellon University

    Her research focuses on better understanding the social and pragmatic nature of conversation, and on using this understanding to build computational systems that improve the efficacy of conversation both between people and between people and computers. To pursue these goals, she draws on approaches from computational discourse analysis and text mining, conversational agents, and computer-supported collaborative learning. Carolyn grounds her research in the fields of language technologies and human-computer interaction, works closely with students and post-docs from the LTI and the Human-Computer Interaction Institute, and directs her own lab, TELEDIA.

    All Sessions by Carolyn Rosé, PhD

    Day 2 04/24/2024
    9:30 am - 9:55 am

    ODSC Keynote: Setting Up Text Processing Models for Success: Formal Representations versus Large Language Models

    LLMs

    With increasingly vast storehouses of textual data readily available, the field of Natural Language Processing offers the potential to extract, organize, and repackage knowledge revealed either directly or indirectly. For decades, one of the field's holy grails has been accomplishing these tasks through machine learning with minimal human knowledge engineering. Yet each new wave of machine learning research revives the same tension: investment in knowledge engineering and integration know-how on the one hand versus production of knowledge and insight on the other. This talk explores techniques for injecting insight into data representations to improve model performance, especially in cross-domain settings. Recent work in neural-symbolic approaches to NLP is one such technique: some studies report advances from incorporating formal representations of language and knowledge, while others reveal the difficulty of identifying high-utility abstractions and strategic exceptions, which frequently require exogenous data sources and a careful interplay between those formal representations and the bottom-up generalities apparent in endogenous sources. More recently, Large Language Models (LLMs) have been used to produce textual augmentations to data representations, with greater success. Couched within these tensions, the talk reports on recent work toward increased availability of both formal and informal representations of language and knowledge, along with explorations of how to use this knowledge effectively.
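
    To make the augmentation idea concrete, here is a minimal, hypothetical sketch, not drawn from the talk itself: an LLM elaborates each input text, and the elaboration is concatenated with the original text before a conventional bag-of-words classifier is trained. The elaborate_with_llm function is a placeholder standing in for any LLM call, and the scikit-learn pipeline is simply one convenient downstream model; the example texts and labels are invented for illustration.

```python
# Minimal sketch of LLM-produced textual augmentation of a data
# representation: the downstream model sees both the raw text and
# the injected elaboration in a single input string.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def elaborate_with_llm(text: str) -> str:
    """Hypothetical stand-in for an LLM call that returns a textual
    elaboration, e.g. a paraphrase or a gloss of domain terms."""
    return f"(elaboration of: {text})"  # placeholder output

def augment(texts):
    # Concatenate each text with its LLM elaboration so the
    # vectorizer captures both endogenous and injected signal.
    return [t + " " + elaborate_with_llm(t) for t in texts]

train_texts = ["the loan was approved", "the claim was denied"]
train_labels = [1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(augment(train_texts), train_labels)
print(model.predict(augment(["the application was approved"])))
```

    In practice the elaboration step is where domain knowledge enters the representation; everything downstream is an ordinary supervised pipeline.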

