Abstract: Generative AI (GAI) systems such as ChatGPT have revolutionised the way we interact with AI systems. These models can provide precise and detailed answers to our information needs, expressed in the form of brief text-based prompts. However, some of the responses generated by GAI systems can contain harmful social biases, such as gender or racial biases. Detecting and mitigating such biased responses is an important step towards establishing user trust in GAI. In this talk, I will describe the latest developments in methodologies for detecting social biases in texts generated by GAI systems. First, I will describe methods that can detect social biases expressed not only in English but also in other languages, with minimal human intervention. This is particularly important when scaling social bias evaluation to many languages. Second, I will describe methods for mitigating the identified social biases in large-scale language models. Experiments show that although some social biases can be identified and mitigated with high accuracy, the existing techniques are not perfect, and indirect associations remain in generative NLP models. Finally, I will describe ongoing work in the NLP community to address these shortcomings and to develop AI systems that are not only accurate but also trustworthy.
Bio: Danushka Bollegala is a Professor in the Department of Computer Science at the University of Liverpool, UK. He obtained his PhD from the University of Tokyo in 2009 and worked there as an Assistant Professor before moving to the UK. He has worked on various problems related to Natural Language Processing and Machine Learning. He has received numerous awards for his research, including the IEEE Young Author Award and best paper awards at GECCO and PRICAI. His research has been supported by various research council and industrial grants, including funding from the EU, DSTL, Innovate UK, JSPS, Google, and MSRA. He is an Amazon Scholar.