Abstract: Natural language processing (NLP) applications such as chatbots, machine translation systems, text summarization systems, and information extraction systems have seen significant performance boosts over the last decade, thanks to accurate methods for representing text such as large-scale language models (e.g. BERT, GPT-3, RoBERTa). However, social biases, such as gender, racial and ethnic biases, have also been identified in text representations produced by these large-scale masked language models. Using such biased language models in real-world NLP systems, which millions of users interact with on a daily basis, is problematic because the social biases encoded in the text representations propagate into those systems and lead to unfair, discriminatory decisions and responses. In this talk, I will first describe methods developed in the NLP community to detect the types and levels of social biases learnt by large-scale language models. Next, I will present techniques that can be used to mitigate such biases.
Bio: Danushka Bollegala is a Professor in the Department of Computer Science at the University of Liverpool, UK. He obtained his PhD from the University of Tokyo in 2009 and worked there as an Assistant Professor before moving to the UK. He has worked on various problems related to Natural Language Processing and Machine Learning. He has received numerous awards for his research excellence, including the IEEE Young Author Award and best paper awards at GECCO and PRICAI. His research has been supported by various research council and industrial grants, including funding from the EU, DSTL, Innovate UK, JSPS, Google and MSRA. He is an Amazon Scholar.