Abstract: Generative large language models (LLMs) like GPT-4 have revolutionized the entire tech ecosystem. But what makes them so powerful? What are the secret components that make them generalize to a variety of tasks? In this talk, I will present how these foundation models are trained, and what steps and core components lie behind them. I will also cover how smaller, domain-specific models can outperform general-purpose foundation models like ChatGPT on target use cases.
Bio: Best known for developing state-of-the-art AI products, including the world’s first fully autonomous Conversational AI technology, the Alexa Prize (a ChatGPT-like voice experience for Alexa users, five years before ChatGPT), and Truth Checker AI, the first and currently the only model to detect hallucinations generated by language models such as GPT-4.