Abstract: Deep neural networks can make overconfident errors and assign high-confidence predictions to inputs far away from the training data. Well-calibrated predictive uncertainty estimates are important for knowing when to trust a model's predictions, especially for safe deployment of models in applications where the train and test distributions can differ. I'll first present some concrete examples that motivate the need for uncertainty and out-of-distribution (OOD) robustness in deep learning. Next, I'll present an overview of our recent work focused on building neural networks that know what they don’t know: this includes methods that improve single-model uncertainty (e.g. spectral-normalized neural Gaussian processes), methods that average over multiple neural network predictions, such as Bayesian neural nets and deep ensembles, and methods that leverage better representations (e.g. pre-trained transformers for improving “near-OOD” detection).
The session will help attendees build intuition for the problem and learn some simple techniques to improve performance in practice (e.g. demo colabs).
- Understand how to measure the quality of uncertainty and robustness.
- Understand how to improve uncertainty/robustness in a single model.
- Understand how to combine multiple models (ensembles, Bayesian NNs) to further improve uncertainty/robustness.
- Understand how to leverage recent advances in representation learning (pre-training, transformers, etc.).
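To make the first and third objectives concrete, here is a minimal sketch in plain Python. It is not taken from the talk or its colabs: the function names `expected_calibration_error` and `ensemble_average` are hypothetical, and equal-width confidence binning is only one common way to estimate calibration error. The first function measures how far a model's stated confidence is from its observed accuracy; the second averages the class probabilities predicted by several models for one input.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Estimate calibration error with equal-width confidence bins.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: 1 if the corresponding prediction was right, else 0.
    Returns the bin-size-weighted average of |accuracy - mean confidence|.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def ensemble_average(member_probs):
    """Average class probabilities across ensemble members for one input.

    member_probs: list of per-model probability vectors of equal length.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[k] for p in member_probs) / n_members
            for k in range(n_classes)]
```

For example, a model that reports 0.9 confidence on four inputs but gets only three of them right has a calibration gap of about 0.15 under this estimate, and averaging a confident member with an uncertain one yields a softer combined prediction.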
Some representative talks:
Practical recipes for improving uncertainty/robustness and building models that "know what they don't know": http://www.gatsby.ucl.ac.uk/~balaji/balaji-siam-uq-2022-invited-talk.pdf
Tutorial on uncertainty in deep learning at the CIFAR summer school:
Bio: Balaji is currently a Staff Research Scientist at Google Brain working on machine learning and its applications. Previously, he was a research scientist at DeepMind for 4.5+ years. Before that, he received a PhD in machine learning from the Gatsby Unit, UCL, supervised by Yee Whye Teh.
His research interests are in scalable, probabilistic machine learning. More recently, he has focused on:
- Uncertainty and out-of-distribution robustness in deep learning
- Deep generative models including generative adversarial networks (GANs), normalizing flows and variational auto-encoders (VAEs)
- Applying probabilistic deep learning ideas to solve challenging real-world problems.
Balaji Lakshminarayanan, PhD
Staff Research Scientist | Google Brain