Abstract: An important practical challenge is to develop theoretically principled tools that can guide the use of production-scale deep neural networks. We will describe recent work that uses spectral methods from scientific computing and statistical mechanics to develop such tools. Among other things, these tools can be used to develop metrics that characterize the quality of models without examining training or test data, and to predict trends in generalization (not just bounds on generalization) for state-of-the-art production-scale models. Related tools can be used to exploit adversarial data to characterize and modify the curvature properties of the penalty landscape, and to perform tasks such as model quantization in a more automated way. We will cover the basic ideas underlying these methods and illustrate their use for analyzing production-scale deep neural networks in computer vision, natural language processing, and related areas. We will also walk participants through how to use these tools, as implemented in the publicly available "weightwatcher" Python package.
Bio: Michael Mahoney is at ICSI and the Department of Statistics at UC Berkeley. He works on algorithmic and statistical aspects of modern large-scale data analysis. He is a leader in Randomized Numerical Linear Algebra; he led the largest empirical evaluation of community structure in social and information networks; he has developed implicit regularization methods and scalable optimization methods for convex and non-convex problems; and he has applied these methods, together with complementary random matrix theory (RMT) methods, to deep neural network (DNN) problems.