Abstract: There is an increasing need to bring machine learning to a diverse set of hardware devices. Current approaches typically rely on vendor-specific operator libraries and frameworks, and require significant engineering effort. In this talk we will present an overview of the Apache TVM open source stack, which exposes graph- and operator-level optimizations to provide performance portability for machine learning workloads across diverse hardware back-ends. TVM solves compiler optimization challenges by employing a learning-based approach for rapid exploration of optimizations, saving months of engineering time and offering state-of-the-art performance in both edge and server use cases. We will discuss how TVM offers broad model coverage, and makes effective use of hardware resources. We will end the talk with a sneak peek at OctoML's Octomizer, a SaaS platform for continuous model optimization, benchmarking, and packaging.
Bio: Luis Ceze is Co-founder and CEO of OctoML and a Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington. His research focuses on the intersection of computer architecture, programming languages, machine learning, and biology. His current work centers on approximate computing for efficient machine learning and on DNA-based data storage. He co-directs the Molecular Information Systems Lab (misl.bio) and the Systems and Architectures for Machine Learning lab (sampl.ai). He has co-authored over 100 papers in these areas and has had several papers selected as IEEE Micro Top Picks and CACM Research Highlights. His research has been featured prominently in media outlets including The New York Times, Popular Science, MIT Technology Review, and The Wall Street Journal. He is a recipient of an NSF CAREER Award, a Sloan Research Fellowship, a Microsoft Research Faculty Fellowship, the 2013 IEEE TCCA Young Computer Architect Award, the 2020 ACM SIGARCH Maurice Wilkes Award, and a UIUC Distinguished Alumni Award.