Abstract: When you have thousands of model versions, each written in any mix of frameworks (R, Java, Ruby, scikit-learn, Caffe, TensorFlow on GPUs, etc.), how do you efficiently deploy them as elastic, scalable, secure APIs with ~10 ms latency?
ML has been advancing rapidly, but only a few contributors are focusing on the infrastructure and scaling challenges that come with it. We've built, deployed, and scaled thousands of algorithms and machine learning models across every major framework. Along the way we've encountered most of the challenges this space presents, and in this talk I'll share insights into the problems you're likely to face and how to approach solving them.
In brief, we'll examine the need for, and implementations of, a complete "Operating System for AI": a common interface that lets different algorithms be used and combined, and a general architecture for serverless machine learning that is discoverable, versioned, scalable, and shareable.
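As a taste of what that common interface looks like in practice, here is a minimal sketch of calling a versioned, hosted model through the Algorithmia Python client; the API key and the algorithm path and version shown are placeholders, not a real endpoint:

    import Algorithmia

    # Authenticate with an API key (placeholder value)
    client = Algorithmia.client("YOUR_API_KEY")

    # Reference an algorithm by owner/name/version; pinning the version
    # keeps the call stable even as the author publishes new releases
    algo = client.algo("demo/SentimentAnalysis/1.0.0")

    # Call it like a function; the platform handles containers,
    # scaling, and GPU scheduling behind this one interface
    response = algo.pipe({"document": "Serverless ML is surprisingly fun."})
    print(response.result)

The same call shape works regardless of whether the underlying model was written in R, Java, Ruby, or a Python deep learning framework, which is the point of a common interface.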
Bio: Jon Peck is a full-stack developer with two decades of industry experience who now focuses on bringing scalable, discoverable, and secure machine-learning microservices to developers across a wide variety of platforms via Algorithmia.com.