
Abstract: In this talk, I will demonstrate how to train, optimize, and serve distributed machine learning models across various environments including the following:
1) Local Laptop
2) Kubernetes Cluster (Running Anywhere)
3) AWS's New SageMaker Service (Announced Last Week @ re:Invent)
I'll also present some post-training model-optimization techniques to improve model serving performance for TensorFlow running on GPUs. These techniques include 16-bit model training, neural network layer fusing, and 8-bit weight quantization.
Lastly, I'll discuss alternate runtimes for TensorFlow on GPUs, including TensorFlow Lite and Nvidia's TensorRT.
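To give a flavor of one of the optimization techniques above, here is a minimal sketch of symmetric 8-bit weight quantization in plain Python. This is an illustrative toy, not the TensorFlow/TensorRT implementation covered in the talk; the function names and the symmetric [-127, 127] scheme are assumptions for the example.

```python
# Toy symmetric 8-bit weight quantization (illustrative only; real
# frameworks such as TensorFlow and TensorRT use calibrated, per-layer
# or per-channel schemes).

def quantize_weights(weights):
    """Map float weights onto int8 values in [-127, 127] plus a scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]

weights = [0.5, -1.2, 0.03, 1.2]
q, scale = quantize_weights(weights)
approx = dequantize(q, scale)
```

Each weight is stored in one byte instead of four, which shrinks the model and speeds up memory-bound inference, at the cost of a small per-weight rounding error bounded by half the scale.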
Bio: Chris Fregly is Founder and Research Engineer at PipelineAI, a Streaming Machine Learning and Artificial Intelligence Startup based in San Francisco. He is also an Apache Spark Contributor, a Netflix Open Source Committer, founder of the Global Advanced Spark and TensorFlow Meetup, and author of the O'Reilly Training and Video Series titled "High Performance TensorFlow in Production."
Previously, Chris was a Distributed Systems Engineer at Netflix, a Data Solutions Engineer at Databricks, and a Founding Member and Principal Engineer at the IBM Spark Technology Center in San Francisco.

Chris Fregly
Founder and Research Scientist at PipelineAI, Apache Spark Contributor, Author of the upcoming book, Advanced Spark
