Abstract: Deploying advanced Machine Learning technology to serve customers and/or business needs requires a rigorous approach and production-ready systems. This is especially true for maintaining and improving model performance over the lifetime of a production application. Unfortunately, the issues involved and approaches available are often poorly understood.
Large models make rigorous engineering and scalable architectures even more important. The sheer size of the models, and of the datasets used to train them, requires highly efficient infrastructure. More complex pipeline topologies, which include transfer learning, fine-tuning, instruction tuning, and evaluation along a collection of complex dimensions, require a high degree of flexibility for customization.
Rigorous analysis of model performance at a deep level, including edge and corner cases, is a key requirement of mission-critical applications. Measuring and understanding model sensitivity is also part of any rigorous model development process.
We discuss the use of ML pipeline architectures for implementing production ML applications. In particular, we review Google’s experience with TFX (TensorFlow Extended) and the available tooling for rigorous analysis of model performance and sensitivity. Google uses TFX for large-scale ML applications and offers an open-source version to the ML community, which is actively extending TFX with new features and components.
Bio: A data scientist and ML enthusiast, Robert has a passion for helping developers quickly learn what they need to be productive. Robert is currently the Senior Product Manager for TensorFlow Open-Source and MLOps at Google and helps ML teams meet the challenges of creating products and services with ML. Previously, Robert led software engineering teams at both large and small companies, always focusing on moving fast to implement clean, elegant solutions to well-defined needs. You can find him on LinkedIn at robert-crowe.