Abstract: The Einstein Prediction Builder modeling pipeline automates all steps of the end-to-end modeling process, from data auditing and feature engineering to model selection, for thousands of models across many different use cases. Although powerful, automated machine learning pipelines are inherently black boxes that can be notoriously difficult to troubleshoot. This talk takes a data science perspective on tracking and analyzing model metrics to address three major questions:
1. How do we determine how to best spend our time when developing the modeling pipeline?
2. How do we evaluate experiments for different modeling solutions?
3. How do we attain scale?
We will dig into specific examples of common issues that arise when modeling in production at scale, such as label leakage and score drift. We will discuss the metrics we track to uncover these modeling issues and how the metrics framework helps us develop new features in a data-driven manner. Any change we make to the modeling pipeline affects thousands of models, so great care is needed to ensure that a change does not cause regressions in some of them. Lastly, we will discuss how we use monitoring and alerting, well-established practices in traditional engineering, to develop our modeling pipeline and run it at scale. Dashboards give us a global, real-time view of our models and help us identify areas for improvement, while alerts draw our attention to the most pressing modeling issues and make it feasible to run a modeling pipeline at scale.
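As a rough illustration of the kind of score-drift alert described above, the sketch below compares a baseline sample of model scores with a recent sample using the population stability index (PSI). This is not taken from the talk: the choice of PSI, the function names `population_stability_index` and `drift_alerts`, the equal-width binning, and the 0.2 threshold (a common rule of thumb) are all assumptions made for the example, and scores are assumed to lie in [0, 1].

```python
import math


def population_stability_index(baseline, current, n_bins=10, eps=1e-6):
    """PSI between two samples of scores in [0, 1], using equal-width bins.

    Hypothetical helper for illustration; not part of any Salesforce API.
    """
    def bin_fractions(scores):
        counts = [0] * n_bins
        for s in scores:
            # Clamp the bin index so scores of exactly 0.0 or 1.0 stay in range.
            idx = max(0, min(int(s * n_bins), n_bins - 1))
            counts[idx] += 1
        total = len(scores)
        # Floor each fraction at eps so the logarithm is always defined.
        return [max(c / total, eps) for c in counts]

    p = bin_fractions(baseline)
    q = bin_fractions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def drift_alerts(models, threshold=0.2):
    """Return the ids of models whose PSI exceeds the (assumed) threshold.

    `models` maps a model id to a (baseline_scores, current_scores) pair.
    """
    return sorted(
        model_id
        for model_id, (baseline, current) in models.items()
        if population_stability_index(baseline, current) > threshold
    )


if __name__ == "__main__":
    uniform = [i / 100 for i in range(100)]
    shifted = [0.95 + i / 10000 for i in range(100)]  # all scores near 1.0
    models = {"stable_model": (uniform, uniform),
              "drifting_model": (uniform, shifted)}
    print(drift_alerts(models))
```

Running a check like this per model and alerting only on the ids it returns is one way to keep attention on the few models that need it when thousands are in production.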
Bio: Eric Wayman is a Senior Data Scientist at Salesforce. As a member of the Einstein AI platform team, he works on developing the automated machine learning pipeline for the recently released Einstein Prediction Builder, which helps Salesforce administrators leverage their Salesforce data to make predictions for use cases tailored to their individual needs. Before joining Salesforce, Eric worked as a Data Science Consultant at Pivotal Software and conducted research in probability and stochastic processes at UC Berkeley, where he received his Ph.D. in Mathematics.
Eric Wayman, PhD
Senior Data Scientist | Salesforce
ai-for-engineers-europe19 | intermediate-europe19 | machine-learning-europe19 | europe-2019-talks