Abstract: Many details surrounding a typical ML pipeline are commonly swept under the rug. How will we monitor production data for concept drift? How do we measure the false negative rate in production? How confident can we be in our performance assessments when the test set is small, and how should those assessments be adjusted when the data is biased? How can we ensure our model follows reasonable assumptions? We introduce a new general-purpose tool, the Model Validation Toolkit, for common tasks in model validation, interpretability, and monitoring. The toolkit has submodules and accompanying tutorials on measuring concept drift, assigning and updating optimal thresholds, determining the credibility of performance metrics, compensating for data bias, and performing sensitivity analysis. In this session, we will give a tour of the framework's core functionality and some associated use cases.
Bio: Matthew Gillett is an Associate Director at FINRA who manages a team of Software Development Engineers in Test (SDET) across multiple projects. In addition to his primary focus on software development and assurance engineering, he has an interest in other technology topics such as big data processing, machine learning, and blockchain.