Data-driven ML Retraining with Production Insights


It’s practically dogma today that a model's best day in production will be its first day in production. Over time model performance degrades, and there are many variables that can cause decay, from real-world behavior changes to data drift. When models misbehave, we often turn to retraining to fix the problem, but is the most recent data really the best data to resolve model performance issues and get the model back on track? We all acknowledge the need for data-driven machine learning monitoring that pinpoints anomalies and uncovers their root cause so we can resolve issues quickly, before they impact the business. When it comes to resolution through retraining, however, the choice of data and of retraining strategy is rarely data-driven. Today, when faced with retraining, many data teams simply select the last month or two of data to retrain on and hope that fresh really is best.

In this talk, we'll showcase, through ML monitoring and notebooks, how data scientists and ML engineers can leverage ML monitoring to find the best data and retraining strategy mix to resolve machine learning performance issues. This data-driven, production-first approach enables more thoughtful retraining selections, shorter and leaner retraining cycles, and can be integrated into MLOps CI/CD pipelines for continuous model retraining upon anomaly detection.

Session Outline:
What you will learn from this talk:
- Retraining groups and temporal similarity
- Drifted features and pre-processing
- Drifted segments and model split
- Pipeline anomaly exclusion
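To make the first and last items above more concrete, here is a minimal sketch of one way temporal similarity and anomaly exclusion could be combined when choosing retraining data. It is an illustration under stated assumptions, not the method presented in the talk: the function names, the monthly window labels, the bin edges, and the use of total variation distance between histograms are all hypothetical choices made for this example.

```python
from typing import Dict, Iterable, List, Sequence


def histogram(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Share of values in each bin; values outside the edges are ignored."""
    counts = [0] * (len(edges) - 1)
    last_bin = len(edges) - 2
    for v in values:
        for i in range(len(edges) - 1):
            # Bins are half-open, except the last one, which includes its upper edge.
            in_bin = edges[i] <= v < edges[i + 1]
            if in_bin or (i == last_bin and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts) or 1  # avoid dividing by zero for empty windows
    return [c / total for c in counts]


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two discrete distributions (0 = identical)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def select_retraining_windows(
    history: Dict[str, Sequence[float]],
    reference: Sequence[float],
    edges: Sequence[float],
    k: int = 2,
    excluded: Iterable[str] = (),
) -> List[str]:
    """Pick the k historical windows whose feature distribution is closest
    to the reference (recent production) window, skipping excluded windows."""
    skip = set(excluded)
    ref_hist = histogram(reference, edges)
    scored = [
        (total_variation(histogram(values, edges), ref_hist), name)
        for name, values in history.items()
        if name not in skip
    ]
    scored.sort()  # smallest distance first; ties broken by window name
    return [name for _, name in scored[:k]]


# Hypothetical data: one numeric feature, grouped by month.
history = {
    "2023-01": [1, 2, 2, 3],
    "2023-02": [7, 8, 8, 9],
    "2023-03": [6, 7, 8, 8],
    "2023-04": [7, 7, 8, 9],  # assume monitoring flagged a pipeline anomaly here
}
reference = [7, 8, 8, 9]      # assume this is the latest production window
edges = [0, 2.5, 5, 7.5, 10]  # assumed bin edges for the feature

selected = select_retraining_windows(
    history, reference, edges, k=2, excluded={"2023-04"}
)
# selected == ["2023-02", "2023-03"]
```

In this toy example the oldest window is rejected because its distribution is far from current production data, and the flagged window is skipped even though it looks similar. A real setup would likely compare many features at once and take its anomaly flags from the monitoring system rather than a hard-coded set.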

A public repo and notebook will be provided to attendees so they can apply production-first retraining in their own machine learning monitoring.


Oryan is a Lead Software Engineer with a passion for Machine Learning and DevOps, with 7 years of experience developing services for production and development environments and leading teams.

Open Data Science
One Broadway
Cambridge, MA 02142
