Model Registry with Open Source Tools: Git, GitHub and CI/CD

Abstract: 

A model registry is becoming an essential part of the machine learning technology stack. It helps a team keep track of its ML models, connect them to production environments and manage the model lifecycle. However, it often requires an additional SaaS model registry service to store information about ML models, and this additional service often causes the lifecycle of ML models to diverge from that of software applications. In this talk, we will show how ML engineers and data scientists can implement a model registry using open source technologies such as Git and GitHub/GitLab, and how they can manage ML models the same way as a software application, without any additional services.

We will show how:
Git can be used as the source of truth for models, model versions and model statuses
Model lifecycle can be managed through GitHub Pull Requests or GitLab Merge Requests
CI/CD systems can deliver ML models to production
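
As a rough illustration of the first point, the sketch below derives a registry view (models, versions, statuses) purely from Git tag names. The tag conventions `<model>@v<semver>` and `<model>#<stage>#v<semver>`, and the function name `build_registry`, are assumptions made up for this example, not something the talk prescribes. The input is a plain list of tag names, such as the output of `git tag --list`.

```python
import re
from collections import defaultdict

# Hypothetical tag conventions (assumptions for this sketch):
#   "<model>@v1.2.3"          registers a model version
#   "<model>#<stage>#v1.2.3"  assigns that version to a stage (status)
VERSION_TAG = re.compile(r"^(?P<model>[\w-]+)@v(?P<version>\d+\.\d+\.\d+)$")
STAGE_TAG = re.compile(
    r"^(?P<model>[\w-]+)#(?P<stage>[\w-]+)#v(?P<version>\d+\.\d+\.\d+)$"
)


def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0) so versions sort numerically."""
    return tuple(int(part) for part in text.split("."))


def build_registry(tags):
    """Build {model: {"versions": [...], "stages": {stage: version}}}."""
    registry = defaultdict(lambda: {"versions": [], "stages": {}})
    for tag in tags:
        match = VERSION_TAG.match(tag)
        if match:
            registry[match["model"]]["versions"].append(match["version"])
            continue
        match = STAGE_TAG.match(tag)
        if match:
            # The last matching tag in input order wins. A real setup would
            # order tags by creation date instead.
            registry[match["model"]]["stages"][match["stage"]] = match["version"]
        # Tags that match neither convention are ignored.
    for entry in registry.values():
        entry["versions"].sort(key=parse_version)
    return dict(registry)


tags = [
    "churn@v1.10.0",
    "churn@v1.2.0",
    "churn#prod#v1.2.0",
    "churn#staging#v1.10.0",
    "release-2023",
]
registry = build_registry(tags)
```

A CI/CD job triggered by a new stage tag could look up `registry["churn"]["stages"]["prod"]` to decide which version to deliver.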

Advanced, practical use cases include:
Mono-repositories with ML models and model zoos
Tracking large models and storing weight files in cloud storage such as S3, GCS and Azure Blob Storage
Model versioning
Connection to model deployment systems
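
One common way to track large weight files alongside Git is to commit a small pointer file while the weights themselves live in cloud storage. The sketch below shows the idea under stated assumptions: the pointer layout (`sha256`, `size`, `remote`), the `.ptr.json` suffix and the function name `write_pointer` are hypothetical and do not reproduce the format of any particular tool. The upload itself is left out.

```python
import hashlib
import json
from pathlib import Path


def write_pointer(weights_path, remote_prefix):
    """Write '<weights>.ptr.json' describing a weights file kept outside Git."""
    weights_path = Path(weights_path)
    digest = hashlib.sha256()
    with weights_path.open("rb") as handle:
        # Hash in 1 MiB chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    checksum = digest.hexdigest()
    pointer = {
        "sha256": checksum,
        "size": weights_path.stat().st_size,
        # Content-addressed location, e.g. under an S3 or GCS prefix (assumed).
        "remote": f"{remote_prefix.rstrip('/')}/{checksum}",
    }
    pointer_path = weights_path.with_name(weights_path.name + ".ptr.json")
    pointer_path.write_text(json.dumps(pointer, indent=2, sort_keys=True) + "\n")
    return pointer_path
```

Only the pointer file would be committed, so the Git history records exactly which weights belong to each model version without storing the binary in the repository.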

The proposed model registry is based on software engineering best practices and the ideas of GitOps. This makes the model lifecycle compatible with the software application lifecycle, which simplifies operations across ML and software development teams. The talk requires basic knowledge of Git and GitHub/GitLab. After the talk, listeners will be able to implement fully functional model registries with popular open source tools, without any additional services.

Bio: 

Dmitry Petrov is an ex-Data Scientist at Microsoft with a Ph.D. in Computer Science and an active open source contributor. He wrote and open sourced the first version of DVC.org, a machine learning workflow management tool. He also implemented the wavelet-based image hashing algorithm (wHash) in the open source library ImageHash for Python. Dmitry now works on tools for machine learning and ML workflow management as co-founder and CEO of Iterative in San Francisco.

Open Data Science
One Broadway
Cambridge, MA 02142
info@odsc.com
