Managing your Data Science Git Repositories

Abstract: 

Data science teams differ from traditional software groups in that they often tackle several short-lived projects at once. It’s not uncommon for a small group to have only a few days to build a dashboard. We have to balance the immediate business need against long-term reproducibility requirements. This workshop takes you through strategies that you can adopt.

Spoiler alert! There isn’t a perfect solution. Instead, there are multiple options, each with pros and cons. This training will help you decide what is best for your organisation and teams.

Session Outline:
Session 1: Redlines
What rules should you have across repositories? For example, should every repository have a README or CI file? If a repository must have a README, then what is the minimum standard? We’ll discuss strategies for using templating or auto-generating files.
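
As a rough illustration of the kind of redline described above, the sketch below checks whether a repository contains a set of required files. The file names in `REQUIRED_FILES` and the `missing_files` helper are hypothetical choices made for this example, not rules taken from the workshop.

```python
from pathlib import Path

# Hypothetical redline: files that every repository is assumed to need.
REQUIRED_FILES = ("README.md", ".gitlab-ci.yml")


def missing_files(repo_root, required=REQUIRED_FILES):
    """Return the required files that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in required if not (root / name).is_file()]
```

A check like this could run across every repository in an organisation, with the list of required files adjusted to whatever minimum standard the team agrees on.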

Session 2: Merging, what could possibly go wrong?
Merging multiple branches with Git is a joy to behold - when it works. However, combining branches doesn’t always work out so nicely. We’ll discuss merging strategies, such as “to rebase or not to rebase,” as well as the potential pitfalls and benefits of adopting these strategies.

Session 3: Getting the most out of Git with CI
Continuous integration is fantastic. This last session discusses the many ways you can leverage CI to optimise your workflow, from the standard use case of package checking to more exotic varieties, such as deployment, auto-tagging and linting your commit messages.
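
To make the idea of linting commit messages concrete, here is a minimal sketch of a check that a CI job might run. The `<type>: <summary>` convention, the list of allowed types, and the 72-character limit are all assumptions made for this example; the workshop does not prescribe a particular format.

```python
import re

# Assumed convention: the subject line reads "<type>: <summary>".
ALLOWED_TYPES = ("feat", "fix", "docs", "chore")
SUBJECT_PATTERN = re.compile(r"^(%s): \S.*$" % "|".join(ALLOWED_TYPES))


def lint_commit_message(message, max_length=72):
    """Return a list of problems found in the subject line of a commit message."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    problems = []
    if not SUBJECT_PATTERN.match(subject):
        problems.append("subject must look like '<type>: <summary>'")
    if len(subject) > max_length:
        problems.append(f"subject is longer than {max_length} characters")
    return problems
```

An empty list means the message passes; a CI job could fail the pipeline whenever the list is non-empty.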

Background Knowledge:
Some basic familiarity with Git, for example merging, committing and pushing.

Bio: 

Dr Colin Gillespie is the Co-Founder and CTO of Jumping Rivers, a data science consultancy that specialises in all things R and Python. He is also a Senior Statistics lecturer at Newcastle University, has published over eighty peer-reviewed papers, and co-authored the O'Reilly book Efficient R Programming.

Open Data Science

