R for Python Programmers

Abstract: 

Should a data scientist use R or should they use Python? The answer to this rather delicate question is, of course, that they should know a bit of each and use the most appropriate language for the task in hand. In this tutorial, we’ll take participants through the R tidyverse, one of the (many!) areas where R shines. The tidyverse is essential for any data scientist who deals with data on a day-to-day basis. Because each package focuses on a small, key task, the tidyverse suite removes much of the pain of data manipulation.

Session Outline
We'll cover some of the core features of the tidyverse: dplyr (the workhorse of the tidyverse), string manipulation, graphics and the concept of tidy data.

Goals: by the end of the tutorial participants will be able to

* install and load R packages;
* import data into R and export it again;
* construct graphics using the ggplot2 package;
* use the open-source RStudio IDE;
* identify strengths and weaknesses in R compared to Python;
* manipulate data into a tidy format using the tidyverse suite of R packages;
* connect directly with databases using dplyr.
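
For participants coming from Python, the "tidy format" named in the goals can be pictured as one row per observation and one column per variable. The sketch below uses only the Python standard library to reshape a small wide table into that form. The table, its column names and the helper `to_tidy` are hypothetical and are not course material; in the tutorial itself this kind of reshaping would be done with tidyverse packages rather than hand-written loops.

```python
# Hypothetical wide table: one row per country, one column per year.
wide = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]


def to_tidy(rows, id_col, names_to, values_to):
    """Reshape wide rows into tidy (long) rows: one observation per row."""
    tidy_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_col:
                continue  # the identifier is repeated on every tidy row
            tidy_rows.append(
                {id_col: row[id_col], names_to: key, values_to: value}
            )
    return tidy_rows


# Each (country, year) pair becomes its own row with a single "cases" value.
tidy = to_tidy(wide, id_col="country", names_to="year", values_to="cases")
for record in tidy:
    print(record)
```

After the reshape, "year" is an ordinary column rather than part of the header, which is the property that makes filtering, grouping and plotting by year straightforward.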

This tutorial assumes no prior knowledge of R, but we do assume prior knowledge of another programming language. The course will highlight the similarities and differences between R and Python, allowing participants to build on their existing Python knowledge while avoiding gotchas.

Participants should pre-install R and RStudio before the course.

Background Knowledge
Basic programming

Bio: 

Colin Gillespie is the co-founder of Jumping Rivers, a data science & machine learning consultancy. He has been using R since 1999 and Python since 2002.
He's the author of a number of R packages and has published the book Efficient R Programming with O'Reilly.

Open Data Science

 

 

 

