Abstract: Data visualization is a powerful tool for facilitating confident, informed decision-making. ggplot2 is one of the most popular data visualization packages in use today. Based on a comprehensive grammar and syntax, ggplot2 gives you the ability to create data visualizations quickly and iteratively, whether it's a simple bar chart or a complicated network analysis.
This workshop will teach you how to manipulate and structure your data for visualization, the elements of a graph and their associated terminology, how to select the appropriate graph for your data, and how to avoid common graphing mistakes. You will also learn how to customize data visualizations and give them the 'personal touches' that make them memorable to your audience.
Lesson 1: Structuring Data for Visualizations
Data are rarely collected or stored in a format that's ready for creating visualizations. In this lesson, we'll cover how to reshape your data into a graph-friendly structure, how to convert columns into different formats, and how to quickly count, summarize, and inspect values before creating graphs.
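As a rough illustration of the kind of reshaping, converting, and summarizing this lesson describes, here is a minimal sketch. It assumes the tidyr and dplyr packages, which the abstract does not name, and the `sales_wide` data frame and its column names are hypothetical.

```r
library(tidyr)
library(dplyr)

# Hypothetical wide-format data: one row per store, one column per quarter
sales_wide <- data.frame(
  store = c("A", "B"),
  q1 = c(10, 20),
  q2 = c(15, 25)
)

# Reshape to long format: one row per store-quarter observation
sales_long <- pivot_longer(
  sales_wide,
  cols = c(q1, q2),
  names_to = "quarter",
  values_to = "revenue"
)

# Convert the character column to a factor with an explicit level order
sales_long <- mutate(sales_long, quarter = factor(quarter, levels = c("q1", "q2")))

# Count rows per store, then summarize revenue per store
store_counts <- count(sales_long, store)
store_totals <- sales_long %>%
  group_by(store) %>%
  summarise(total = sum(revenue))

# Inspect the structure before plotting
str(sales_long)
```

Long format tends to suit ggplot2 because each variable you want to map to an aesthetic sits in its own column.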
Lesson 2: The Grammar of Graphics
Practice building graphs layer-by-layer using the grammar of graphics. We will work through a series of charts step-by-step, from basic plots to complex, polished visualizations. You'll learn how to 'map' columns in your dataset to 'aesthetics,' choose different graph types (geoms) based on the data type, and label your graph clearly (and concisely!).
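The layer-by-layer idea can be sketched roughly as follows. This example uses the `mpg` dataset that ships with ggplot2; the choice of dataset, columns, and geoms is an assumption for illustration, not necessarily what the workshop covers.

```r
library(ggplot2)

# Step 1: map columns to aesthetics (x position, y position, color)
p <- ggplot(mpg, aes(x = displ, y = hwy, color = class))

# Step 2: add a geom suited to two continuous variables
p <- p + geom_point()

# Step 3: add a second layer, one linear trend line across all classes
p <- p + geom_smooth(aes(group = 1), method = "lm", formula = y ~ x, se = FALSE)

# Step 4: label the graph clearly and concisely
p <- p + labs(
  title = "Larger engines tend to be less fuel efficient",
  x = "Engine displacement (litres)",
  y = "Highway miles per gallon",
  color = "Vehicle class"
)

# Printing p (for example by typing `p` at the console) renders the plot
```

Because each `+` adds one component, you can stop after any step and inspect the intermediate plot.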
Lesson 3: Customizing Visualizations
You will learn how to build a wide range of graphs using ggplot2 and refine plots for effective presentation. You will be able to highlight important aspects of your data, label and annotate key features, and customize the overall appearance of your plots. We will conclude with some tips and strategies for presenting visualizations to different audiences.
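One possible way to highlight, annotate, and restyle a plot is sketched below. The highlighting rule (two-seater cars in ggplot2's built-in `mpg` data), the colors, and the annotation position are all hypothetical choices made for illustration.

```r
library(ggplot2)

# Flag the rows to highlight (hypothetical rule: two-seater cars)
mpg_flagged <- mpg
mpg_flagged$highlight <- mpg_flagged$class == "2seater"

p <- ggplot(mpg_flagged, aes(x = displ, y = hwy)) +
  # Color points by the flag so the highlighted group stands out
  geom_point(aes(color = highlight)) +
  scale_color_manual(
    values = c("FALSE" = "grey70", "TRUE" = "firebrick"),
    guide = "none"
  ) +
  # Annotate the highlighted group directly on the plot
  annotate("text", x = 6.5, y = 30, label = "Two-seaters", color = "firebrick") +
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon"
  ) +
  # Swap the default look for a lighter built-in theme
  theme_minimal()
```

Muting everything except the group of interest is one common way to direct an audience's attention to a single message.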
Prerequisites: R programming (basic knowledge) and a GitHub account (optional)
Bio: Peter is a hands-on data science leader with a business-focused approach to building data science solutions and telling stories with data. He is experienced in translating business problems into data products, using advanced statistical techniques and ML to support decision-making in a variety of rapid-growth environments. He has scaled data science solutions for user acquisition, retention, channel optimization, revenue, and fraud at Lyft, Alibaba, and Citrix. He currently leads Marketing Science for Growth at Nextdoor.