Abstract: Data visualization is a powerful tool for facilitating confident, informed decision-making. ggplot2 is one of the most popular data visualization packages in use today. Built on a comprehensive grammar of graphics, ggplot2 lets you create data visualizations quickly and iteratively, whether you need a simple bar chart or a complicated network analysis.
This workshop will teach you how to manipulate and structure your data for visualization, the elements of a graph and their associated terminology, how to select the appropriate graph for your data, and how to avoid common graphing mistakes. You will also learn how to customize data visualizations and give them the 'personal touches' that make them memorable to your audience.
Lesson 1: Structuring Data for Visualizations
Data are rarely collected or stored in a format that's ready for creating visualizations. In this lesson, we'll cover how to reshape your data into a graph-friendly structure, how to convert columns into different formats, and how to quickly count, summarize, and inspect values before creating graphs.
Lesson 2: The Grammar of Graphics
Practice building graphs layer-by-layer using the grammar of graphics. We will work through a series of charts step-by-step, from basic plots to complex, polished visualizations. You'll learn how to 'map' columns in your dataset to 'aesthetics,' how to choose graph types (geoms) based on the data type, and how to label your graph clearly (and concisely!).
Lesson 3: Customizing Visualizations
You will learn how to build a wide range of graphs using ggplot2 and refine plots for effective presentation. You will be able to highlight important aspects of your data, label and annotate key features, and customize the overall appearance of your plots. We will conclude with tips and strategies for presenting visualizations to different audiences.
Prerequisites: R programming (basic knowledge) and a GitHub account (optional)
Bio: Martin is a Senior Clinical Programmer at BioMarin, where he builds dashboards and tools for making data-informed decisions. Previously, Martin built statistical tools and dashboards for the Diabetes Technology Society and for other volunteer and non-profit organizations, and he was a contributing author for Data Journalism in R on the Northeastern University School of Journalism blog. He is also a data journalism instructor at California State University, Chico. Martin holds a graduate degree in Clinical Research and is passionate about data literacy and open source technologies.