Abstract: Amid the explosion of interactive data visualization options, Matplotlib remains the go-to for quick data visualization for researchers working in Python, either directly or via libraries like Pandas and Seaborn. It also remains one of the most robust and flexible tools for customizing static visualizations. This tutorial will discuss ways to level up your quick plots for use in published papers, automated reporting, and any other scenario where crisp and/or custom visualizations are called for.
We will cover some foundations of the Matplotlib interface and walk through sample code for progressively more polished and elaborate visualizations. Along the way, we'll tie each change to the relevant best practices in data visualization, in areas such as:
* Axes limits, labels, and scales
* Use (and misuse) of color
* Standardizing layouts, fonts, and other formatting
* Useful annotation and labeling vs. hard-to-understand clutter
* Reusing custom visualizations on new data
Finally, we will review some useful options for incorporating these visualizations into finished products, from exporting Jupyter notebooks to full academic-style LaTeX papers.
Comfort with Python and Pandas is assumed, and any level of familiarity with other tools is OK.
Bio: Melanie Veale, Ph.D. is a recovering astrophysicist, currently working as a Data Solutions Architect at Anomalo. Her Ph.D. research on galaxy dynamics introduced her to statistical and computational Python, as well as other languages and tools including C++, Fortran, IDL, R, bash, and SLURM. She has also dabbled in AWS infrastructure, Kubernetes, Docker, Spark, Ray, Dask, and more as a Field Engineer and Field Data Scientist at Domino Data Lab, helping analytics and machine learning teams modernize their collaboration and deployment workflows. Nowadays, she is a troubleshooting enthusiast anywhere on the Data, Analytics, and MLOps tech stacks, and enjoys melding her passions for crisp technical communication, good visualizations, and first-principles thinking into helping organizations get the most out of their data.