Abstract: This tutorial will introduce you to the wonderful world of Bayesian data science through the lens of probabilistic programming in Python. In the first half of the tutorial, we will introduce the key concepts of probability distributions via hacker statistics, hands-on simulation, and telling stories of data-generating processes. We will also cover the basics of joint and conditional probability, Bayes' rule, and Bayesian inference, all through hands-on coding and real-world examples. In the second half of the tutorial, we will use a series of models to build your familiarity with PyMC3, showcasing how to perform the foundational inference tasks of parameter estimation, group comparison (for example, A/B tests and hypothesis testing), and arbitrary curve regression. By the end of this tutorial, you will have a solid grounding in Bayesian inference, be able to write arbitrary models, and have worked through a basic model-checking workflow.
After attending this tutorial, participants will:
* have a solid foundation in probability, viewed through the lens of computational simulation, and see how probability distributions can be matched to real-world data-generating processes.
* understand how to use `numpy.random` to simulate draws from a probability distribution, use those simulations to calculate summary statistics, and use those summary statistics in testing hypotheses against data in a Bayesian fashion.
* understand how to use the probabilistic programming language PyMC3 to build arbitrary statistical models.
* be able to build and validate statistical models in a robust and principled fashion.
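As a taste of the simulation-based approach described in these outcomes, here is a minimal sketch, not taken from the tutorial materials. It assumes a made-up dataset of 8 heads in 10 coin flips. It first uses `numpy.random` draws to estimate how often a fair coin would produce a result at least that extreme, then applies Bayes' rule on a grid of candidate values with a uniform prior. The grid approximation is a stand-in chosen for illustration; the tutorial itself uses PyMC3 for inference.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical data (an assumption for illustration): 8 heads in 10 flips.
n_flips, observed_heads = 10, 8

# Hacker statistics: simulate many 10-flip experiments with a fair coin
# and count how often we see at least as many heads as observed.
simulated_heads = rng.binomial(n=n_flips, p=0.5, size=100_000)
prob_at_least_observed = np.mean(simulated_heads >= observed_heads)

# Bayes' rule on a grid: posterior is proportional to prior times likelihood.
p_grid = np.linspace(0, 1, 1001)   # candidate values for P(heads)
prior = np.ones_like(p_grid)       # uniform prior (an assumption)
likelihood = p_grid**observed_heads * (1 - p_grid) ** (n_flips - observed_heads)
posterior = prior * likelihood
posterior /= posterior.sum()       # normalize so the grid sums to 1

posterior_mean = np.sum(p_grid * posterior)
prob_biased = posterior[p_grid > 0.5].sum()  # posterior mass on P(heads) > 0.5

print(f"P(>= {observed_heads} heads | fair coin) ~ {prob_at_least_observed:.3f}")
print(f"Posterior mean of P(heads) ~ {posterior_mean:.3f}")
print(f"P(P(heads) > 0.5 | data) ~ {prob_biased:.3f}")
```

The simulated frequency answers "how surprising is this data under a fair coin?", while the posterior answers "what values of P(heads) are plausible given the data?"; the second question is the one Bayesian inference is built around.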
It would help if you knew:
* programming fundamentals and the basics of the Python programming language (e.g., variables, for loops);
* a bit about pandas and DataFrames;
* a bit about Jupyter Notebooks;
* your way around the terminal/shell.
However, I have always found that the most important and beneficial prerequisite is a will to learn new things, so if you have this quality, you'll definitely get something out of this code-along session.
Bio: Hugo Bowne-Anderson is Head of Data Science Evangelism and VP of Marketing at Coiled, a company that makes it simple for organizations to scale their data science and machine learning in Python. He has extensive experience as a data scientist, educator, evangelist, content marketer, and data strategy consultant at DataCamp, the online education platform for all things data. He also has experience teaching basic to advanced data science topics at institutions such as Yale University and Cold Spring Harbor Laboratory, at conferences such as SciPy, PyCon, and ODSC, and with organizations such as Data Carpentry. He has developed over 30 courses on the DataCamp platform, impacting over 500,000 learners worldwide through his own courses. He also created the weekly data industry podcast DataFramed, which he hosted and produced for 2 years. He is committed to spreading data skills, access to data science tooling, and open-source software, both for individuals and the enterprise.