Probability Calibration: Why and How


It is often desirable for a model to output *well-calibrated* probabilities. This tutorial discusses why calibration is important, how to assess a model's calibration, and how to calibrate a model by post-processing its scores. We will review calibration techniques such as isotonic regression, Platt scaling, beta calibration, and spline calibration, and apply them to real-world examples.

This workshop will primarily feature Jupyter notebooks, with slides used to illustrate some concepts in more detail. The notebooks will walk through real examples in which we build a model, assess how well calibrated it is, demonstrate various calibration methods, and assess the results.

Section 1: Why calibration?
- What it means for model outputs to be *well-calibrated*.
- Why and when it is important (or not).
- Specific scenarios where calibration may be valuable.
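
One common way to formalize the first bullet, given here as a sketch for the binary case. The notation is an assumption and is not taken from the workshop materials: $\hat{p}(X)$ is the model's predicted probability and $Y \in \{0, 1\}$ is the true label.

```latex
\[
  \Pr\bigl(Y = 1 \mid \hat{p}(X) = p\bigr) = p
  \quad \text{for all } p \in [0, 1].
\]
```

For example, among all cases where the model predicts 0.8, roughly 80% should turn out to be positive.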

Section 2: Assessing the model
- How to determine if the model is well-calibrated.
- Reliability diagrams and how to use them.
- Issues with calibration for values close to 0/1.
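
The numbers behind a reliability diagram can be sketched in a few lines. The example below is a minimal illustration and not the workshop's own code: the helper names `reliability_table` and `max_gap`, the equal-width binning, and the synthetic data are all assumptions made for this sketch. It bins the predictions and compares the mean predicted probability in each bin with the observed fraction of positives.

```python
import random

def reliability_table(y_true, y_prob, n_bins=10):
    """Return (mean predicted, observed positive rate, count) for each non-empty bin."""
    pred_sum = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for y, p in zip(y_true, y_prob):
        b = min(int(p * n_bins), n_bins - 1)  # a prediction of exactly 1.0 goes in the last bin
        pred_sum[b] += p
        positives[b] += y
        counts[b] += 1
    return [
        (pred_sum[b] / counts[b], positives[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]

def max_gap(rows):
    """Largest absolute difference between predicted and observed rates over the bins."""
    return max(abs(mean_pred - observed) for mean_pred, observed, _ in rows)

# Synthetic data (an assumption for illustration): labels are drawn from known probabilities.
rng = random.Random(0)
true_prob = [rng.random() for _ in range(20000)]
labels = [1 if rng.random() < p else 0 for p in true_prob]

# A deliberately overconfident model: doubling the log-odds pushes scores toward 0 and 1.
overconfident = [p * p / (p * p + (1 - p) * (1 - p)) for p in true_prob]

calibrated_rows = reliability_table(labels, true_prob)
overconfident_rows = reliability_table(labels, overconfident)

for name, rows in [("calibrated", calibrated_rows), ("overconfident", overconfident_rows)]:
    print(name)
    for mean_pred, observed, count in rows:
        print(f"  predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

Plotting observed rate against mean predicted probability gives the diagram itself; points near the diagonal indicate good calibration. The per-bin counts are worth reporting alongside it, since a bin with few examples gives a noisy estimate of the observed rate.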

Section 3: Calibrating the model
- Illustration of the various techniques:
  - Isotonic Regression
  - Beta Calibration
  - Platt Scaling
  - Spline Calibration
- Demonstrating their use and results on real data.
- Tradeoffs between the approaches.
- Calibrating multi-class models.
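
Two of the techniques listed above can be sketched without any libraries. This is an illustrative sketch under stated assumptions, not the workshop's implementation: the function names (`fit_platt`, `fit_isotonic`, `predict_isotonic`) and the synthetic scores are hypothetical, the Platt fit omits the target smoothing used in Platt's original formulation, and the isotonic fit assumes no tied scores. Beta and spline calibration are not shown.

```python
import bisect
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_platt(scores, labels, n_iter=25):
    """Fit p = sigmoid(a * score + b) by Newton's method on the log loss."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g_a = g_b = h_ab = 0.0
        h_aa = h_bb = 1e-9  # tiny ridge term keeps the Hessian invertible
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            w = p * (1.0 - p)
            g_a += (p - y) * s
            g_b += p - y
            h_aa += w * s * s
            h_ab += w * s
            h_bb += w
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: subtract (inverse Hessian) * gradient
        a -= (h_bb * g_a - h_ab * g_b) / det
        b -= (h_aa * g_b - h_ab * g_a) / det
    return a, b

def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: returns (upper score of each block, block mean)."""
    blocks = []  # each block is [sum of labels, count, largest score in block]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the previous block's mean is not below the new one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [blk[2] for blk in blocks], [blk[0] / blk[1] for blk in blocks]

def predict_isotonic(model, score):
    """Evaluate the fitted step function at a score."""
    uppers, values = model
    i = bisect.bisect_left(uppers, score)
    return values[min(i, len(values) - 1)]

# Synthetic uncalibrated scores (an assumption): the true probability is sigmoid(3 * s - 1).
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(4000)]
labels = [1 if rng.random() < sigmoid(3.0 * s - 1.0) else 0 for s in scores]

a, b = fit_platt(scores, labels)
isotonic = fit_isotonic(scores, labels)
iso_values = isotonic[1]

for s in (-1.0, 0.0, 1.0):
    print(f"score {s:+.1f}: true {sigmoid(3.0 * s - 1.0):.3f}  "
          f"platt {sigmoid(a * s + b):.3f}  isotonic {predict_isotonic(isotonic, s):.3f}")
```

The contrast hints at one tradeoff: Platt scaling has only two parameters and assumes a sigmoid shape, while isotonic regression assumes only monotonicity and produces a step function, which generally needs more data to estimate well.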

Section 4: Assessing the calibration
- Did the calibration improve model performance?
- Are there flaws in the calibration?
- How to adjust the calibration and improve further.
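
One way to check whether calibration helped is to compare proper scoring rules on held-out data before and after. The sketch below is illustrative only: the metric helpers are written out by hand, the data is synthetic, and the recalibration map (halving the log-odds) is a hypothetical calibrator chosen because it exactly undoes the distortion built into the synthetic scores. It also shows that a strictly increasing recalibration map leaves the ranking of examples, and so the AUC, unchanged.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def log_loss(y_true, y_prob, eps=1e-12):
    """Mean negative log-likelihood, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(y_true)

def auc(y_true, y_prob):
    """Rank-based AUC; tied scores receive their average rank."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    ranks = [0.0] * len(y_prob)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_prob[order[j + 1]] == y_prob[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# Synthetic held-out set (an assumption): labels drawn from known probabilities.
rng = random.Random(1)
true_prob = [rng.uniform(0.02, 0.98) for _ in range(5000)]
labels = [1 if rng.random() < p else 0 for p in true_prob]

# Overconfident raw scores: the log-odds are doubled.
raw = [sigmoid(2.0 * logit(p)) for p in true_prob]
# Hypothetical calibrator: halving the log-odds undoes that distortion.
recalibrated = [sigmoid(0.5 * logit(q)) for q in raw]

results = {
    name: (brier_score(labels, probs), log_loss(labels, probs), auc(labels, probs))
    for name, probs in [("raw", raw), ("recalibrated", recalibrated)]
}
for name, (brier, ll, area) in results.items():
    print(f"{name:>12}: Brier {brier:.4f}  log loss {ll:.4f}  AUC {area:.4f}")
```

Lower Brier score and log loss after recalibration indicate better probability estimates. The unchanged AUC is a reminder that calibration adjusts the probabilities themselves, not the model's ability to rank cases.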


Brian Lucena is Principal at Numeristical, where he advises companies of all sizes on how to apply modern machine learning techniques to solve real-world problems with data. He is the creator of StructureBoost, ML-Insights, and the SplineCalib calibration tool. In previous roles he has served as Senior VP of Analytics at PCCI, Principal Data Scientist at Clover Health, and Chief Mathematician at Guardian Analytics. He has taught at numerous institutions including UC-Berkeley, Brown, USF, and the Metis Data Science Bootcamp.

Open Data Science
One Broadway
Cambridge, MA 02142
