bayesplot

Diagnostic plots for Bayesian models

TJ Mahr
UW–Madison Waisman Center

Hello! 👋

I study how children with motor disorders learn to speak and communicate.

Bayesian stats let me handle repeated-measures, time-series data from heterogeneous populations.

My current modeling project

My current project looks at how speech intelligibility (y) changes with age (x). The figure shows a spaghetti plot of model fits and observed data for one child; the lines fit the data well. On the right, three histograms show when the fitted lines cross various intelligibility thresholds.

To get my cool model to work, I needed diagnostics…

  • Plotting functions for visual diagnostics and model criticism
  • Part of the Stan universe but works with generic MCMC samples
  • Built on top of ggplot2
  • Simple functions to make routine visualization easy
  • https://mc-stan.org/bayesplot/

Scottish Hill races

Try to predict race time from race distance and hill height.

stan_glm(time_min ~ distance_km, data = races, ...)
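
A minimal sketch of fitting this model, assuming the rstanarm package is installed. The deck's `races` data frame is not shown, so `MASS::hills` (Scottish hill races, distance recorded in miles) is used here as a stand-in, with columns renamed to match the formula above. The `seed` and `refresh` values are illustrative choices, not settings from the talk.

```r
library(rstanarm)

# Assumption: MASS::hills stands in for the talk's `races` data frame.
races <- data.frame(
  time_min    = MASS::hills$time,            # record time in minutes
  distance_km = MASS::hills$dist * 1.609344  # miles to kilometres
)

fit <- stan_glm(
  time_min ~ distance_km,
  data    = races,
  family  = gaussian(),
  seed    = 2019,  # illustrative, for reproducibility
  refresh = 0      # silence sampler progress output
)

# One row per posterior draw, one column per parameter.
draws <- as.matrix(fit)
colnames(draws)
```

The `draws` matrix is the kind of generic MCMC sample that the bayesplot `mcmc_*()` functions accept.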

Bayesian models in 15 seconds

Classical regression: line of best fit (maximum likelihood)

Bayesian regression: all plausible lines given data and data-generating process (posterior distribution)

Model is a distribution

Marginal distributions of parameters

Three facets showing histograms of posterior samples for the model’s intercept, main predictor (distance) and the error term sigma.

Uncertainty/compatibility intervals

Plot showing the median and two compatibility intervals for each parameter. We use this to compare the sign and magnitude of model parameters.

Maybe you can do better? Go for it.

Intervals plus density

Like the previous interval plot but with density curves drawn instead.
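
The three marginal plots above (histograms, intervals, intervals plus density) could be produced roughly as follows. This is a sketch that assumes bayesplot's built-in example draws as a stand-in for the race model, so the parameters are named `alpha`, `sigma`, and `beta[1]` rather than the intercept, distance, and sigma shown on the slides. The interval widths are illustrative.

```r
library(bayesplot)

# Assumption: built-in example draws (iterations x chains x parameters)
# stand in for the posterior of the race model.
draws <- example_mcmc_draws(chains = 4, params = 3)

# Histogram of posterior draws, one facet per parameter.
p_hist <- mcmc_hist(draws)

# Median plus inner (50%) and outer (90%) intervals per parameter.
p_intervals <- mcmc_intervals(draws, prob = 0.5, prob_outer = 0.9)

# Same idea, but with density curves; the shaded region covers 80%.
p_areas <- mcmc_areas(draws, prob = 0.8)

p_intervals
```

Each function returns a ggplot object, so the usual ggplot2 layers, themes, and labels can be added on top.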

Ridgelines help hierarchical models

A set of several partially overlapping density curves with shaded areas showing 80% intervals.
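
A ridgeline plot like this one might be drawn with `mcmc_areas_ridges()`. The sketch below assumes bayesplot's example draws, whose four `beta[k]` coefficients play the role of the related parameters of a hierarchical model.

```r
library(bayesplot)

# Assumption: example draws with parameters alpha, sigma, beta[1]..beta[4].
draws <- example_mcmc_draws(chains = 4, params = 6)

# Overlapping densities for every parameter whose name matches "beta",
# with the central 80% of each density shaded.
p_ridges <- mcmc_areas_ridges(draws, regex_pars = "beta", prob = 0.8)
p_ridges
```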

Joint distributions

A scatterplot of posterior draws of the intercept and distance effect with contour lines overlaid.

Hex bin

Another 2-d density plot but this one uses hexagonal tiles and uses shading to show density.
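
Both joint-distribution plots can be sketched with `mcmc_scatter()` and `mcmc_hex()`. The example again assumes bayesplot's example draws, so `alpha` and `beta[1]` stand in for the intercept and the distance effect. The contour layer comes from ggplot2, and `mcmc_hex()` needs the hexbin package, so that call is guarded.

```r
library(bayesplot)
library(ggplot2)

# Assumption: example draws stand in for the race model's posterior.
draws <- example_mcmc_draws(chains = 4, params = 3)
pair  <- c("alpha", "beta[1]")

# Scatterplot of the joint draws with 2-d density contours overlaid.
p_scatter <- mcmc_scatter(draws, pars = pair, alpha = 0.3) +
  stat_density_2d(color = "black")

# Hexagonal binning; only attempted when hexbin is installed.
if (requireNamespace("hexbin", quietly = TRUE)) {
  p_hex <- mcmc_hex(draws, pars = pair)
}

p_scatter
```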

Model is generative

Bayesian models are generative

  • You specify a data-generating process.
  • Model provides a sample of parameter values for the process that are compatible with the data.

Posterior predictive checks

  • For each draw from the posterior distribution, have the model re-predict the original dataset.
  • Does the replicated data look like the original data?
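
To make the re-prediction step concrete, here is a by-hand sketch for a simple linear model. Everything in it is made up for illustration: the distances, the parameter draws, and their values are hypothetical, and in practice the draws would come from the fitted model rather than from `rnorm()`.

```r
set.seed(1)
n_obs   <- 35  # hypothetical number of races
n_draws <- 50  # number of posterior draws to replicate from

# Hypothetical predictor values (race distances in km).
x <- runif(n_obs, min = 3, max = 45)

# Hypothetical posterior draws of intercept, slope, and error SD.
alpha <- rnorm(n_draws, mean = -5,  sd = 2)
beta  <- rnorm(n_draws, mean = 5.2, sd = 0.2)
sigma <- abs(rnorm(n_draws, mean = 20, sd = 2))

# For each draw s, simulate a full replicated dataset from
# Normal(alpha[s] + beta[s] * x, sigma[s]).
yrep <- t(sapply(seq_len(n_draws), function(s) {
  rnorm(n_obs, mean = alpha[s] + beta[s] * x, sd = sigma[s])
}))

dim(yrep)  # draws in rows, observations in columns
```

The result has one row per posterior draw and one column per observation, which is the `yrep` layout the bayesplot `ppc_*()` functions expect.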

Boxplot of observed versus 6 replications

Density of observed versus 50 replications

Density of observed versus 50 replications. The model replications do not agree with the data.

Density from a better model

Density of observed versus 50 replications. The model replications agree with the data.
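
The boxplot and density checks can be sketched with `ppc_boxplot()` and `ppc_dens_overlay()`. The example assumes bayesplot's built-in example outcome and replications, not the race data, so it shows the mechanics rather than reproducing the slides.

```r
library(bayesplot)

# Assumption: built-in example data stand in for the race model.
y    <- example_y_data()      # observed outcome vector
yrep <- example_yrep_draws()  # replications: draws in rows, observations in columns

# Observed data next to 6 replicated datasets.
p_box <- ppc_boxplot(y, yrep[1:6, ])

# Density of the observed data over the densities of 50 replications.
p_dens <- ppc_dens_overlay(y, yrep[1:50, ])
p_dens
```

If the replicated densities miss features of the observed density (for example its skew or its range), that points to a part of the data-generating process the model does not capture.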

How well are individual data points predicted?

Plot of the observed data by distance. For each observation, there is a 95% interval showing the model’s range of simulations. As the distance increases, the intervals get farther from the observations, more or less.

Pointwise prediction error

Instead of showing observed versus simulation, this shows the average of observed minus simulated. The x axis is hill height. A LOESS smooth shows that error increases with hill height.
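
A sketch of the two pointwise checks above, again assuming bayesplot's example data in place of the race data. `ppc_intervals()` draws the per-observation intervals and `ppc_error_scatter_avg_vs_x()` plots average error against a predictor. The LOESS panel is rebuilt by hand with ggplot2 here, which is one possible way to add the smooth and not necessarily how the slide's figure was made.

```r
library(bayesplot)
library(ggplot2)

# Assumption: built-in example data stand in for the race model.
y    <- example_y_data()
yrep <- example_yrep_draws()
x    <- example_x_data()  # a predictor, one value per observation

# Observed points with 50% and 95% predictive intervals, ordered by x.
p_intervals <- ppc_intervals(y, yrep, x = x, prob = 0.5, prob_outer = 0.95)

# bayesplot's average-error plot against the predictor.
p_error <- ppc_error_scatter_avg_vs_x(y, yrep, x)

# By-hand version: observed minus the mean simulation, with a LOESS smooth.
avg_error <- y - colMeans(yrep)
p_loess <- ggplot(data.frame(x = x, avg_error = avg_error), aes(x, avg_error)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)
p_loess
```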

Model’s distribution comes from a sampling algorithm

  • Bayesian models are estimated by Markov Chain Monte Carlo.
  • Multiple chains sample the posterior distribution in parallel.
  • Did these chains adequately sample the posterior distribution?

Classic traceplot 🐛

The canonical traceplot. It looks like a hairy caterpillar. It’s good.

Traceplot with bad mixing of chains

Traceplot where one of the chains gets stuck. It’s bad.
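
Both kinds of traceplot can be imitated with simulated chains. In this sketch the parameter name `theta`, the chain count, and the way chain 2 is made to get stuck are all invented for illustration; real chains would come from the sampler.

```r
library(bayesplot)

set.seed(2019)
n_iter   <- 500
n_chains <- 4

# Well-mixed chains: every chain samples the same distribution.
good <- array(
  rnorm(n_iter * n_chains),
  dim = c(n_iter, n_chains, 1),
  dimnames = list(NULL, paste0("chain:", 1:n_chains), "theta")
)

# Badly mixed chains: chain 2 sits in a narrow region away from the rest.
bad <- good
bad[, 2, "theta"] <- rnorm(n_iter, mean = 2, sd = 0.05)

p_good <- mcmc_trace(good)  # overlapping, caterpillar-like traces
p_bad  <- mcmc_trace(bad)   # chain 2 shows up as a flat, separate band
p_bad
```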

New diagnostics are coming

Figure showing the abstract of the new Rhat paper. https://arxiv.org/abs/1903.08008

[wip] Do ranks mix well among chains?

Figure showing mixture of rankings among the chains from the bad traceplot. Chain 2 dominates one end of the rankings.
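
A rank plot of this kind can be sketched with `mcmc_rank_hist()`, assuming a bayesplot release that includes it (the slide marks the feature as work in progress). The chains are simulated, with chain 2 deliberately shifted so that its draws take most of the highest ranks; the parameter name `theta` is hypothetical.

```r
library(bayesplot)

set.seed(2019)
n_iter   <- 500
n_chains <- 4

# Simulated draws: iterations x chains x parameters.
draws <- array(
  rnorm(n_iter * n_chains),
  dim = c(n_iter, n_chains, 1),
  dimnames = list(NULL, paste0("chain:", 1:n_chains), "theta")
)

# Chain 2 gets stuck at high values, so it will hold the top ranks.
draws[, 2, "theta"] <- rnorm(n_iter, mean = 2, sd = 0.05)

# One histogram of ranks per chain. Well-mixed chains give roughly
# uniform histograms; a stuck chain piles up at one end.
p_rank <- mcmc_rank_hist(draws, pars = "theta")
p_rank
```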

Plus dozens more plots

https://mc-stan.org/bayesplot/

Acknowledgments

  • Shoutout to Jonah Gabry, the lead author of the package
  • Rest of Stan team.
  • My work is supported by NIH R01DC009411, R01DC015653

https://github.com/tjmahr/bayesplot-satrdays-2019