# Posterior predictive checks

The main idea behind posterior predictive checking is the notion that, if the model fits, then replicated data generated under the model should look similar to observed data.

## Replicating data

Assume you have a model ${M}$ with unknown parameters ${\theta}$. You fit ${M}$ to data ${y}$ and obtain the posterior distribution ${\pi(\theta|y)}$. Given that

$\displaystyle \pi(y_{rep}|y) = \int \pi(y_{rep}|\theta)\pi(\theta|y)d\theta$

we can simulate ${K}$ replicated datasets from the fitted model by drawing ${K}$ values ${\{\theta^{(1)}, ..., \theta^{(K)}\}}$ from the joint posterior distribution ${\pi(\theta|y)}$ and then, for each ${\theta^{(i)}, i = 1,...,K}$, simulating a dataset ${y_{rep}^{(i)}}$ from the likelihood ${\pi(y|\theta^{(i)})}$.

Notice that with simulation-based techniques for approximating posterior distributions, such as MCMC, we already have draws from the posterior distribution, so the only extra work is simulating ${y_{rep}^{(i)}}$ from ${\pi(y|\theta^{(i)})}$ for each posterior draw.
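As a minimal sketch of this mechanism, the Python example below simulates replicated datasets for a normal model with unknown mean and standard deviation. The "posterior draws" here are stand-ins generated around the sample estimates purely to illustrate the mechanics; in a real analysis they would come from an MCMC sampler, and the model, sample sizes, and variable names are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

# Observed data (illustrative): assume y ~ Normal(mu, sigma).
y = rng.normal(loc=5.0, scale=2.0, size=100)

# Stand-ins for K posterior draws of (mu, sigma): in practice these
# come from an MCMC sampler; here we fake them around the sample
# estimates just to show the structure of the computation.
K = 1000
mu_draws = rng.normal(y.mean(), y.std(ddof=1) / np.sqrt(len(y)), size=K)
sigma_draws = np.abs(rng.normal(y.std(ddof=1), 0.15, size=K))

# For each posterior draw theta^(i) = (mu^(i), sigma^(i)),
# simulate one replicated dataset y_rep^(i) from the likelihood.
y_rep = np.array([
    rng.normal(mu_draws[i], sigma_draws[i], size=len(y))
    for i in range(K)
])  # shape (K, n): one replicated dataset per row
```

Each row of `y_rep` is a full replicated dataset of the same size as `y`, which is what later test quantities are computed on.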

## Test quantities and tail probabilities

We measure the discrepancy between model and data by defining test quantities (or discrepancy measures) ${T(y, \theta)}$ that capture the aspects of the data we want to check. The posterior predictive p-value (or Bayesian p-value), defined as the probability that the replicated data could be more extreme than the observed data as measured by the test quantity ${T}$, is then given by:

$\displaystyle p_B = Pr(T(y_{rep}, \theta) \geq T(y, \theta)|y)$

In contrast to the classical approach, the test statistic used to compute the Bayesian p-value can depend not only on the data ${y}$ but also on the unknown parameters ${\theta}$; hence it does not require special methods for dealing with nuisance parameters. Also, a Bayesian p-value is a posterior probability and can therefore be interpreted directly, although not as Pr(model is true | data).
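To make the p-value concrete, the sketch below estimates ${p_B}$ by Monte Carlo as the fraction of replicated datasets whose test quantity is at least as extreme as the observed one. The test quantity chosen here, the sample maximum, depends only on the data for simplicity; the normal model and the faked posterior draws are illustrative assumptions as before:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: observed data plus hypothetical posterior draws
# for a normal model, mimicking the replication step described above.
n, K = 100, 1000
y = rng.normal(5.0, 2.0, size=n)
mu_draws = rng.normal(y.mean(), y.std(ddof=1) / np.sqrt(n), size=K)
sigma_draws = np.abs(rng.normal(y.std(ddof=1), 0.15, size=K))
y_rep = rng.normal(mu_draws[:, None], sigma_draws[:, None], size=(K, n))

# Test quantity T(y): the sample maximum, a common check of whether
# the model can reproduce the extremes of the observed data.
T_obs = y.max()
T_rep = y_rep.max(axis=1)

# Monte Carlo estimate of the Bayesian p-value:
# the fraction of replications at least as extreme as observed.
p_B = float(np.mean(T_rep >= T_obs))
```

Values of `p_B` very close to 0 or 1 indicate that the model systematically fails to reproduce this aspect of the data.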

## Choice of test statistics

An important point in applied statistics is that a model can be adequate for some purposes and inadequate for others, so it is important to choose test statistics that check characteristics of the model relevant to the given application. One should guard against errors in the model that would have serious consequences for the objective under study. The choice of appropriate test statistics is therefore highly dependent on the specific problem at hand.
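The following sketch illustrates how the choice of test statistic matters: skewed (log-normal) data are checked against replications from a deliberately misspecified normal model. A check based on the mean looks fine, while a check based on skewness exposes the misfit. The data-generating setup, the faked posterior draws, and the `skewness` helper are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Skewed data, but replications from a (misspecified) normal model.
n, K = 200, 1000
y = rng.lognormal(mean=0.0, sigma=1.0, size=n)

# Hypothetical posterior draws for the normal model's (mu, sigma).
mu_draws = rng.normal(y.mean(), y.std(ddof=1) / np.sqrt(n), size=K)
sigma_draws = np.abs(rng.normal(y.std(ddof=1), 0.05, size=K))
y_rep = rng.normal(mu_draws[:, None], sigma_draws[:, None], size=(K, n))

def skewness(x, axis=None):
    """Standardized third moment along the given axis."""
    m = x.mean(axis=axis, keepdims=True)
    s = x.std(axis=axis, keepdims=True)
    return (((x - m) / s) ** 3).mean(axis=axis)

# T = mean: the normal model matches the mean, so p_B is moderate.
p_mean = float(np.mean(y_rep.mean(axis=1) >= y.mean()))

# T = skewness: normal replications are symmetric while the data are
# right-skewed, so p_B is near 0, flagging the misfit.
p_skew = float(np.mean(skewness(y_rep, axis=1) >= skewness(y)))
```

A single test statistic can thus declare the model "adequate" while another reveals a failure, which is why the statistics should target the features that matter for the application.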

## References

[1] Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2003). *Bayesian Data Analysis*. CRC Press (Chapter 6).