I am trying to determine whether my count response data are too overdispersed for a Bayesian Poisson model fitted with brms. I simulated a response from a Poisson-distributed predictor, with low and high levels of added noise/dispersion, and fitted a negative binomial model to each:

    library(data.table)
    library(brms)
    set.seed(72)
    # Poisson-distributed predictor
    dt <- data.table(predictor = rpois(60, lambda = 15))
    hist(dt[, predictor], breaks = 10)
    # response with a low level of noise
    dt[, response := predictor + round(rnorm(nrow(dt), 0, sd = 1))]
    hist(dt[, response], breaks = 10)
    # response with a high level of noise (abs() keeps the counts non-negative)
    dt[, response_overdisp := abs(predictor + round(rnorm(nrow(dt), 0, sd = 10)))]
    hist(dt[, response_overdisp], breaks = 10)
    # negative binomial fits for both responses
    bm0.nb <- brm(response ~ predictor, dt, family = "negbinomial")
    bm0.over.nb <- brm(response_overdisp ~ predictor, dt, family = "negbinomial")

There is already a similar (unanswered) question, but it relates to how to do this in JAGS.
On GitHub, I saw the advice of the author of the brms package to compare the Poisson model against one that includes an observation-level random effect, based on information criteria, but that seems like only an indirect indication of overdispersion.
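To make that suggestion concrete, here is a minimal sketch of how I understand the comparison. The column name `obs` and the object names `bm.pois`, `bm.olre` and `loo_cmp` are my own, and using `loo()` as the information criterion is an assumption about what was meant:

```r
library(data.table)
library(brms)

# Re-create the simulated data (same seed and same order of random draws as above)
set.seed(72)
dt <- data.table(predictor = rpois(60, lambda = 15))
dt[, response := predictor + round(rnorm(nrow(dt), 0, sd = 1))]
dt[, response_overdisp := abs(predictor + round(rnorm(nrow(dt), 0, sd = 10)))]

# Hypothetical observation-level identifier: one level per row
dt[, obs := .I]

# Plain Poisson model
bm.pois <- brm(response_overdisp ~ predictor, data = dt, family = poisson())

# Poisson model with an observation-level random intercept
bm.olre <- brm(response_overdisp ~ predictor + (1 | obs), data = dt,
               family = poisson())

# Compare the two fits by approximate leave-one-out cross-validation
loo_cmp <- loo_compare(loo(bm.pois), loo(bm.olre))
print(loo_cmp)
```

A clearly better expected log predictive density for `bm.olre` would point towards overdispersion. I would expect `loo()` to warn about high Pareto k values for the observation-level model, so the comparison may need some care.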
So I was wondering whether there are other approaches that give more direct evidence of overdispersion in Bayesian models, for example using the posterior samples of the estimated shape parameter (which, I thought, could be equivalent to what DHARMa does)?
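For reference, this is how I understand the link between the shape parameter and dispersion. As far as I can tell from the brms documentation, the `negbinomial` family is parameterized by a mean $\mu_i$ and a shape, which I write as $\phi$ here (my notation), so that

$$
\operatorname{Var}(y_i) = \mu_i + \frac{\mu_i^2}{\phi},
\qquad
\frac{\operatorname{Var}(y_i)}{\mu_i} = 1 + \frac{\mu_i}{\phi}.
$$

The Poisson model is the limit $\phi \to \infty$, where the variance-to-mean ratio equals 1.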
What I tried so far is to extract the predicted means and the shape parameter posterior distribution, compute the dispersion parameter, plot it, and test the probability that it is greater than 1:

    model0 <- bm0.nb  # different models can be tested
    # posterior draws of the negative binomial shape parameter
    shape_post <- as_draws_df(model0)$shape
    # per-draw mean of the posterior predictive counts, averaged over observations
    means_post <- rowMeans(posterior_predict(model0))
    # dispersion = variance / mean = 1 + mean / shape
    dispersion_post <- 1 + means_post / shape_post
    hist(dispersion_post, xlim = c(0.9, max(dispersion_post)))
    abline(v = 1, col = "red")
    hypothesis(model0, paste("1+", mean(means_post), "/shape>1", sep = ""), class = NULL)

It turns out that, since the dispersion parameter has a lower bound of 1 and can only be greater, I always find that its credible interval is above 1. I understand that if I set a different threshold for the dispersion parameter (say, 1.2), I could determine overdispersion based on that, but I found no consensus threshold value, and other dispersion tests seem to simulate or compute dispersion parameters that can go below 1. So I did not post this as an answer because it is not satisfactory yet.
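One statistic that is not bounded at 1 would be a Pearson-type dispersion computed from a plain Poisson fit and compared with data replicated from the posterior predictive distribution. Below is a rough sketch of what I have in mind, not an established recipe: the helper `pearson_disp` and the object names are hypothetical, and dividing by `n - 2` is my own choice for a model with two coefficients.

```r
library(data.table)
library(brms)

# Re-create the simulated data (same seed and same order of random draws as above)
set.seed(72)
dt <- data.table(predictor = rpois(60, lambda = 15))
dt[, response := predictor + round(rnorm(nrow(dt), 0, sd = 1))]
dt[, response_overdisp := abs(predictor + round(rnorm(nrow(dt), 0, sd = 10)))]

# Plain Poisson fit; its dispersion is what we want to check
bm.pois <- brm(response_overdisp ~ predictor, data = dt, family = poisson())

y    <- dt$response_overdisp
mu   <- posterior_epred(bm.pois)    # draws x observations: expected counts
yrep <- posterior_predict(bm.pois)  # draws x observations: replicated counts

# Hypothetical helper: Pearson chi-square divided by residual degrees of freedom
pearson_disp <- function(y, mu, n_coef = 2) {
  sum((y - mu)^2 / mu) / (length(y) - n_coef)
}

n_draws <- nrow(mu)
# Dispersion of the observed data, one value per posterior draw
disp_obs <- vapply(seq_len(n_draws),
                   function(s) pearson_disp(y, mu[s, ]), numeric(1))
# Dispersion of data replicated under the Poisson model, same draws
disp_rep <- vapply(seq_len(n_draws),
                   function(s) pearson_disp(yrep[s, ], mu[s, ]), numeric(1))

hist(disp_obs)
abline(v = 1, col = "red")

# Share of draws in which the observed data are more dispersed than the replicates
p_over <- mean(disp_obs > disp_rep)
p_over
```

Here `disp_obs` can fall below 1 as well as above it, and a value of `p_over` close to 1 would suggest that the Poisson model cannot reproduce the spread in the data.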

