I have a data frame that contains three categorical predictors and one numerical response. I would like to compare the categories of each predictor using posterior uncertainty intervals from MCMC draws. The reason is that the data has many outliers that affect the distribution and frequentist summaries, but they are likely too important to exclude (i.e. they reflect an actual effect). So I would like to express my findings in a way that says: there is an n% probability that my interval captures the true effect.
I am still pretty new to Bayesian methods, so what I have done relies mostly on online material. I have tried using mcmc_intervals from the bayesplot package; the code is:

    library(rstanarm)   # provides stan_glm()
    library(bayesplot)  # provides mcmc_intervals()

    # One model per categorical predictor
    fit1 <- stan_glm(proportion ~ plot, data = faci_2, iter = 1000, seed = 0512)
    fit2 <- stan_glm(proportion ~ year, data = faci_2, iter = 1000, seed = 0512)
    fit3 <- stan_glm(proportion ~ type, data = faci_2, iter = 1000, seed = 0512)

    # Extract the posterior draws
    posterior1 <- as.array(fit1)
    posterior2 <- as.array(fit2)
    posterior3 <- as.array(fit3)

    # Plot the posterior uncertainty intervals
    mcmc_intervals(posterior1)
    mcmc_intervals(posterior2)
    mcmc_intervals(posterior3)

I modelled them separately to emphasize the individual effects (debatable, yes). I am thinking about controlling for the other factors once the current issue has been sorted out. Also, I have not specified any prior (the intent is an uninformative one).
So what I get is a graph that looks like this: [image: mcmc_intervals plot]
However, I would like to include all categories in the CI graph, rather than having one of them treated as a baseline. I imagine the categories are treated here the same way as dummy variables in a regression, so that one of them is actually represented by the intercept, but the result is not very intuitive. Can someone show me the correct way to compare all of them using credible intervals? Or should I be using some other method to reach my goal? Thank you in advance, and pardon me if I am not mathematically coherent here.
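
To make the dummy-coding point concrete, here is a minimal base R sketch. It assumes a made-up factor with levels A, B and C and uses simulated numbers as a stand-in for posterior draws, so none of it comes from faci_2 or from a fitted model. It shows that the first level gets no column of its own, and one possible way to turn "intercept plus offset" draws into one interval per level.

```r
# Toy illustration only: the factor levels and the simulated "posterior draws"
# below are hypothetical; they are not taken from faci_2 or a fitted model.
set.seed(512)

# 1. Treatment (dummy) coding: the first level (A) has no column of its own,
#    so it is represented by the intercept.
plot_f <- factor(c("A", "B", "C", "A", "B", "C"))
X <- model.matrix(~ plot_f)
print(colnames(X))

# 2. Stand-in for a matrix of posterior draws: one row per draw,
#    one column per coefficient (intercept plus two offsets).
n_draws <- 2000
draws <- cbind(
  "(Intercept)" = rnorm(n_draws, mean = 0.40, sd = 0.03),
  "plotB"       = rnorm(n_draws, mean = 0.10, sd = 0.04),
  "plotC"       = rnorm(n_draws, mean = -0.05, sd = 0.04)
)

# 3. Per-level means, computed draw by draw:
#    baseline = intercept, other levels = intercept + their offset.
level_means <- cbind(
  A = draws[, "(Intercept)"],
  B = draws[, "(Intercept)"] + draws[, "plotB"],
  C = draws[, "(Intercept)"] + draws[, "plotC"]
)

# 4. Median and central 90% interval for each level.
intervals <- t(apply(level_means, 2, quantile, probs = c(0.05, 0.50, 0.95)))
print(round(intervals, 3))
```

With a real fit, something like as.matrix(fit1) would presumably take the place of the simulated draws, with column names that depend on the actual factor levels, and a matrix such as level_means should be usable as input to mcmc_intervals, since it accepts a matrix of draws. Another option that may be worth trying is removing the intercept in the formula (proportion ~ 0 + plot), so that each coefficient is a level mean directly, although the priors would then apply to the level means rather than to offsets.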
