
I am using R to build the random-effects structure of my model, but I am ending up with a very complex model. It currently looks like this:

Model <- lmer(x ~ y * z * d * k + (1 + y * z + d | subject), data = Data, REML = FALSE, control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 100000))) 

I would like to know if I am simply overfitting. How can I get a predictive R-squared for linear mixed-effects models? Is there a way to calculate these values?

I am aware of the MuMIn package for getting R-squared values, but since I am concerned about overfitting, I wanted to see whether the degrees of freedom are biasing the AIC and p-values too much when comparing the models with anova().
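As a starting point for the anova() comparison mentioned above, here is a minimal sketch (assuming your own Data frame and the variable names from the model formula; the reduced random-effects structure shown is just one plausible simplification, dropping the y:z random slope):

```r
# Sketch: compare the full random-effects structure against a simpler
# one with a likelihood-ratio test, AIC, and BIC (models fit with ML,
# i.e. REML = FALSE, so the likelihoods are comparable).
library(lme4)

ctrl <- lmerControl(optimizer = "bobyqa",
                    optCtrl = list(maxfun = 100000))

full <- lmer(x ~ y * z * d * k + (1 + y * z + d | subject),
             data = Data, REML = FALSE, control = ctrl)

# Hypothetical reduced model: random slopes for y, z, d but no y:z slope
reduced <- lmer(x ~ y * z * d * k + (1 + y + z + d | subject),
                data = Data, REML = FALSE, control = ctrl)

anova(reduced, full)   # LRT plus AIC/BIC side by side
AIC(reduced, full)
```

A non-significant LRT (and a lower AIC for the reduced model) would suggest the extra random-effects terms are not earning their degrees of freedom.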

  • R-squared won't help you. You need either CV or a test set. – Commented Sep 16, 2019 at 6:55
  • I want the predictive R-squared, not the ordinary R-squared, since I know R-squared increases with the number of predictors. Do you think AIC would be enough? – Commented Sep 16, 2019 at 14:24
  • You just need to decide what criterion you want to use. AIC, BIC, or AICc are fine as model selection criteria. There is also adjusted R-squared, and there are techniques like LASSO and ridge regression. They don't all agree, so you have to decide which criterion matters for your purposes. – Commented Sep 16, 2019 at 22:21
  • @CatM I don't know how the predictive R-squared is estimated, but given that mixed models already have trouble estimating the ordinary/adjusted R-squared, this may be a hard task. I would use a more common approach, such as AIC or test-set predictions. – Commented Sep 17, 2019 at 11:34

1 Answer


This is an interesting question. An approach for calculating predictive R-squared for linear models is given at this RPubs page, but it won't work directly for mixed models. The car package has influence functions for mixed models, but I couldn't figure out how to adapt them to this purpose. If I understand predictive R-squared correctly, probably the most fruitful approach would be to write a function that removes the data observation by observation, refits the model, and checks how well each refit predicts the dropped observation. I didn't see anything that addresses how to do this specifically for mixed models.
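The leave-one-observation-out idea above could be sketched roughly as follows (a sketch, not a tested implementation: `predictive_r2`, `response`, and the use of your Model/Data are assumptions; allow.new.levels = TRUE covers the case where dropping a row removes a subject's only observation):

```r
# Sketch of a leave-one-out predictive R-squared for an lmer fit:
# drop each row, refit, predict the held-out row, then compute
# 1 - PRESS/TSS, by analogy with the linear-model version.
library(lme4)

predictive_r2 <- function(model, data, response) {
  preds <- numeric(nrow(data))
  for (i in seq_len(nrow(data))) {
    refit <- update(model, data = data[-i, ])           # refit without row i
    preds[i] <- predict(refit, newdata = data[i, , drop = FALSE],
                        allow.new.levels = TRUE)        # predict row i
  }
  press <- sum((data[[response]] - preds)^2)            # prediction error SS
  tss   <- sum((data[[response]] - mean(data[[response]]))^2)
  1 - press / tss
}

# predictive_r2(Model, Data, "x")
```

Note this refits the model n times, which can be slow for a model this complex; a lower maxfun or a simpler random-effects structure may be needed in practice.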

  • Yes, but the removal should probably be group-by-group rather than observation-by-observation (and good luck with crossed random effects). – Commented May 30, 2021 at 19:48
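The group-by-group variant suggested in the comment might look like this (again a hedged sketch: `logo_r2` and its arguments are hypothetical names; re.form = NA makes each held-out subject's prediction use fixed effects only, since the refit has no random effects for an unseen subject):

```r
# Sketch of leave-one-subject-out cross-validation: drop all rows for
# one subject, refit, and predict that subject at the population level.
library(lme4)

logo_r2 <- function(model, data, response, group) {
  preds <- numeric(nrow(data))
  for (g in unique(data[[group]])) {
    idx <- data[[group]] == g
    refit <- update(model, data = data[!idx, ])          # drop the group
    preds[idx] <- predict(refit, newdata = data[idx, , drop = FALSE],
                          re.form = NA)                  # fixed effects only
  }
  press <- sum((data[[response]] - preds)^2)
  tss   <- sum((data[[response]] - mean(data[[response]]))^2)
  1 - press / tss
}

# logo_r2(Model, Data, "x", "subject")
```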
