1
$\begingroup$

I've got a dataset describing the infestation of an insect on plants in a very large area. The data were collected using several transects. Each transect is composed of 4 plots, with multiple plants monitored at each plot.

One single plant describes one datapoint. The dependent variable provides binary information on whether or not the plant was infested (1 = infestation; 0 = no infestation).

There are two independent variables: height above sea level and another continuous parameter describing the light situation at each plot.

Since I have some nested random factors (transect/plot design) and a binary dependent variable, I am currently using R to try to fit a binary mgcv::gamm() model to these data.

However, I recently learned that when checking the assumptions for using a binary GAM, the simulated residuals are important (not just the standardised residuals obtained using gam.check() because they do not carry much information). Simulated residuals for a mgcv::gam() can be checked using the DHARMa package and the simulateResiduals() function.

Unfortunately, I am currently stuck as DHARMa::simulateResiduals() does not work with mgcv::gamm() objects due to the random factors included in the model structure. Nevertheless, I need to include them because of my sampling design.

Is there any way to check the simulated residuals for an mgcv::gamm() object?

My code looks like this:

set.seed(1) mod2 <- gamm(BK ~ s(light) + s(height_asl), data = plant_level, family=binomial , random=list(transect_ID = ~ 1, plot_ID = ~ 1)) 

The results look plausible when plotted. However, I am unsure about the statistics behind this.

$\endgroup$
1
  • $\begingroup$ Unless you need to model auto-correlation or some more complex random effects as implemented in nlme, I don't see any reason to ever use gamm. You can fit your model with mgcv::gam. $\endgroup$ Commented Nov 17 at 13:46

1 Answer 1

0
$\begingroup$

Use the gam() component only for DHARMa checks:

library(DHARMa) sim_res <- simulateResiduals(fittedModel = mod2$gam, n = 1000) plot(sim_res) 

Or switch to gamm4:

library(gamm4) mod_gamm4 <- gamm4(BK ~ s(light) + s(height_asl), random = ~(1|transect_ID/plot_ID), family = binomial, data = plant_level) library(DHARMa) sim_res <- simulateResiduals(fittedModel = mod_gamm4$gam) plot(sim_res) 
$\endgroup$
2
  • $\begingroup$ Thank you for your answer! Unfortunately, I still don't understand the statistical background. Doesn't this approach ignore the random structure of the data due to the usage of only the gam() component? $\endgroup$ Commented Nov 17 at 13:29
  • $\begingroup$ Apart from the statistical background, I have tried both approaches, but neither works with DHARMa::simulateResiduals() Error in UseMethod("family"): no applicable method for 'family' applied to an object of class "gam" $\endgroup$ Commented Nov 18 at 12:30

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.