$\begingroup$

My design has a total of 20 sites. 5 sites belong to each of four land covers: A, B, C and D. In each site, I have 5 sampling locations, 2 metres from each other. From each sampling location, I collected soil and measured its phosphatase activity.

I want to test for the effect of Land Cover type on soil phosphatase activity. Note that the land covers are NOT nested within a site. Rather, 5 sampling locations (pseudoreplicates) are nested within each site. Total 5 sites per land cover type. Total 4 land cover types.

I am confused regarding what exactly should be the random effects structure. Preliminary data exploration shows that the inter-Site variation is different in the different land cover types. What I understand from this is that the random effects should be different for the 4 land cover types.

In this case, what model specification do I need to use in nlme?

Activity ~ LandCoverType, random = 1|Site?

or

Activity ~ LandCoverType, random = LandCoverType|Site?

or some other way?

I am confused because I have read that the second kind of model specification means that the effect of LandCoverType on Activity depends on the Site.... However, this doesn't seem to make sense in my case, since I do not have all four land cover types in each Site!

Please Help!!!


Thanks for replying, rw2. You are right in thinking that I am not testing for inter-site variation. However, should I not try to account for the differences in inter-site variation for the different land covers? As in, inter-site variation in Activity is different for the land cover types A, B, C and D in my data...

In my understanding, the random effect (~1|Site) gives me the standard deviation of the distribution from which a random value will be selected, added to the fixed effect estimate of the LandCoverType, to get a site-level value of Activity. However, in this random effect specification, the random value will be selected from the same distribution, irrespective of the land cover type the site belongs to. If the inter-site variation in Activity is different for the different land cover types, then shouldn't this random value be taken from distributions of different standard deviations for different land cover types?

$\endgroup$

2 Answers

$\begingroup$

Just to give a more general answer for this sort of question, which is very common on this site:

The OP describes sampling locations (measurements) nested within sites, so observations are clustered by site. Nothing else is recorded about each observation except LandCoverType, which varies at the Site level, not the observation level. That is, a site can have only one LandCoverType; stated another way, LandCoverType is a characteristic of a site, not of a single observation.

Therefore, the only specification that can work is random = 1|Site. Since LandCoverType cannot vary within a Site, it makes no sense to use random = LandCoverType|Site, which allows the effect of LandCoverType on Activity to vary across sites. With random = LandCoverType|Site, we are in effect fitting a little regression of Activity on LandCoverType within each Site and combining the results (as a precision-weighted average) into an estimate of the overall linear relationship.

Stated mathematically, Activity ~ LandCoverType, random = 1|Site represents:

$$ Activity_{ij} = \beta_{0j} + r_{ij} \\ \beta_{0j} = \gamma_{00} + \gamma_{01}LandCover + u_{0j} $$

or

$$ Activity_{ij} = \gamma_{00} + \gamma_{01}LandCover + u_{0j} + r_{ij} $$

where $\gamma_{01}$ represents the linear relationship between LandCoverType and mean Activity per site, and $u_{0j}$ is the site-level random-effect.
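Because LandCoverType is a four-level factor, the single term $\gamma_{01}LandCover$ is shorthand. As a sketch, assuming treatment (dummy) coding with A as the reference level (R's default for unordered factors, but an assumption here), and writing $B_j$, $C_j$, $D_j$ for hypothetical 0/1 indicators of site $j$'s land cover, the combined equation expands to:

$$ Activity_{ij} = \gamma_{00} + \gamma_{01}B_j + \gamma_{02}C_j + \gamma_{03}D_j + u_{0j} + r_{ij}, \qquad u_{0j} \sim N(0, \tau_{00}), \quad r_{ij} \sim N(0, \sigma^2) $$

Under this coding, $\gamma_{00}$ is the mean Activity for land cover A, and $\gamma_{01}$, $\gamma_{02}$, $\gamma_{03}$ are the differences of B, C and D from A. The symbols $\tau_{00}$ (between-site variance) and $\sigma^2$ (within-site variance) are notation introduced here for illustration. Note that a single $\tau_{00}$ is shared by all four land covers.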

This specification: Activity ~ LandCoverType, random = LandCoverType|Site implies moving LandCoverType to level 1:

$$ Activity_{ij} = \beta_{0j} + \beta_{1j}LandCover + r_{ij} \\ \beta_{0j} = \gamma_{00} + u_{0j} \\ \beta_{1j} = \gamma_{10} + u_{1j} $$

or

$$ Activity_{ij} = \gamma_{00} + u_{0j} + \gamma_{10}LandCover + u_{1j}LandCover + r_{ij} $$

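Now, there is a random effect allowing variation in mean Activity and another that allows the linear relationship between LandCoverType and Activity to vary across sites. If there is only one LandCoverType per site, there is no within-site contrast from which to estimate that second random effect, so $\text{Var}(u_{1j})$ is not identifiable (effectively $\text{Var}(u_{1j}) = 0$).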
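Whether LandCoverType is constant within each site can be checked directly from the data. The following is a minimal sketch in base R; the data frame `design` and its site labels are made up to mirror the layout described in the question (20 sites, 5 locations each), not the OP's actual data.

```r
# Hypothetical design table mirroring the question:
# 4 land covers x 5 sites per cover x 5 sampling locations per site.
design <- data.frame(
  Site          = factor(rep(sprintf("S%02d", 1:20), each = 5)),
  LandCoverType = factor(rep(c("A", "B", "C", "D"), each = 25))
)

# Number of distinct land covers observed within each site.
covers_per_site <- tapply(design$LandCoverType, design$Site,
                          function(x) length(unique(x)))

# TRUE when every site has exactly one land cover, i.e. LandCoverType is a
# site-level (level-2) variable and a random slope for it is not estimable.
all(covers_per_site == 1)

# Sites-by-cover cross-tabulation: each row has a single non-zero cell.
table(design$Site, design$LandCoverType)
```

If any site showed more than one land cover, the predictor would vary within sites and the argument above would no longer apply as stated.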

The correct specification 1|Site corresponds to the "Means as Outcomes" model described by Raudenbush & Bryk (2002).
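As a rough illustration of the random-intercept specification in nlme, here is a sketch on simulated data. Everything about the data is an assumption made for the example: the site labels, the land cover means, the two standard deviations and the seed are invented, and only the design (4 land covers, 5 sites each, 5 locations per site) follows the question. Note that `lme()` expects a one-sided formula, so the shorthand 1|Site is written `random = ~ 1 | Site`.

```r
library(nlme)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Simulated stand-in for the design: one row per site first.
site_info <- data.frame(
  Site          = factor(sprintf("S%02d", 1:20)),
  LandCoverType = factor(rep(c("A", "B", "C", "D"), each = 5))
)
cover_mean  <- c(A = 10, B = 12, C = 9, D = 11)  # made-up land cover means
site_effect <- rnorm(20, sd = 1)                 # u_0j: one draw per site

# Expand to 5 sampling locations per site and build the response.
dat <- site_info[rep(1:20, each = 5), ]
dat$Activity <- unname(cover_mean[as.character(dat$LandCoverType)]) +
  site_effect[as.integer(dat$Site)] +
  rnorm(nrow(dat), sd = 0.5)                     # r_ij: location-level noise

# Random intercept for Site; LandCoverType enters only as a fixed effect.
fit <- lme(Activity ~ LandCoverType, random = ~ 1 | Site, data = dat)

fixef(fit)    # intercept (land cover A) and three contrasts against A
VarCorr(fit)  # between-site and residual variance components
anova(fit)    # F-test for the overall LandCoverType effect
```

The fixed-effect contrasts are against land cover A only because A is the first factor level under R's default coding; releveling the factor changes the reference, not the overall test.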

$\endgroup$
$\begingroup$

If you initially set out to test the hypothesis that LandCoverType has an effect on Activity, then the first model you specified ( Activity ~ LandCoverType, random = 1|Site ) is the correct one.

Check the usual regression assumptions (linearity, homogeneity of variance, etc.).
But since you didn't set out to test for inter-site variation, I wouldn't worry too much about it.
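One way such checks might look for an nlme fit is sketched below. The data are simulated purely for illustration (invented seed, means and standard deviations), and the particular diagnostics are a suggestion, not something prescribed in this thread.

```r
library(nlme)

set.seed(7)  # arbitrary seed for reproducibility

# Simulated stand-in data: 20 sites, 5 locations each, 4 land covers.
dat <- data.frame(
  Site          = factor(rep(sprintf("S%02d", 1:20), each = 5)),
  LandCoverType = factor(rep(c("A", "B", "C", "D"), each = 25))
)
dat$Activity <- 10 + rep(rnorm(20), each = 5) + rnorm(100, sd = 0.5)

fit <- lme(Activity ~ LandCoverType, random = ~ 1 | Site, data = dat)

# Standardized residuals vs fitted values, one panel per land cover:
# look for funnels or very different spreads between panels.
plot(fit, resid(., type = "p") ~ fitted(.) | LandCoverType, abline = 0)

# Normal Q-Q plots for the residuals and for the site random intercepts.
qqnorm(fit, ~ resid(., type = "p"))
qqnorm(fit, ~ ranef(.))

# Numeric companion to the first plot: residual SD within each land cover.
resid_sd <- tapply(resid(fit, type = "pearson"), dat$LandCoverType, sd)
resid_sd
```

With only 5 sites per land cover, the Q-Q plot of the random intercepts has few points per group, so it is best read as a rough check.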

$\endgroup$