I have a 3-level repeated-measures setup: observations (level-1) nested within patients (level-2), nested within providers (level-3). One of my questions concerns the percentage of variance in the dependent variable (outcome) explained at level-2 versus level-3, which I am assessing with the intraclass correlation coefficient, ICC (ρ).
I am trying to fit a model where I partition the variance in my level-2 and level-3 random effects by each client's racial status (white/non-white). I could subset my data, run three separate models, and measure the ICCs individually, but I am hoping to do this in one model, which would also let me test the significance of the variance estimates by dropping one at a time and running likelihood-ratio tests, e.g. anova(model1, model2).
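For concreteness, the subset-and-refit version I would like to avoid looks roughly like this (a sketch only; it assumes dat has one row per observation with columns outcome, id, provider, and rem, and the helper function is just for illustration):

```r
library(lme4)

## Fit the same random-intercept model to each subsample (sketch only).
fit_icc_model <- function(d) lmer(outcome ~ 1 + (1 | id) + (1 | provider), data = d)

m_total <- fit_icc_model(dat)
m_white <- fit_icc_model(subset(dat, rem == "White"))
m_rem   <- fit_icc_model(subset(dat, rem == "REM"))
```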
Here is my baseline (random-intercept) model for calculating the ICCs on the complete sample (note there is no cross-classification: each patient has their own provider, and no two providers see the same patient):
```r
model1 <- lmer(outcome ~ (1|id) + (1|provider), data = dat)
```
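From model1 I get the variance shares along these lines (a sketch of the calculation; pulling the components out of VarCorr() by hand is just one way to do it):

```r
## Pull the variance components out of the fitted model.
vc <- as.data.frame(VarCorr(model1))
v_id       <- vc$vcov[vc$grp == "id"]        # patient-level (level-2) variance
v_provider <- vc$vcov[vc$grp == "provider"]  # provider-level (level-3) variance
v_resid    <- vc$vcov[vc$grp == "Residual"]  # observation-level (level-1) variance
v_total    <- v_id + v_provider + v_resid

icc_patient  <- v_id / v_total        # share of variance at level-2
icc_provider <- v_provider / v_total  # share of variance at level-3
```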
Rather than subset the data and run three models, I want to run one model and measure both client and therapist variability in the outcome for white versus non-white patients. I ran this model:
```r
model2 <- lmer(outcome ~ rem + (1|id/rem) + (1|provider/rem), data = dat)
```
Note: `rem` (racial-ethnic minority) is coded as a factor with levels "White" and "REM". I have about 1000 White patients, 500 REM patients, and 39 providers (each provider has seen at least 5 patients).
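In case it helps, the grouping structure can be checked with something like this (assuming one row per observation and that `rem` is constant within patient):

```r
with(dat, table(rem[!duplicated(id)]))        # patients per group (~1000 White, ~500 REM)
length(unique(dat$provider))                  # 39 providers
colSums(with(dat, table(provider, rem)) > 0)  # providers seeing at least one patient from each group
```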
My understanding is that if I do not keep both random-effects terms as intercept-only, (1|...), the model will estimate covariances, which I do not want because I would then not be able to estimate the ICCs.
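For contrast, my understanding is that a specification like the one below would estimate an intercept variance, a REM-offset variance, and their covariance at the provider level, which is what I am trying to avoid (shown only to illustrate the point, not a model I want to fit):

```r
## Not what I want: (rem | provider) adds a correlation between the
## provider intercept and the provider-specific REM offset.
model_cov <- lmer(outcome ~ rem + (1 | id) + (rem | provider), data = dat)
VarCorr(model_cov)  # the provider term includes a Corr column
```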
Model 2 output is as follows:
```
REML criterion at convergence: 13694.2

Scaled residuals: 
      Min        1Q    Median        3Q       Max 
-10.2143   -0.3806    0.0751    0.4585    5.6865 

Random effects:
 Groups       Name        Variance Std.Dev.
 rem:id       (Intercept) 0.21617  0.46494 
 id           (Intercept) 0.36515  0.60428 
 rem:provider (Intercept) 0.05808  0.24100 
 provider     (Intercept) 0.00515  0.07176 
 Residual                 0.17966  0.42386 
Number of obs: 8606, groups:  rem:id, 1504; id, 1504; rem:provider, 78; provider, 39

Fixed effects:
              Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)  5.9928132  0.0483799 66.1700000 123.870   <2e-16 ***
remrem      -0.0005091  0.0740942 52.5400000  -0.007    0.995    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
       (Intr)
remrem -0.615
```

I remain with a few questions:
- How do I interpret the Groups under Random effects, e.g., `rem:id` versus `id`? It looks like there are two intercept variances involving my patient identifier, but I do not know whether `rem:id` is across rem status and `id` is across all patients. If that is the case, I do not believe this model is right for my question; I am looking for something like `white:id` and `rem:id`, and maybe that is not possible with this model.
- The numbers of observations and groups look mostly right; however, for `rem:provider` the number is double the number of providers in my sample, and `rem:id` is equal to the total number of patients, the same as `id`. In my sample, White is about 1000 and REM is about 500. Is this indicative of an error in my model, or is it just not assessing these two groups separately?
- My residual variance in both models looks to be forced into one estimate. Is there a way to partition this variance across level-1 and level-2? This will be critical in calculating the ICCs (see the extraction sketch after this list).
- Slight aside: do I need to include `rem` as a fixed effect as well? How does including versus not including it affect the variance estimation?
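For reference, this is how I have been pulling the variance estimates out of model2 (the grp labels match the Groups column in the summary above):

```r
vc2 <- as.data.frame(VarCorr(model2))
vc2[, c("grp", "vcov")]  # rows: rem:id, id, rem:provider, provider, Residual
```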
In the end, I am hoping to calculate the variance explained at all 3 levels (patient, provider, residual) in 3 samples (total sample, White, REM), for a total of 9 ICCs.
The best response I have found on how to specify my random effects is Ben Bolker's response here, but I am still having trouble interpreting the model results and pulling out the variance estimates to calculate the ICCs.
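In case it clarifies what I am aiming for, my reading of that answer is that group-specific variances could be requested with lme4::dummy(), along these lines (my interpretation only, and untested on my data):

```r
## My reading of that approach (untested): separate random-intercept variances
## for White and REM patients at both the patient and the provider level,
## with no covariances estimated between them.
model3 <- lmer(outcome ~ rem +
                 (0 + dummy(rem, "White") | id)       + (0 + dummy(rem, "REM") | id) +
                 (0 + dummy(rem, "White") | provider) + (0 + dummy(rem, "REM") | provider),
               data = dat)
VarCorr(model3)  # four group-specific variances plus the residual
```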