Linear Mixed Effect Model - random intercept and slope? Identifiability problems

Question

I have a question about model building for a large dataset of about 5000 subjects. I want to fit a linear mixed-effects model (LMEM) with multiple variables, and I have repeated measurements over time. For some of the subjects (around 1200, i.e. just under 25%), however, I have only one measurement. This was no problem when fitting a simple LMEM with only a random intercept, as the dataset is large enough. However, I ran into identifiability problems and non-convergence when I added a random slope to the model. So I'm wondering which is more common: removing the subjects that provide only one measurement and estimating a model with a random intercept and slope, or keeping the full dataset as it is and using only a random intercept.

Actually, the fixed-effect results are quite similar under both options, but I want to take the correct and more standard approach. I am really wondering how to decide between a random intercept only and a random intercept plus a random slope.
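
For concreteness, the two candidate models can be written out as below. The notation is an assumption made for illustration and does not come from the post: $y_{ij}$ is measurement $j$ on subject $i$, $t_{ij}$ is its time, and the other covariates are omitted.

$$
\begin{aligned}
\text{random intercept:}\quad & y_{ij} = \beta_0 + \beta_1 t_{ij} + u_{0i} + \varepsilon_{ij},\\
\text{random intercept and slope:}\quad & y_{ij} = \beta_0 + \beta_1 t_{ij} + u_{0i} + u_{1i}\, t_{ij} + \varepsilon_{ij},
\end{aligned}
$$

where $\varepsilon_{ij} \sim N(0, \sigma^2)$ and the random effects $(u_{0i}, u_{1i})$ are usually taken to be zero-mean normal. Compared with the first model, the second has to estimate two additional variance parameters, the variance of $u_{1i}$ and its covariance with $u_{0i}$.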

Thanks a lot!

Robert Long · Accepted Answer · 2020-07-02 19:11:57Z


First, I would almost always advise against deleting observations for any reason, and in your case I definitely advise against it. By deleting observations you lose statistical power but, more importantly, you can introduce bias.

Think for a moment about what it means to fit random slopes. It means that you allow the slope for a fixed effect to vary by subject. In other words, each subject gets its own slope for that variable. So in the case where a subject has only one observation, what slope could it have? To make sense of fitting a slope, a subject needs at least two observations, taken at different values of the variable. Mixed models are robust to small cluster sizes, but when a large proportion of the clusters are singletons it doesn't make sense to fit random slopes.
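
The point about singleton subjects can be illustrated with a small sketch. It is not a mixed model: it only computes an ordinary least-squares slope within each subject, to show that the slope is undefined when a subject has a single observation. The function name, the `(subject, time, y)` row layout, and the toy data are all made up for illustration.

```python
from collections import defaultdict

def per_subject_slopes(records):
    """Return {subject: within-subject OLS slope of y on time, or None}.

    `records` is an iterable of (subject, time, y) tuples, a hypothetical
    long-format layout with one row per measurement.
    """
    by_subject = defaultdict(list)
    for subject, time, y in records:
        by_subject[subject].append((time, y))

    slopes = {}
    for subject, obs in by_subject.items():
        n = len(obs)
        mean_t = sum(t for t, _ in obs) / n
        mean_y = sum(y for _, y in obs) / n
        sxx = sum((t - mean_t) ** 2 for t, _ in obs)
        sxy = sum((t - mean_t) * (y - mean_y) for t, y in obs)
        # With a single observation (or all observations at the same time)
        # sxx is zero, so the slope sxy / sxx is undefined.
        slopes[subject] = sxy / sxx if sxx > 0 else None
    return slopes

# Made-up toy data: subject "C" is a singleton.
records = [
    ("A", 0, 1.0), ("A", 1, 3.0), ("A", 2, 5.0),
    ("B", 0, 2.0), ("B", 2, 1.0),
    ("C", 1, 4.0),
]
slopes = per_subject_slopes(records)
no_slope_share = sum(s is None for s in slopes.values()) / len(slopes)
print(slopes)
print(f"share of subjects without slope information: {no_slope_share:.2f}")
```

In a mixed model the singleton subjects still contribute to the fixed effects and to the intercept variance. They just carry no information of their own about a subject-specific slope, so running a count like this on the real data is a cheap check before adding a random slope.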


Thanks, Robert. Yes, that's exactly what I thought: fitting a slope with just one observation doesn't make sense, of course. That's why I thought I should remove these subjects from my dataset, but your answer convinced me that I should stick to the full dataset and just fit random intercepts. – Kathrin, Jul 2, 2020 at 19:19

@Kathrin Glad to hear it. It's much more important to retain your data than to delete observations in order to fit a more complex model. – Robert Long, Jul 2, 2020 at 19:38

One more question: I just realised that I have more measurements. I removed the values at time zero because I added a variable for the baseline value. Is it OK to use these time-zero measurements even though I use baseline as a covariate? In that case I wouldn't run into identifiability problems even when fitting a random slope. – Kathrin, Jul 3, 2020 at 18:48

@Kathrin That sounds like a good idea to me. It helps identify the random slopes, and note also that regressing follow-up on baseline is often a very dubious thing to do when analysing change. – Robert Long, Jul 4, 2020 at 19:08

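
The idea discussed in the two comments above, putting the time-zero values back into the response, can be sketched as a data-reshaping step. This is a minimal illustration under assumptions: the function names, the `(subject, time, y)` row layout, and the toy values are hypothetical, and the baseline is assumed to be the same quantity as the outcome, measured at time 0.

```python
from collections import Counter

def add_baseline_rows(followup, baseline):
    """Return long-format rows with each subject's baseline as a time-0 row.

    followup: list of (subject, time, y) rows with time > 0
    baseline: dict mapping subject -> baseline value (hypothetical layout)
    """
    rows = [(subject, 0, y0) for subject, y0 in baseline.items()]
    rows.extend(followup)
    return sorted(rows)

def singleton_subjects(rows):
    """Return the subjects that contribute exactly one row."""
    counts = Counter(subject for subject, _, _ in rows)
    return sorted(s for s, n in counts.items() if n == 1)

# Made-up toy data: "B" and "C" each have only one follow-up measurement.
followup = [("A", 1, 3.0), ("A", 2, 5.0), ("B", 2, 1.0), ("C", 1, 4.0)]
baseline = {"A": 1.0, "B": 2.0, "C": 3.5}

print(singleton_subjects(followup))  # singletons before restoring time 0
full = add_baseline_rows(followup, baseline)
print(singleton_subjects(full))      # singletons after restoring time 0
```

Once the time-zero rows are restored, every subject with a baseline value and at least one follow-up has two or more observations at different times. The sketch covers only the reshaping. How to treat the former baseline covariate in the model is a separate decision.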
Thank you once again! Sorry for all these questions, but I'm really not sure. Basically, we are interested in the change from baseline to a later measurement, so we also considered modelling the change instead of the value itself. But I think I need to look for good sources that explain this in more detail. – Kathrin, Jul 4, 2020 at 20:02
