Why are estimated marginal means from a linear mixed / multilevel model and Mixed ANOVA different?

Question

I have trouble understanding the causes of differences in estimated marginal means derived from a multilevel model and from a mixed ANOVA, in particular in a case with only two time points and a categorical between-person predictor.

I always thought that in case of only two time points, Mixed ANOVA is very close to a corresponding multilevel model. However, I have a dataset in which the two procedures produce rather different estimated marginal means in a classic intervention study with two groups and two time-points. I'd like to understand why. I know that multilevel models also utilize data points of those individuals who only participated at time 1, but is that really enough to produce such clear differences in EMMs? And is one set of EMMs more reliable, which should I report?

And what should I think of the fact that the Mixed Anova -based EMMs are closer to sample raw means at T2, but LMM-based EMMs are closer to raw means at T1?

The data:

data_sample<- structure(list(id = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 11L, 11L, 12L, 12L, 13L, 13L, 14L, 14L, 15L, 15L, 16L, 16L, 17L, 17L, 18L, 18L, 19L, 19L, 20L, 20L, 21L, 21L, 22L, 22L, 23L, 23L, 24L, 24L, 25L, 25L, 26L, 26L, 27L, 27L, 28L, 28L, 29L, 29L, 30L, 30L, 31L, 31L, 32L, 32L, 33L, 33L, 34L, 34L, 35L, 35L, 36L, 36L, 37L, 37L, 38L, 38L, 39L, 39L, 40L, 40L, 41L, 41L, 42L, 42L, 43L, 43L, 44L, 44L, 45L, 45L, 46L, 46L, 47L, 47L, 48L, 48L, 49L, 49L, 50L, 50L, 51L, 51L, 52L, 52L, 53L, 53L, 54L, 54L, 55L, 55L, 56L, 56L, 57L, 57L, 58L, 58L, 59L, 59L, 60L, 60L, 61L, 61L, 62L, 62L, 63L, 63L, 64L, 64L, 65L, 65L, 66L, 66L, 67L, 67L, 68L, 68L, 69L, 69L, 70L, 70L, 71L, 71L, 72L, 72L, 73L, 73L, 74L, 74L, 75L, 75L, 76L, 76L, 77L, 77L, 78L, 78L, 79L, 79L, 80L, 80L, 81L, 81L, 82L, 82L, 83L, 83L, 84L, 84L, 85L, 85L, 86L, 86L, 87L, 87L, 88L, 88L, 89L, 89L, 90L, 90L, 91L, 91L, 92L, 92L, 93L, 93L, 94L, 94L, 95L, 95L, 96L, 96L, 97L, 97L, 98L, 98L, 99L, 99L, 100L, 100L, 101L, 101L, 102L, 102L), group = c("I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C"), time = c("T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2"), value = c(3.5, NaN, 1.66666666666667, NaN, 3.5, 4, 3.33333333333333, 3, 2.66666666666667, 3, 1.5, NaN, 5.16666666666667, 4.33333333333333, 1.5, 2.33333333333333, 4.33333333333333, NaN, 3.5, 3, 4.33333333333333, 4.33333333333333, 2.83333333333333, 4.33333333333333, 1.5, NaN, 2.5, 2.16666666666667, 4.66666666666667, 4, 2.5, 2.5, 2, NaN, 1.83333333333333, NaN, 1.83333333333333, 3.16666666666667, 2.66666666666667, NaN, 2.83333333333333, 2.5, 2.33333333333333, 4.66666666666667, 4, 2.16666666666667, 2.66666666666667, 2.16666666666667, 3.66666666666667, 4, 3.33333333333333, 3.66666666666667, 3.16666666666667, NaN, 4.16666666666667, 4.16666666666667, 3.83333333333333, 5.16666666666667, 3, NaN, 1.33333333333333, NaN, 4.33333333333333, NaN, 3.5, 3.33333333333333, 3.16666666666667, 2.83333333333333, 3.16666666666667, 4.16666666666667, 2.16666666666667, 3.83333333333333, 2, NaN, 3.16666666666667, 3.66666666666667, 1.83333333333333, NaN, 1.83333333333333, NaN, 2, NaN, 3.5, NaN, 3.5, NaN, 5.66666666666667, NaN, 3.5, 2.33333333333333, 4, 4, 3.83333333333333, 3.33333333333333, 4.83333333333333, 5.66666666666667, 3.33333333333333, 3.16666666666667, 4.33333333333333, NaN, 2.5, NaN, 5.83333333333333, NaN, 3.33333333333333, 3.83333333333333, 3.16666666666667, 2, 3.5, 3.66666666666667, 3.5, 4.66666666666667, 1, 1.83333333333333, 3.5, 3.5, 2.66666666666667, 2.83333333333333, 2.33333333333333, 2.5, 3.16666666666667, NaN, 1.33333333333333, NaN, 4.16666666666667, NaN, 2.83333333333333, 3.83333333333333, 3.16666666666667, 3.16666666666667, 2.83333333333333, NaN, 3.16666666666667, 4, 2.33333333333333, 1.83333333333333, 4.33333333333333, NaN, 2.5, 2.66666666666667, 2.66666666666667, 2.16666666666667, 2.5, 2, 2.16666666666667, NaN, 2.5, 2.83333333333333, 5, NaN, 5.16666666666667, NaN, 5.16666666666667, NaN, 1.83333333333333, NaN, 2.66666666666667, 3.33333333333333, 4.5, 3.16666666666667, 4.16666666666667, 4.66666666666667, 3.33333333333333, 2.66666666666667, 4.33333333333333, 4.33333333333333, 3.16666666666667, 2.5, 3, NaN, 3, NaN, 2.16666666666667, 2.33333333333333, 3.5, 2.33333333333333, 2.83333333333333, 2, 3.33333333333333, NaN, 2.66666666666667, NaN, 2, 4.33333333333333, 4.83333333333333, NaN, 6, NaN, 3.33333333333333, 4.16666666666667, 3.5, 3.33333333333333, 5.33333333333333, NaN, 2.33333333333333, 1.66666666666667, 2.5, 4.16666666666667, 3, 3.66666666666667, 4.33333333333333, NaN, 2.16666666666667, 2.5)), row.names = c(NA, -204L), class = c("tbl_df", "tbl", "data.frame"))

Raw means:

library(Rmisc) means<-summarySE(data=data_sample, measurevar = "value", groupvars = c("group", "time"), na.rm=T) means group time N value sd se ci 1 C T1 49 3.234694 1.066601 0.1523715 0.3063635 2 C T2 31 3.053763 0.912249 0.1638448 0.3346157 3 I T1 53 3.150943 1.086664 0.1492647 0.2995216 4 I T2 31 3.510753 0.905677 0.1626644 0.3322050

Run the multilevel model

library(lme4) library(emmeans) model_lmm<-lmer(value ~ (1|id) + group*time, data=data_sample) em_lmm<-emmeans(model_lmm, specs = ~time*group) em_lmm time group emmean SE df lower.CL upper.CL T1 C 3.23 0.151 126 2.94 3.53 T2 C 3.25 0.176 156 2.90 3.59 T1 I 3.15 0.145 126 2.86 3.44 T2 I 3.40 0.174 158 3.05 3.74

Run Mixed Anova

library(afex) model_mixed <- aov_car(value ~ (time*group) + Error(id/time), data = data_sample) em_mixed<-emmeans(model_mixed, specs = ~time*group) em_mixed time group emmean SE df lower.CL upper.CL T1 C 2.93 0.142 60 2.65 3.21 T2 C 3.05 0.163 60 2.73 3.38 T1 I 3.33 0.142 60 3.04 3.61 T2 I 3.51 0.163 60 3.18 3.84

You've not shown the warning that is output by the repeated measures ANOVA, whereas that should provide a hint to how these methods differ... in particular, how do each of these handle incomplete data? — PBulls
– PBulls, Commented Nov 18 at 15:57

Erik Ruzek · Accepted Answer · 2025-11-18 16:00:10Z

The problem is due to how aov_car() handles missing data vs. how lmer() handles this problem. The first inclination that something was different was a warning that you get after running the aov_car() model code:

Warning message: Missing values for 40 ID(s), which were removed before analysis: 1, 2, 6, 9, 13, 17, 18, 20, 27, 30, ... [showing first 10 only] Below the first few rows (in wide format) of the removed cases with missing data. id group T1 T2 # 1 1 I 3.500000 NaN # 2 2 I 1.666667 NaN # 6 6 I 1.500000 NaN # 9 9 I 4.333333 NaN # 13 13 I 1.500000 NaN # 17 17 I 2.000000 NaN

Accordingly, aov_car() drops 40 individuals from the data prior to running the ANOVA. I am not sure if one can override this behavior, but it is different to how mixed models fit by maximum likelihood handle the problem of missing data. In lmer() and other similar mixed model routines, missing observation level data is dropped but the group (individual) is retained if they have at least one non-missing observation. This is due to the fact that maximum likelihood assumes a missing at random (MAR) missing mechanism for the outcome variable.

If we remove those individuals from the data with one of their value scores missing, we can recover the marginal means to match those produced from aov_car().

library(dplyr) data_sample <- data_sample |> group_by(id) |> mutate(miss_val = any(is.na(value))) |> ungroup() # Mixed model excluding IDs with any missing outcome data model_lmm_exclude<-lmer(value ~ (1|id) + group*time, data=data_sample, subset = miss_val==FALSE) summary(model_lmm_exclude) em_lmm_exclude<-emmeans(model_lmm_exclude, specs = ~time*group) em_lmm_exclude

This produces the following output,

 time group emmean SE df lower.CL upper.CL T1 C 2.93 0.153 96.9 2.63 3.23 T2 C 3.05 0.153 96.9 2.75 3.36 T1 I 3.33 0.153 96.9 3.02 3.63 T2 I 3.51 0.153 96.9 3.21 3.81

which produces the same marginal means but have different SEs and dfs according to the method used (emmeans() uses Kenward-Roger SEs and dfs when operating on a lmer() model).

> em_mixed time group emmean SE df lower.CL upper.CL T1 C 2.93 0.142 60 2.65 3.21 T2 C 3.05 0.163 60 2.73 3.38 T1 I 3.33 0.142 60 3.04 3.61 T2 I 3.51 0.163 60 3.18 3.84

Thank you very much, I feel silly now, I should have tried the lmm without the missings! — Sointu
– Sointu, Commented Nov 19 at 7:34
Happy to help, @Sointu! And thanks for asking - your question will likely be as helpful to folks as my answer. — Erik Ruzek
– Erik Ruzek, Commented Nov 19 at 14:25

Stack Exchange Network

Why are estimated marginal means from a linear mixed / multilevel model and Mixed ANOVA different?

1 Answer 1

Hot Network Questions

Why are estimated marginal means from a linear mixed / multilevel model and Mixed ANOVA different?

1 Answer 1

Related

Hot Network Questions