Following the same m3 model approach proposed by Gavin Simpson on my data, I compared an ordered factor model (m3_of) to a non-ordered factor model (m3). The global shapes trend to be similar; however the m3_of plot displays a marked wigglyness for both groups.
Question 1: How to explain this difference of wigglyness between both models?
Question 2: Comparing both models using itsadug::compareML(m3, m3_of), the best model is m3 (lower AIC); however, m3_of has a slightly better deviance explained (9.67% vs 9.63%), knowing that such model advantageously allow a more interpretable comparison (groupof=1 compared to the reference groupof=0). Hence, what is the more appropriate model?
Thanks for any help, advice, reference.
Data:
n = 180,000 bmk: outcome, continuous, positive delay: predictor, continuous, positive group: factor (n=2) groupof: ordered factor (n=2, groupof=0 being the reference groupof) medu: random effect (n=91 different medical units) Models:
m3 <- bam(bmk ~ group + s(delay, k=20, by = group) + s(delay, k=20, medu, bs = "fs"), data = dat, method = 'fREML', family = inverse.gaussian(link="identity"), discrete = TRUE) m3_of <- bam(bmk ~ groupof + s(delay, k=20) + s(delay, k=20, by = groupof) + s(delay, k=20, medu, bs = "fs"), data = dat, method = 'fREML', family = inverse.gaussian(link="identity"), discrete = TRUE) Plots:
par(mfrow = c(1,2), cex = 1.1) plot_smooth(m3, view="delay", plot_all="group", rm.ranef=FALSE, n.grid = 50, col=c("blue","red"), xlim=c(0,90), ylim=c(11.5,14.5), main = "m3") plot_smooth(m3_of, view="delay", plot_all="groupof", rm.ranef=FALSE, n.grid = 50, col=c("blue","red"), xlim=c(0,90), ylim=c(11.5,14.5), main = "m3_of") gratia::appraise(m3) Summary(m3):
> summary(m3) Family: inverse.gaussian Link function: identity Formula: bmk ~ group + s(delay, k = 20, by = group) + s(delay, k = 20, medu, bs = "fs") Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 11.76764 0.19840 59.31 <2e-16 *** group1 0.32901 0.02879 11.43 <2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Approximate significance of smooth terms: edf Ref.df F p-value s(delay):group0 1.001 1.002 2.903 0.0882 . s(delay):group1 1.002 1.003 17.792 2.46e-05 *** s(delay,medu) 145.751 1626.000 10.319 < 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 R-sq.(adj) = 0.0792 Deviance explained = 9.63% fREML = -1.8252e+05 Scale est. = 0.0089033 n = 179659 > gam.check(m3) k' edf k-index p-value s(delay):group0 19 1 0.98 0.54 s(delay):group1 19 1 0.98 0.48 s(delay,medu) 1700 146 0.98 0.48 Summary(m3_of):
> summary(m3_of) Family: inverse.gaussian Link function: identity Formula: bmk ~ groupof + s(delay, k = 20) + s(delay, k = 20, by = groupof) + s(delay, k = 20, medu, bs = "fs") Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 11.70146 0.14033 83.39 <2e-16 *** groupof1 0.33219 0.02974 11.17 <2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Approximate significance of smooth terms: edf Ref.df F p-value s(delay) 9.370 11.457 2.188 0.01100 * s(delay):groupof1 1.004 1.007 10.200 0.00134 ** s(delay,medu) 176.807 1626.000 10.313 < 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 R-sq.(adj) = 0.0792 Deviance explained = 9.67% fREML = -1.8246e+05 Scale est. = 0.0089023 n = 179659 > gam.check(m3_of) k' edf k-index p-value s(delay) 19.00 9.37 0.98 0.69 s(delay):groupof1 19.00 1.00 0.98 0.68 s(delay,medu) 1700.00 176.81 0.98 0.68 Compare models:
itsadug::compareML(m3, m3_of) Model m3 preferred: lower fREML score (56.679), and equal df (0.000). ----- Model Score Edf Difference Df 1 m3_of -182464.7 9 2 m3 -182521.3 9 56.679 0.000 AIC difference: -60.87, model m3 has lower AIC. 

