
I'm trying to replicate the output of the mg4.math.nlme model, fit via the nlme() function, using the lme() function in the mg5.math.lme model below.

How should I define the fixed and random parts of my lme() call to replicate the nlme() results?

PS: Essentially, we want to do a multiple-group analysis using nb_wght, which is a binary variable (like gender). In other words, we want the means, covariances, and residual variances of the model to all be specific to each value of nb_wght (i.e., to each group).

library(nlme)

lmectl <- lmeControl(maxIter = 200, msMaxIter = 200, niterEM = 50, msMaxEval = 400)

math <- read.csv('https://raw.githubusercontent.com/rnorouzian/v/main/mlgrp1.csv')

##========================== `nlme()` model: ==========================

mg4.math.nlme <- nlme(math ~ nb_wght * ( (beta_1_N + d_1i_N) + (beta_2_N + d_2i_N) * (grade) ) +
                             lb_wght * ( (beta_1_L + d_1i_L) + (beta_2_L + d_2i_L) * (grade) ),
                      data = math,
                      fixed = beta_1_N + beta_2_N + beta_1_L + beta_2_L ~ 1,
                      random = pdBlocked(list(d_1i_N + d_2i_N ~ 1, d_1i_L + d_2i_L ~ 1)),
                      group = ~ id,
                      start = c(35, 4, 35, 4),
                      weights = varIdent(form = ~ 1 | factor(lb_wght)),
                      control = list(returnObject = TRUE))

summary(mg4.math.nlme)
#              Value  Std.Error   DF  t-value       p-value
# beta_1_N 35.481308 0.36474271  931 97.27764  0.000000e+00
# beta_2_N  4.297310 0.09017385 1287 47.65583 1.782340e-286
# beta_1_L 32.797810 1.40625509 1287 23.32280 1.232624e-100
# beta_2_L  4.878055 0.33822372 1287 14.42257  8.007625e-44

##========================== `lme()` equivalent model: ==========================

mg5.math.lme <- lme(math ~ nb_wght * grade,
                    data = math,
                    random = list(id = pdBlocked(list(~ grade, ~ 0 + grade:nb_wght))),
                    weights = varIdent(form = ~ 1 | factor(lb_wght)),
                    control = lmectl)

summary(mg5.math.lme)
#                    Value Std.Error   DF   t-value       p-value
# (Intercept)   32.6765937 1.3319270 1287 24.533322 2.346406e-109
# nb_wght        2.8052003 1.3815070  930  2.030536  4.258615e-02
# grade          4.9372007 0.3543493 1287 13.933146  3.332060e-41
# nb_wght:grade -0.6402842 0.3656091 1287 -1.751281  8.013572e-02
  • I don't know enough about nlme to answer, but I reformatted the code to make it easier (for me) to read, and added the AIC, which is very similar for the two models - not sure if that means something. :) Commented Feb 4, 2021 at 4:48
  • This feels like a homework question and looking at the repo with the data, it seems like a homework question. It should be tagged self-study. Commented Feb 5, 2021 at 1:43
  • Even if this isn't homework, I still would like to see some explanation about how and why you converted from nlme to lme. How: what steps did you take to reformulate the problem. Why: why do you need to rewrite an nlme model as an lme model? Without some motivation, this seems like an XY Problem. If the motivation is understanding the conversion, then some of the derivation in your proposed conversion needs to be shown. Commented Feb 5, 2021 at 15:15
  • Related to my previous comment on the XY Problem: what type of equivalence are you going for? Equivalent predictions? Then test the fitted values and predicted values on a hold-out set. Identical model matrices? Test those directly, although floating point error and differences in estimation between the functions may lead to some differences. One-to-one correspondence between coefficients in two different ways of expressing a model formula? Convert the first formula to math notation, do some algebra to get it into a form that lends itself to conversion to Wilkinson notation for linear models. Commented Feb 5, 2021 at 15:22
  • In all these cases, it would also be helpful to know something about the structure of the data -- what are the properties of each of the data variables? Are some of them bounded? Are some categorical? Etc. Commented Feb 5, 2021 at 15:24

1 Answer


Since this question is about re-writing a model specification, I'll focus on that aspect. Without knowing more about the data or precise inferential goal, I can't comment on whether either model is the right tool. I'm also unsure why the OP needs to rewrite a functioning model to use a different function within the same software package.

Let's start from the OP's setup.

library(nlme)
math <- read.csv('https://raw.githubusercontent.com/rnorouzian/v/main/mlgrp1.csv')

Now I'm going to take the OP's nonlinear model specification, remove some extra parentheses, and add some whitespace for readability.

nonlinear <- nlme(
  math ~ nb_wght * ( (beta_1_N + d_1i_N) + (beta_2_N + d_2i_N) * grade) +
         lb_wght * ( (beta_1_L + d_1i_L) + (beta_2_L + d_2i_L) * grade),
  data = math,
  fixed = beta_1_N + beta_2_N + beta_1_L + beta_2_L ~ 1,
  random = pdBlocked(list(d_1i_N + d_2i_N ~ 1,
                          d_1i_L + d_2i_L ~ 1)),
  group = ~ id,
  start = c(35, 4, 35, 4),
  weights = varIdent(form = ~ 1 | factor(lb_wght)),
  control = list(returnObject = TRUE))

summary(nonlinear)

Next, I'll rename variables to more meaningful things so that I can better understand what's going on.

nonlinear <- nlme(
  math ~ nb_wght * ( (beta_nb_wght + blup_nb_wght) + (beta_nb_wght_grade + blup_nb_wght_grade) * grade) +
         lb_wght * ( (beta_lb_wght + blup_lb_wght) + (beta_lb_wght_grade + blup_lb_wght_grade) * grade),
  data = math,
  fixed = beta_nb_wght + beta_nb_wght_grade + beta_lb_wght + beta_lb_wght_grade ~ 1,
  random = pdBlocked(list(blup_nb_wght + blup_nb_wght_grade ~ 1,
                          blup_lb_wght + blup_lb_wght_grade ~ 1)),
  group = ~ id,
  start = c(35, 4, 35, 4),
  weights = varIdent(form = ~ 1 | factor(lb_wght)),
  control = list(returnObject = TRUE))

summary(nonlinear)

Okay, now I can see how the pieces fit together a bit better. Time to re-write it as something closer to a classical linear model by re-ordering terms.

nonlinear <- nlme(
  math ~ (beta_nb_wght + blup_nb_wght) * nb_wght +
         (beta_nb_wght_grade + blup_nb_wght_grade) * I(grade * nb_wght) +
         (beta_lb_wght + blup_lb_wght) * lb_wght +
         (beta_lb_wght_grade + blup_lb_wght_grade) * I(grade * lb_wght),
  data = math,
  # sort the coefficients in order of interaction level:
  # - this will make comparisons of the fixed effects easier later by matching sorting
  # - note that the predictor for a two-way interaction of continuous predictors
  #   is just the (element-wise) product of the two predictors
  fixed = beta_nb_wght + beta_lb_wght + beta_nb_wght_grade + beta_lb_wght_grade ~ 1,
  # we use the list syntax for `random` instead of depending on the separate
  # `group` argument, again bringing us closer to the usual way of specifying
  # the linear model
  random = list(id = pdBlocked(
    list(blup_nb_wght + blup_nb_wght_grade ~ 1,
         blup_lb_wght + blup_lb_wght_grade ~ 1))),
  start = c(35, 4, 35, 4),
  weights = varIdent(form = ~ 1 | factor(lb_wght)),
  control = list(returnObject = TRUE))

summary(nonlinear)

Using the linear-model formula syntax, we now have fixed effects

math ~ 0 + nb_wght + lb_wght + grade:nb_wght + grade:lb_wght # Note the lack of intercept. 

and random effects:

list(~ 0 + nb_wght + grade:nb_wght, ~ 0 + lb_wght + grade:lb_wght) 

The big change in both cases is to make the betas/BLUPs variables implicit and only specify the data columns.
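To spell the re-ordering out algebraically (a sketch using shortened subscripts: $N$ for nb_wght, $L$ for lb_wght, and $g$ marking the grade slopes), the model for subject $i$ at occasion $j$ expands as

$$
\begin{aligned}
\text{math}_{ij} ={}& \underbrace{\beta_{N}\,\text{nb}_i + \beta_{L}\,\text{lb}_i + \beta_{Ng}\,(\text{nb}_i \cdot \text{grade}_{ij}) + \beta_{Lg}\,(\text{lb}_i \cdot \text{grade}_{ij})}_{\text{fixed}} \\
&+ \underbrace{d_{Ni}\,\text{nb}_i + d_{Li}\,\text{lb}_i + d_{Ngi}\,(\text{nb}_i \cdot \text{grade}_{ij}) + d_{Lgi}\,(\text{lb}_i \cdot \text{grade}_{ij})}_{\text{random}} + \varepsilon_{ij},
\end{aligned}
$$

so every parameter multiplies a single data column, and those columns are exactly the terms in the `~ 0 + ...` formulas above.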

That brings us to the linear model:

linear <- lme(
  math ~ 0 + nb_wght + lb_wght + grade:nb_wght + grade:lb_wght,
  data = math,
  random = list(id = pdBlocked(
    list(~ 0 + nb_wght + grade:nb_wght,
         ~ 0 + lb_wght + grade:lb_wght))),
  weights = varIdent(form = ~ 1 | factor(lb_wght)),
  method = "ML",  # default for nlme(), but lme() and lmer() both default to REML
  control = list(returnObject = TRUE))

Let's compare them now.

They make similar predictions:

> all.equal(fitted(linear), fitted(nonlinear))
[1] TRUE

The estimates of the fixed effects are close (this comparison is why we re-ordered the coefficients in the nonlinear model):

> all.equal(unname(fixed.effects(linear)),
+           unname(fixed.effects(nonlinear)))
[1] TRUE

as are the variance parameters:

> all.equal(unname(as.numeric(VarCorr(linear)[, "StdDev"])),
+           unname(as.numeric(VarCorr(nonlinear)[, "StdDev"])))
[1] TRUE

So these models are identical, and their log likelihoods reflect that as well:

> logLik(nonlinear)
'log Lik.' -7963.061 (df=12)
> logLik(linear)
'log Lik.' -7963.061 (df=12)

In other cases, you might find that one or the other method converges to a slightly better optimum (=better log likelihood).

If you look at the full model summaries, you'll see that they're the same model:

Nonlinear

> summary(nonlinear)
Nonlinear mixed-effects model fit by maximum likelihood
  Model: math ~ (beta_nb_wght + blup_nb_wght) * nb_wght + (beta_nb_wght_grade + blup_nb_wght_grade) * I(grade * nb_wght) + (beta_lb_wght + blup_lb_wght) * lb_wght + (beta_lb_wght_grade + blup_lb_wght_grade) * I(grade * lb_wght)
  Data: math
       AIC      BIC    logLik
  15950.12 16018.59 -7963.061

Random effects:
 Composite Structure: Blocked

 Block 1: blup_nb_wght, blup_nb_wght_grade
 Formula: list(blup_nb_wght ~ 1, blup_nb_wght_grade ~ 1)
 Level: id
 Structure: General positive-definite
                      StdDev   Corr
blup_nb_wght       7.8921980 blp_n_
blup_nb_wght_grade 0.8795756 -0.009

 Block 2: blup_lb_wght, blup_lb_wght_grade
 Formula: list(blup_lb_wght ~ 1, blup_lb_wght_grade ~ 1)
 Level: id
 Structure: General positive-definite
                       StdDev   Corr
blup_lb_wght       8.99415931 blp_l_
blup_lb_wght_grade 0.02853591  0.986
Residual           5.93062152

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | factor(lb_wght)
 Parameter estimates:
       0        1
1.000000 1.170217

Fixed effects: beta_nb_wght + beta_lb_wght + beta_nb_wght_grade + beta_lb_wght_grade ~ 1
                      Value Std.Error   DF  t-value p-value
beta_nb_wght       35.48131 0.3647427  931 97.27765       0
beta_lb_wght       32.79780 1.4062538 1287 23.32282       0
beta_nb_wght_grade  4.29731 0.0901739 1287 47.65582       0
beta_lb_wght_grade  4.87806 0.3382250 1287 14.42252       0
 Correlation:
                   bt_nb_ bt_lb_ bt_n__
beta_lb_wght        0.000
beta_nb_wght_grade -0.527  0.000
beta_lb_wght_grade  0.000 -0.549  0.000

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max
-3.080667594 -0.533467953 -0.002345929  0.537432306  2.566342070

Number of Observations: 2221
Number of Groups: 932

Linear

> summary(linear)
Linear mixed-effects model fit by maximum likelihood
  Data: math
       AIC      BIC    logLik
  15950.12 16018.59 -7963.061

Random effects:
 Composite Structure: Blocked

 Block 1: nb_wght, nb_wght:grade
 Formula: ~0 + nb_wght + grade:nb_wght | id
 Structure: General positive-definite
                 StdDev   Corr
nb_wght       7.8921980 nb_wgh
nb_wght:grade 0.8795756 -0.009

 Block 2: lb_wght, lb_wght:grade
 Formula: ~0 + lb_wght + grade:lb_wght | id
 Structure: General positive-definite
                  StdDev   Corr
lb_wght       8.99415931 lb_wgh
lb_wght:grade 0.02853591  0.986
Residual      5.93062152

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | factor(lb_wght)
 Parameter estimates:
       0        1
1.000000 1.170217

Fixed effects: math ~ 0 + nb_wght + lb_wght + grade:nb_wght + grade:lb_wght
                 Value Std.Error   DF  t-value p-value
nb_wght       35.48131 0.3647427  930 97.27765       0
lb_wght       32.79780 1.4062538  930 23.32282       0
nb_wght:grade  4.29731 0.0901739 1288 47.65582       0
lb_wght:grade  4.87806 0.3382250 1288 14.42252       0
 Correlation:
              nb_wgh lb_wgh nb_wg:
lb_wght        0.000
nb_wght:grade -0.527  0.000
lb_wght:grade  0.000 -0.549  0.000

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max
-3.080667594 -0.533467953 -0.002345929  0.537432305  2.566342070

Number of Observations: 2221
Number of Groups: 932
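The fixed-effects side of this equivalence is plain linear algebra: when lb_wght = 1 - nb_wght, the no-intercept "cell means" columns (nb_wght, lb_wght, and their products with grade) span the same column space as the default reference coding (intercept, nb_wght, grade, nb_wght:grade), so both give identical fitted values and coefficients that map into each other. Here is a minimal numeric sketch of that point (in Python with NumPy rather than R, purely to illustrate the algebra; the data are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
nb = rng.integers(0, 2, n).astype(float)  # binary group indicator
lb = 1.0 - nb                             # complementary indicator
grade = rng.uniform(0, 8, n)
y = 35*nb + 33*lb + 4.3*nb*grade + 4.9*lb*grade + rng.normal(0, 1, n)

# Cell-means coding: one mean and one slope per group, no intercept
X_cell = np.column_stack([nb, lb, nb*grade, lb*grade])
# Reference coding: intercept + group offset + slope + slope offset
X_ref = np.column_stack([np.ones(n), nb, grade, nb*grade])

b_cell, *_ = np.linalg.lstsq(X_cell, y, rcond=None)
b_ref, *_ = np.linalg.lstsq(X_ref, y, rcond=None)

# Identical fitted values: the two matrices span the same column space
print(np.allclose(X_cell @ b_cell, X_ref @ b_ref))  # True

# Coefficients map into each other:
#   group-N mean = intercept + nb offset,  group-L mean = intercept
print(np.allclose(b_cell[0], b_ref[0] + b_ref[1]))  # True
print(np.allclose(b_cell[1], b_ref[0]))             # True
```

This is why the OP's intercept-based mg5 formulation and the no-intercept formulation here describe the same fixed-effects space, just under different parameterizations.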
  • @rnorouzian This follow-up is exactly why I asked the questions in my comments to the original question. My answer provides the algebra and that will work no matter how many predictors you have. The other bit you need is an understanding of how categorical variables are expanded into a set of numeric contrasts, then it's just repeating that algebra with those numeric contrasts. If nb_wght is just 0's and 1's, it's already equivalent to a two-level categorical variable coded with dummy/treatment contrasts. Commented Feb 6, 2021 at 11:52
