$\begingroup$

I am working on some data that I would like to analyze through a generalized linear mixed model regression and a stepwise backward selection of variables directly on that model.

I use the GLMERSelect R package, which seems to have been developed exactly for this purpose. In this function, you specify the response variable (responseVar), the model family (fitFamily, binomial in my case), the random-effect structure (randomStruct), fixedTerms (continuous variables) and fixedFactors (categorical variables). You can find a nice usage example here.

As you can see from the example, for each continuous variable you specify the polynomial order to use. The example uses order 2: fixedTerms = list(logHPD.rs=2,logDistRd.rs=2)

My theoretical question is: how do you choose this order for each continuous variable? Why and when would you use a quadratic (order=2) instead of a linear (order=1) or a higher order polynomial?

$\endgroup$
  • $\begingroup$ Isn’t the whole point of stepwise selection that you throw a bunch of variables at the problem and make the stepwise procedure pick some? While our Alexis has posted that stepwise regression in general is pants, if that’s what you’re going to do, what’s the problem with letting it do its thing and select the variables? $\endgroup$ Commented Dec 19, 2022 at 12:20
  • $\begingroup$ Are you proposing to use a single polynomial fit instead of a flexible cubic regression spline? I'm not familiar with that package's syntax, but if that's what you're doing you are likely to get led astray with a single polynomial fit. You can set up a regression spline and, for example, use elimination to choose the number of knots/flexibility of the continuous-variable models. $\endgroup$ Commented Dec 19, 2022 at 13:51
  • $\begingroup$ Hi @Dave and EdM, and thanks for your replies! I was just wondering on what basis you select the polynomial order for each continuous variable. Do you base it on the number of continuous variables in the model? On some specific characteristic of each variable (and if so, which one)? $\endgroup$ Commented Dec 19, 2022 at 14:13

1 Answer

$\begingroup$

A single polynomial fitted to data in regression can be a recipe for disaster unless you have a strong theoretical justification for a particular polynomial form. This page, among others, explains many reasons why.

A better general approach is to use a smooth continuous model of a continuous predictor, one that doesn't impose a single functional form over its entire range. This page outlines different ways to accomplish this.

For smooths in generalized additive models or smoothing splines, you can adjust the magnitude of penalization to best fit your data. Regression splines specify a number of "knots" spaced along the continuous values, fitting a separate cubic polynomial between each pair of adjacent knots in a way that joins the individual fits smoothly at each knot. Increasing or decreasing the number of knots increases or decreases the number of coefficients to estimate. With regression splines you can thus choose the number of knots, perhaps via backward elimination of the type you propose, to adjust the complexity of the fit to the data at hand.
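As a minimal sketch of the contrast between a single polynomial and a regression spline, the following fits both to simulated binary data using base R only. The variable names, the simulated threshold-shaped effect, and the choice of df = 3 are illustrative assumptions, not taken from the question. Random effects are left out to keep the sketch free of extra packages; with lme4 installed, the same `ns()` term could be placed in a `glmer()` formula alongside a term such as `(1 | group)`.

```r
library(splines)  # ships with R; provides ns() for natural cubic splines

set.seed(1)
n <- 500
x <- runif(n, -2, 2)

# Hypothetical truth: flat for x < 0, rising for x > 0 (not a global polynomial)
p <- plogis(-1 + 2 * pmax(x, 0))
d <- data.frame(x = x, y = rbinom(n, 1, p))

# One quadratic form imposed over the whole range of x
fit_quad <- glm(y ~ poly(x, 2), family = binomial, data = d)

# Natural cubic regression spline; df = 3 places 2 interior knots at quantiles of x
fit_spline <- glm(y ~ ns(x, df = 3), family = binomial, data = d)

# Number of estimated coefficients: intercept plus basis terms
n_coef_quad   <- length(coef(fit_quad))
n_coef_spline <- length(coef(fit_spline))

# Each extra knot adds one basis column, hence one more coefficient
n_basis_df5 <- ncol(ns(d$x, df = 5))
```

Raising `df` in `ns()` adds knots and therefore coefficients, which is the "complexity" knob referred to above.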

$\endgroup$
  • $\begingroup$ Hi @edm and thanks for your answer. I am having a hard time understanding how I could use smoothing functions instead of, or before, this polynomial order selection in backward elimination. Could you provide a simple usage example? $\endgroup$ Commented Dec 20, 2022 at 10:32
  • $\begingroup$ @cccnrc this is simplest in principle with knots in regression splines. You can use the AIC (a typical measure of fit in stepwise procedures) to choose the number of knots; see for example Section 2.4.6 of Frank Harrell's course notes. The bigger problem is using backward elimination in the first place, at least in an early stage of model building. See Section 4 of those notes; start with Section 4.3 on Variable Selection and the Summary in Section 4.12. Also see this page. $\endgroup$ Commented Dec 20, 2022 at 15:12
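
The AIC-based choice of knots mentioned in the last comment might look roughly like the sketch below. It assumes simulated data, a plain binomial `glm()` rather than a mixed model, and a hypothetical candidate range of 1 to 5 degrees of freedom; none of these come from the original thread. With `ns()`, `df = 1` is a linear fit and each additional degree of freedom adds one interior knot.

```r
library(splines)  # ships with R

set.seed(2)
n <- 500
x <- runif(n, -2, 2)
d <- data.frame(x = x, y = rbinom(n, 1, plogis(-1 + 2 * pmax(x, 0))))

# Hypothetical candidate complexities for the spline on x
candidate_df <- 1:5

# Fit one model per candidate and record its AIC
aic_by_df <- sapply(candidate_df, function(k) {
  AIC(glm(y ~ ns(x, df = k), family = binomial, data = d))
})
names(aic_by_df) <- candidate_df

# Candidate with the lowest AIC
best_df <- candidate_df[which.min(aic_by_df)]
```

This only selects the flexibility of one predictor's curve; it does not address the broader concerns about stepwise selection raised in the comment.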
