3
$\begingroup$

If an interaction term in a regression model leads to a lower AIC (indicating a better model fit), but the p-value for the interaction term itself is not statistically significant, should the interaction still be retained in the model?

From one perspective, a lower AIC suggests the model with the interaction better explains the data. However, a non-significant interaction implies that there is no strong evidence that the effect of one variable on the outcome differs across levels of the other variable.

In this case, how should one balance the improvement in AIC against the lack of statistical significance? Does a non-significant interaction still have any practical or interpretive value, or is it better to exclude it to maintain a more parsimonious model?

$\endgroup$
2
  • 3
    $\begingroup$ Any kind of data-driven model choice, whether driven by p values or by AIC, will invalidate your subsequent p values unless corrected for (which is highly non-trivial). So the answer really is: whatever you do, do it on pilot data, decide on a model, then collect new data and apply your model to it without change. $\endgroup$ Commented Oct 11 at 11:48
  • $\begingroup$ I would plot the interaction, which can help determine its practical significance (as opposed to statistical significance, which, as @peter flom pointed out, is partly driven by sample size). $\endgroup$ Commented Oct 13 at 17:24

1 Answer

7
$\begingroup$

You wrote:

However, a non-significant interaction implies that there is no strong evidence that the effect of one variable on the outcome differs across levels of the other variable.

This is not exactly correct. Assuming the usual 0.05 threshold, a non-significant result says that if the interaction effect in the population were really 0, you would get an interaction in the sample at least as far from 0 as the one you observed more than 5% of the time.
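One way to see why AIC and the p-value can disagree is that they apply different implicit thresholds. The sketch below assumes a single added coefficient, a large sample, and a likelihood-ratio test (none of which is stated in the question). Under those assumptions, adding the interaction lowers AIC whenever the likelihood-ratio statistic exceeds 2, which corresponds to a p-value of roughly 0.157. Any p-value between 0.05 and about 0.157 therefore produces exactly the situation described. The function names are illustrative only.

```python
import math


def chi2_1_sf(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Uses the identity P(Z**2 > x) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(x / 2.0))


def delta_aic(lr_stat: float, extra_params: int = 1) -> float:
    """AIC(larger model) minus AIC(smaller model) for nested models.

    AIC = 2k - 2 log L, so the difference is 2 * extra_params - LR,
    where LR = 2 * (log L_larger - log L_smaller).
    """
    return 2.0 * extra_params - lr_stat


# p-value at which the two models have exactly the same AIC (LR = 2).
p_at_aic_tie = chi2_1_sf(2.0)
print(f"p-value at the AIC break-even point: {p_at_aic_tie:.4f}")

# A few likelihood-ratio statistics (arbitrary illustrative values).
for lr in (1.0, 2.5, 3.0, 5.0):
    print(f"LR={lr:4.1f}  delta AIC={delta_aic(lr):+5.1f}  p={chi2_1_sf(lr):.4f}")
```

Under these assumptions, a lower AIC alongside p > 0.05 is not a contradiction. It only means the evidence falls between the two cut-offs.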

Whether the interaction reaches significance is partly a function of sample size: an effect of the same size can be significant in a large sample and non-significant in a small one.
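A minimal sketch of the sample-size point, assuming a normal approximation to the test statistic and a standard error that shrinks as 1/sqrt(n). The coefficient `EFFECT` and the scale `UNIT_SD` are hypothetical values chosen only for illustration.

```python
import math

EFFECT = 0.2    # hypothetical interaction coefficient, held fixed
UNIT_SD = 2.0   # hypothetical scale: standard error = UNIT_SD / sqrt(n)


def two_sided_p(estimate: float, std_error: float) -> float:
    """Two-sided p-value for estimate / std_error under a normal approximation."""
    z = abs(estimate) / std_error
    return math.erfc(z / math.sqrt(2.0))


def p_for_sample_size(n: int) -> float:
    """p-value for the same fixed effect when the sample has n observations."""
    return two_sided_p(EFFECT, UNIT_SD / math.sqrt(n))


for n in (50, 200, 1000):
    print(f"n={n:5d}  p={p_for_sample_size(n):.4f}")
```

The estimate itself never changes here; only the precision does, and that alone moves the p-value across the 0.05 line.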

If you must use a purely statistical criterion, AIC is better. But I would compare the predicted values of the two models to the actual results and see what is going on. I'd do this with plots of the residuals, perhaps a QQ plot, perhaps a Tukey mean-difference plot.
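A rough sketch of that comparison on simulated data, using only the standard library. Everything here is an assumption made for illustration: the data are simulated, the true interaction of 0.15 is arbitrary, and the mean-difference coordinates compare the two models' predictions, which is one possible reading of the suggestion. In practice a regression package would do the fitting.

```python
import math
import random


def ols_fit(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(X, beta):
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


def rss(y, fitted):
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def gaussian_aic(y, fitted, n_coef):
    """AIC of a Gaussian linear model; the error variance counts as a parameter."""
    n = len(y)
    return n * math.log(2 * math.pi * rss(y, fitted) / n) + n + 2 * (n_coef + 1)


# Simulated data (hypothetical): two predictors and a small true interaction.
random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 0.5 * a + 0.3 * b + 0.15 * a * b + random.gauss(0, 1)
     for a, b in zip(x1, x2)]

X_main = [[1.0, a, b] for a, b in zip(x1, x2)]          # main effects only
X_int = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]    # plus interaction

fit_main = predict(X_main, ols_fit(X_main, y))
fit_int = predict(X_int, ols_fit(X_int, y))

# Tukey mean-difference coordinates for the two sets of predictions.
md_mean = [(p + q) / 2 for p, q in zip(fit_main, fit_int)]
md_diff = [q - p for p, q in zip(fit_main, fit_int)]

print("AIC, main effects only:", round(gaussian_aic(y, fit_main, 3), 2))
print("AIC, with interaction: ", round(gaussian_aic(y, fit_int, 4), 2))
print("Largest change in a predicted value:", round(max(map(abs, md_diff)), 3))
```

Plotting `md_diff` against `md_mean` in any plotting library shows where the two models disagree and by how much. That speaks to practical importance more directly than either the AIC difference or the p-value.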

$\endgroup$
1
  • 5
    $\begingroup$ And for those who are open to Bayes, the best solution is to elicit prior distributions on the importance of interaction effects and use those in the final analysis. This allows interactions to be “half in” and “half out” of the model. $\endgroup$ Commented Oct 11 at 11:58
