
What does the word "omnibus" mean in the context of statistics and data science?

I hear about omnibus measures and omnibus tests.

  • Omnibus, when it refers to tests, means a test against all alternatives; see also en.wikipedia.org/wiki/Omnibus_test (Commented Jul 7, 2017 at 16:29)
  • Can you provide an example of where it's been used? (Commented Jul 7, 2017 at 16:31)
  • @JohnK Nice link. I find it odd, though, for such a long article to seem to omit discussion of goodness-of-fit tests, which I would have thought would be the most common place to see the term. For example, when I search "omnibus test of" and "omnibus test for", each turns up a different goodness-of-fit test as the first hit that's not that Wikipedia page. (Commented Jul 7, 2017 at 18:05)

1 Answer


In plain language, you can interpret it like an "overall test"—it is testing a number of things at once. The most frequent way it is used, in my area of statistics in the social sciences at least, is referring to testing an entire factor instead of levels within it. Consider the following data frame:

```r
set.seed(1839)
dat <- data.frame(x = rnorm(100), y = rnorm(100),
                  z = factor(rep(c(letters[1:4]), 25)))
head(dat)
```

```
           x           y z
1  1.0127014 -0.98199201 a
2 -0.6845605  0.37451740 b
3  0.3492607 -0.08189552 c
4 -1.6245010 -0.08237190 d
5 -0.5162476  1.14766587 a
6 -0.7025836 -0.67800240 b
```

y is the dependent variable, x is a continuous independent variable, and z is a categorical independent variable with four levels (a, b, c, or d).
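As a side note, you can inspect how R will encode a factor like this by default (treatment coding, with a as the reference level) by looking at its contrast matrix; a minimal sketch:

```r
# Default treatment (dummy) coding for a four-level factor:
# level "a" is the reference; the columns correspond to b, c, and d.
z <- factor(rep(letters[1:4], 25))
contrasts(z)
```

This is why a regression with z as a predictor reports coefficients zb, zc, and zd, but none for a.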

If we run the regression model we get:

```r
mod1 <- lm(y ~ x + z, dat)
summary(mod1)
```

```
Call:
lm(formula = y ~ x + z, data = dat)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.73332 -0.66347  0.03676  0.58965  2.25179 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.01422    0.19244   0.074    0.941
x            0.03245    0.10671   0.304    0.762
zb          -0.15265    0.27293  -0.559    0.577
zc           0.22139    0.27229   0.813    0.418
zd          -0.06219    0.27830  -0.223    0.824

Residual standard error: 0.962 on 95 degrees of freedom
Multiple R-squared:  0.02297,   Adjusted R-squared:  -0.01817 
F-statistic: 0.5583 on 4 and 95 DF,  p-value: 0.6935
```

Notice that the output is testing three specific contrasts: a vs. b, a vs. c, and a vs. d. What if we want to know whether the variable z overall contributes any explanatory power to predicting y? We can run an omnibus test that considers all of the levels jointly, asking whether at least one of them differs. We could do this by comparing a model with z in it to one without z in it:

```r
mod1 <- lm(y ~ x + z, dat)
mod2 <- lm(y ~ x, dat)
anova(mod2, mod1)
```

```
Analysis of Variance Table

Model 1: y ~ x
Model 2: y ~ x + z
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     98 89.802                           
2     95 87.912  3    1.8899 0.6808  0.566
```

This is an omnibus test: It is not looking at any one specific comparison, but asking whether the whole factor z (i.e., all of it; omnibus is Latin for "for all") is significant.
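The same omnibus F-test can also be obtained without fitting the reduced model by hand: base R's drop1() removes each term in turn and reports the corresponding F-test. A sketch, reusing the simulated data from above:

```r
# Recreate the simulated data and the full model from above.
set.seed(1839)
dat <- data.frame(x = rnorm(100), y = rnorm(100),
                  z = factor(rep(c(letters[1:4]), 25)))
mod1 <- lm(y ~ x + z, dat)

# drop1() drops each term in turn; the row for z is the omnibus
# F-test for the whole factor, matching the anova() comparison.
drop1(mod1, test = "F")
```

The F and p for the z row are identical to those from the explicit two-model anova() comparison, since both are the same partial F-test.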

