What is the proper ratio of mean squares for a two-level nested ANOVA?

Question

I would like to analyse the following type of experiment using a two-level fully nested ANOVA. We have mice of two genotypes (the Group factor); for each genotype we take a certain number of mice (the sub-groups) and measure several identical samples per mouse. Thus it looks like a classic two-level nested design with Genotype and Mice as fixed factors. I am interested in the main group effect of genotype on the measurement outcome. I used Statistica, where the F-value is calculated as the ratio $\text{MS}_\text{group}/\text{MS}_\text{residual}$. However, I found several references on the web suggesting that it is better to calculate the F-value from the ratio $\text{MS}_\text{group}/\text{MS}_\text{sub-group}$. For example, on John McDonald's web page:

In a two-level nested anova, there are two F statistics, one for subgroups (Fsubgroup) and one for groups (Fgroup). The subgroup F-statistic is found by dividing the among-subgroup mean square, MSsubgroup (the average variance of subgroup means within each group) by the within-subgroup mean square, MSwithin (the average variation among individual measurements within each subgroup). The group F-statistic is found by dividing the among-group mean square, MSgroup (the variation among group means) by MSsubgroup. The P-value is then calculated for the F-statistic at each level...

On the same web page, he comments that in Rweb:

Fgroup is calculated as MSgroup/MSwithin, which is not a good idea if Fsubgroup is significant.

In my analysis, whenever the Group (Genotype) effect calculated as $\text{MS}_\text{group}/\text{MS}_\text{residual}$ was significant, I did not observe a significant effect of sub-groups (Mice). Does this mean that I can use the $\text{MS}_\text{group}/\text{MS}_\text{residual}$ ratio to estimate the F-value?

Freya Harrison · Accepted Answer · 2011-06-22 08:48:31Z

Before I answer: whether you have heard of nested ANOVA, or even think of designs as 'nested,' depends to some extent on what software you use and how you were taught stats. There's more than one way to skin a cat, as they say.

If your design includes nested factors, then the model you use to analyse the data using ANOVA should reflect this nested design. In your design, mouse is nested within genotype, as each mouse can be of only one genotype. The individual samples (which are effectively nested within mouse) represent the bottom layer in this design and are effectively your 'error' term. It is important to note that the group of samples as a whole is not a set of independent data points because they are grouped by mouse.

To test for an effect of genotype on your measured variable of interest, you should calculate an F-ratio that corresponds to (variance between genotypes / variance within genotypes).

If you fit the model

measurement = genotype

or the model

measurement = mouse + genotype

then the F-ratio for genotype will be calculated as MS(genotype)/MS(error). This is incorrect, because it does not accurately reflect (variance between genotypes / variance within genotypes). The variance within a genotype is due to variance among the mice in each genotype, not among all the individual samples. If you fit the model

measurement = genotype + mouse(genotype)

and (this is important) also declare mouse as a random factor, then the F-ratio for genotype will be calculated as MS(genotype)/MS(mouse), which is correct.
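To make the two denominators concrete, here is a minimal sketch that computes the three mean squares by hand for a balanced design. The data, the mouse labels and the function name `nested_mean_squares` are all made up for illustration; a real analysis would normally go through a stats package rather than this hand calculation.

```python
from statistics import mean

# Hypothetical measurements: genotype -> mouse -> replicate samples (balanced).
data = {
    "wt": {"m1": [10.0, 11.0, 12.0], "m2": [13.0, 14.0, 15.0], "m3": [10.0, 12.0, 14.0]},
    "ko": {"m4": [15.0, 16.0, 17.0], "m5": [12.0, 13.0, 14.0], "m6": [16.0, 18.0, 20.0]},
}


def nested_mean_squares(data):
    """Return (MS_genotype, MS_mouse, MS_error) for a balanced nested design."""
    a = len(data)                            # number of genotypes
    first_mice = next(iter(data.values()))
    b = len(first_mice)                      # mice per genotype
    n = len(next(iter(first_mice.values()))) # samples per mouse
    grand = mean(y for mice in data.values() for ys in mice.values() for y in ys)

    ss_genotype = ss_mouse = ss_error = 0.0
    for mice in data.values():
        g_mean = mean(y for ys in mice.values() for y in ys)
        ss_genotype += b * n * (g_mean - grand) ** 2
        for ys in mice.values():
            m_mean = mean(ys)
            ss_mouse += n * (m_mean - g_mean) ** 2          # mice within genotype
            ss_error += sum((y - m_mean) ** 2 for y in ys)  # samples within mouse

    return (
        ss_genotype / (a - 1),
        ss_mouse / (a * (b - 1)),
        ss_error / (a * b * (n - 1)),
    )


ms_genotype, ms_mouse, ms_error = nested_mean_squares(data)

f_genotype = ms_genotype / ms_mouse      # genotype tested against mice: df (1, 4)
f_mouse = ms_mouse / ms_error            # mice tested against samples: df (4, 12)
f_genotype_naive = ms_genotype / ms_error  # the ratio that ignores the nesting

print(f"MS(genotype)={ms_genotype:.3f}  MS(mouse)={ms_mouse:.3f}  MS(error)={ms_error:.3f}")
print(f"F genotype (vs mouse) = {f_genotype:.3f}")
print(f"F genotype (vs error) = {f_genotype_naive:.3f}")
```

With these made-up numbers the ratio against MS(error) is several times larger than the ratio against MS(mouse), and it would also be referred to an F distribution with more denominator DF.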

So if you fit the incorrect model that ignores the nesting, you use the wrong denominator term for your F-ratio. This F-ratio will usually be artificially large because you have used a denominator that has too many DF (you have more samples than mice), and therefore your chances of making a Type I error are increased.
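The inflation of the Type I error rate can be checked with a small simulation. The sketch below is only an illustration under assumptions of my own: 2 genotypes, 3 mice per genotype, 3 samples per mouse, no true genotype effect, and mouse-to-mouse and sample-to-sample standard deviations both set to 1. The 5% critical values are the usual table values for F(1, 4) and F(1, 12).

```python
import random
from statistics import mean

A, B, N = 2, 3, 3      # genotypes, mice per genotype, samples per mouse (assumed)
F_CRIT_1_4 = 7.709     # 5% critical value of F(1, 4), from standard tables
F_CRIT_1_12 = 4.747    # 5% critical value of F(1, 12), from standard tables


def simulate_once(rng):
    """Simulate one experiment with no genotype effect.

    Returns (MS_genotype / MS_mouse, MS_genotype / MS_error).
    """
    data = []
    for _ in range(A):
        mice = []
        for _ in range(B):
            mouse_effect = rng.gauss(0.0, 1.0)  # random mouse-to-mouse variation
            mice.append([mouse_effect + rng.gauss(0.0, 1.0) for _ in range(N)])
        data.append(mice)

    grand = mean(y for mice in data for ys in mice for y in ys)
    ss_g = ss_m = ss_e = 0.0
    for mice in data:
        g_mean = mean(y for ys in mice for y in ys)
        ss_g += B * N * (g_mean - grand) ** 2
        for ys in mice:
            m_mean = mean(ys)
            ss_m += N * (m_mean - g_mean) ** 2
            ss_e += sum((y - m_mean) ** 2 for y in ys)

    ms_g = ss_g / (A - 1)
    ms_m = ss_m / (A * (B - 1))
    ms_e = ss_e / (A * B * (N - 1))
    return ms_g / ms_m, ms_g / ms_e


rng = random.Random(0)
runs = 2000
results = [simulate_once(rng) for _ in range(runs)]

# Fraction of null experiments declared significant at the 5% level.
rate_vs_mouse = sum(f_m > F_CRIT_1_4 for f_m, _ in results) / runs
rate_vs_error = sum(f_e > F_CRIT_1_12 for _, f_e in results) / runs

print(f"false positive rate, MS(genotype)/MS(mouse): {rate_vs_mouse:.3f}")
print(f"false positive rate, MS(genotype)/MS(error): {rate_vs_error:.3f}")
```

Under these assumptions the first rate should sit near the nominal 5%, while the second should be well above it. How large the gap is depends on how much the mice differ from one another relative to the samples within a mouse.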

You do also have an alternative to the nested model, which is to calculate the mean measurement for each mouse, giving one data point per mouse. You can then fit the model measurement = genotype, because you have now made MS(error) the correct denominator for MS(genotype). For a balanced design (the same number of samples per mouse), the test of genotype will be identical to the one from the nested analysis of the original data.
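A quick way to convince yourself of the equivalence is to run a plain one-way ANOVA on the mouse means and compare it with MS(genotype)/MS(mouse) computed from the raw samples. This is a minimal sketch on made-up balanced data; the helper name `one_way_f` is hypothetical.

```python
from statistics import mean

# Hypothetical measurements: genotype -> mouse -> replicate samples (balanced).
data = {
    "wt": {"m1": [10.0, 11.0, 12.0], "m2": [13.0, 14.0, 15.0], "m3": [10.0, 12.0, 14.0]},
    "ko": {"m4": [15.0, 16.0, 17.0], "m5": [12.0, 13.0, 14.0], "m6": [16.0, 18.0, 20.0]},
}


def one_way_f(groups):
    """F-ratio of a one-way ANOVA; groups maps a label to a list of values."""
    values = [v for vs in groups.values() for v in vs]
    grand = mean(values)
    ss_between = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in groups.values())
    ss_within = sum((v - mean(vs)) ** 2 for vs in groups.values() for v in vs)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Alternative analysis: collapse to one data point per mouse, then measurement = genotype.
mouse_means = {g: [mean(ys) for ys in mice.values()] for g, mice in data.items()}
f_means = one_way_f(mouse_means)

# Nested analysis on the raw samples: MS(genotype) / MS(mouse).
pooled = {g: [y for ys in mice.values() for y in ys] for g, mice in data.items()}
grand = mean(y for ys in pooled.values() for y in ys)
n_mice = sum(len(mice) for mice in data.values())
ms_genotype = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in pooled.values()) / (len(data) - 1)
ms_mouse = sum(
    len(ys) * (mean(ys) - mean(pooled[g])) ** 2
    for g, mice in data.items()
    for ys in mice.values()
) / (n_mice - len(data))
f_nested = ms_genotype / ms_mouse

# For contrast: measurement = genotype fitted to the raw samples, ignoring mice.
f_pooled = one_way_f(pooled)

print(f"F from mouse means:        {f_means:.4f}")
print(f"F from nested analysis:    {f_nested:.4f}")
print(f"F ignoring mice entirely:  {f_pooled:.4f}")
```

The first two values agree because, with equal numbers of samples per mouse, both mean squares in the nested ratio are just the mouse-mean mean squares multiplied by the same constant. The third value shows what happens when the samples are treated as independent.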

Nested ANOVA is useful for telling you where most variation lies in your design, i.e. which level of the nesting is most variable. It is also the correct type of ANOVA to use when your experimental design is intrinsically nested.

Nesting is very nicely covered in Chapter 12 of Grafen & Hails's Modern Statistics for the Life Sciences (Oxford University Press), which I highly recommend reading.
