21
$\begingroup$

Is the variance of the mean of a set of possibly dependent random variables less than or equal to the average of their respective variances?

Mathematically, given random variables $X_1, X_2, \dots, X_n$ that may be dependent:

Let $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$ be the mean of these random variables.

Is it true that:

$$\text{Var}(\bar{X}) \leq \frac{1}{n}\sum_{i=1}^n \text{Var}(X_i)$$

I know that for independent random variables, we have the following equality:

$$\text{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^n \text{Var}(X_i)$$

This clearly satisfies the inequality, since $\frac{1}{n^2} \le \frac{1}{n}$. However, I'm unsure whether the inequality holds for dependent variables.

If this inequality is true, is there a proof or intuitive explanation?

If it's not always true, are there conditions under which it holds? What about the following inequality? $$\text{Var}(\bar{X}) \leq \max_{1 \le i \le n} \text{Var}(X_i)$$

Any insights, proofs, or counterexamples would be greatly appreciated. Thank you!

$\endgroup$

4 Answers

24
$\begingroup$

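Yes, it is true. Here is a proof. The first inequality is Cauchy–Schwarz for covariances, $\operatorname{Cov}(X_i,X_j)\le\sqrt{\operatorname{Var}(X_i)\operatorname{Var}(X_j)}$, and the second is the AM–GM inequality $\sqrt{ab}\le\frac{a+b}{2}$. $$ \begin{align} \newcommand{\Var}{\operatorname{Var}} \newcommand{\Cov}{\operatorname{Cov}} &\Var(\overline{X}) \\ &= \frac1{n^2}\Var\left(\sum_{i=1}^n X_i\right) \\ &=\frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\Cov(X_i,X_j) \\ &\le\frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\sqrt{\Var(X_i)\cdot \Var(X_j)} \\ &\le\frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\frac{\Var(X_i)+\Var(X_j)}{2} \\ &= \frac1n\sum_{i=1}^n \Var(X_i). \end{align} $$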
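For anyone who wants a numerical sanity check of the chain of inequalities above, here is a minimal sketch. It assumes NumPy and works directly with covariance matrices rather than samples: any matrix of the form $AA^\top$ is a valid covariance matrix, so random ones are easy to generate. The helper names `var_of_mean` and `mean_of_vars` are made up for this illustration. The last assertion also covers the bound by the largest variance asked about in the question, since an average can never exceed a maximum.

```python
import numpy as np

def var_of_mean(cov):
    """Variance of the mean: (1/n^2) * sum over all i, j of Cov(X_i, X_j)."""
    n = cov.shape[0]
    return cov.sum() / n**2

def mean_of_vars(cov):
    """Average of the individual variances: (1/n) * sum_i Var(X_i)."""
    return np.trace(cov) / cov.shape[0]

rng = np.random.default_rng(0)  # the seed is arbitrary
for _ in range(1000):
    n = int(rng.integers(2, 8))
    a = rng.normal(size=(n, n))
    cov = a @ a.T  # A A^T is symmetric positive semidefinite
    # Small tolerance guards against floating-point rounding only.
    assert var_of_mean(cov) <= mean_of_vars(cov) + 1e-9
    assert mean_of_vars(cov) <= np.diag(cov).max() + 1e-9
```

This is only evidence, not a proof: it checks the claim on randomly generated dependence structures.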

$\endgroup$
15
$\begingroup$

In general, one has: $$ \begin{align} \operatorname{Var}\left(\sum_{k=1}^n X_k\right) &= \sum_{i,j=1}^n \operatorname{Cov}(X_i,X_j) \end{align} $$ Now, the well-known inequality $ab \le \frac{1}{2}(a^2+b^2)$, applied pointwise with $a = X-\Bbb{E}[X]$ and $b = Y-\Bbb{E}[Y]$, lets us write: $$ \begin{align} \operatorname{Cov}(X,Y) &= \Bbb{E}\left[(X-\Bbb{E}[X])(Y-\Bbb{E}[Y])\right] \\ &\le \frac{1}{2}\Bbb{E}\left[(X-\Bbb{E}[X])^2 + (Y-\Bbb{E}[Y])^2\right] \\ &= \frac{1}{2} \left(\operatorname{Var}(X) + \operatorname{Var}(Y)\right) \end{align} $$ Hence $$ \operatorname{Var}\left(\sum_{k=1}^n X_k\right) \le \frac{1}{2} \sum_{i,j=1}^n \left(\operatorname{Var}(X_i) + \operatorname{Var}(X_j)\right) = n \sum_{k=1}^n \operatorname{Var}(X_k) $$ and finally $$ \operatorname{Var}\left(\bar{X}\right) = \frac{1}{n^2} \operatorname{Var}\left(\sum_{k=1}^n X_k\right) \le \frac{1}{n} \sum_{k=1}^n \operatorname{Var}(X_k) $$
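The pairwise bound on the covariance can also be checked on data. The sketch below assumes NumPy; the two samples and the helper `covariance_gap` are made up for illustration. It relies on the identity $\operatorname{Var}(X-Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)-2\operatorname{Cov}(X,Y)$, which shows that the slack in the bound is exactly half the variance of the difference and so cannot be negative.

```python
import numpy as np

def covariance_gap(x, y):
    """Return (Var(x) + Var(y)) / 2 - Cov(x, y) from sample estimates."""
    c = np.cov(x, y)  # 2x2 sample covariance matrix (ddof=1 by default)
    return 0.5 * (c[0, 0] + c[1, 1]) - c[0, 1]

rng = np.random.default_rng(1)  # the seed is arbitrary
x = rng.normal(size=10_000)
y = 0.8 * x + rng.exponential(size=10_000)  # y depends on x by construction

gap = covariance_gap(x, y)
half_var_diff = 0.5 * np.var(x - y, ddof=1)  # same ddof as np.cov

assert gap >= 0
assert abs(gap - half_var_diff) < 1e-9
```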

$\endgroup$
14
$\begingroup$

A way to see this at a glance is that real random variables with finite second moment form an inner product space, with $\langle X,Y \rangle = \mathbb{E}XY$ (identifying variables that are equal almost surely). The norm induced by this inner product satisfies $\|X\|^2=\mathbb{E}X^2$, and both $\|X\|$ and $\|X\|^2$ are convex functions of $X$ in any inner product space.

Furthermore, $X \mapsto \mathbb{E}X$ is linear, so $f(X)=X-\mathbb{E}X$ is linear, and a convex function composed with a linear transformation is still convex.

This gives us that $$\operatorname{Var}(X)=\langle X-\mathbb{E}X , X-\mathbb{E}X \rangle$$ is convex, from which your conjecture immediately follows: applying convexity to the equal-weight average gives $\operatorname{Var}\left(\frac1n\sum_{i=1}^n X_i\right) \le \frac1n\sum_{i=1}^n \operatorname{Var}(X_i)$.
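As a rough illustration of the convexity claim, the sketch below checks $\operatorname{Var}(tX+(1-t)Y)\le t\operatorname{Var}(X)+(1-t)\operatorname{Var}(Y)$ on a pair of dependent samples for several weights $t$. It assumes NumPy, and the samples and the helper `convexity_gap` are invented for this example. Expanding both sides shows the gap equals $t(1-t)\operatorname{Var}(X-Y)$, which the last assertion checks at $t=\tfrac12$.

```python
import numpy as np

def convexity_gap(x, y, t):
    """t*Var(x) + (1-t)*Var(y) - Var(t*x + (1-t)*y), using sample variances."""
    return t * np.var(x) + (1 - t) * np.var(y) - np.var(t * x + (1 - t) * y)

rng = np.random.default_rng(2)  # the seed is arbitrary
x = rng.normal(size=5_000)
y = x**2 + rng.uniform(size=5_000)  # y depends on x by construction

gaps = [convexity_gap(x, y, t) for t in np.linspace(0.0, 1.0, 11)]

# Tolerance guards against floating-point rounding only.
assert all(g >= -1e-9 for g in gaps)
assert abs(convexity_gap(x, y, 0.5) - 0.25 * np.var(x - y)) < 1e-9
```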

$\endgroup$
1
  • $\begingroup$ I.e., Jensen's Inequality. $\endgroup$ Commented Aug 10, 2024 at 14:58
4
$\begingroup$

$$\text{Var}\left(\sum X_i\right) = \sum\limits_i \text{Var}\left( X_i\right) +\sum\limits_i \sum\limits_{j\not=i} \text{Cov}\left( X_i,X_j\right)$$ is maximised when the covariances take their maximum possible positive values, which happens when all the correlations are $+1$.

So the highest variance case for $\sum X_i$ and thus $\bar X$ will be when there is perfect positive correlation between the $X_i$, in which case $$\text{SD}\left(\sum X_i\right) = \sum \text{SD}(X_i)$$ giving $$\text{Var}(\bar X) = \frac1{n^2} \text{Var}\left(\sum X_i\right)=\left(\frac1n \sum \text{SD}(X_i)\right)^2 .$$

Then, using the Cauchy–Schwarz inequality:

$$\left(\frac1n \sum \text{SD}(X_i)\right)^2 \le \frac1n \sum \left(\text{SD}(X_i)^2\right) = \frac{1}{n}\sum \text{Var}(X_i)$$

with equality only when all the $\text{SD}(X_i)$ are equal.

So your $\text{Var}(\bar{X}) \leq \frac{1}{n}\sum \text{Var}(X_i)$ is correct,

with equality only when $X_i-E[X_i]=X_j-E[X_j]$ (almost surely) for all $i,j$, that is, when you have identical variances and perfect positive correlation (though possibly different expectations).
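To make the equality case concrete, here is a small sketch assuming NumPy. The helper `bounds` is hypothetical: it builds a covariance matrix from a list of standard deviations and one common correlation `rho`, then returns both sides of the inequality.

```python
import numpy as np

def bounds(sd, rho=1.0):
    """Return (Var of the mean, mean of the Vars) for SDs `sd` and common correlation `rho`."""
    sd = np.asarray(sd, dtype=float)
    n = sd.size
    corr = np.full((n, n), rho)
    np.fill_diagonal(corr, 1.0)
    cov = corr * np.outer(sd, sd)  # Cov(X_i, X_j) = rho * SD_i * SD_j off the diagonal
    return cov.sum() / n**2, np.trace(cov) / n

print(bounds([2.0, 2.0, 2.0]))       # equal SDs, correlation 1
print(bounds([1.0, 2.0, 3.0]))       # unequal SDs, correlation 1
print(bounds([2.0, 2.0, 2.0], 0.5))  # equal SDs, correlation 0.5
```

Only the first call should return two equal numbers (both are 4). In the other two, the variance of the mean is strictly smaller, because either the SDs differ or the correlation is below 1.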

$\endgroup$
2
  • 1
    $\begingroup$ While the situation described in the first sentence seems plausible, the description does not mathematically justify its correctness. $\endgroup$ Commented Jul 7, 2024 at 17:10
  • $\begingroup$ @GregMartin it is well known. I have added an additional introductory line $\endgroup$ Commented Jul 7, 2024 at 17:20
