3
$\begingroup$

I don't understand why the standard error of the mean does not depend on the number of samples of the mean that you take. To clarify, let's use a simplified version of the example in this answer. Two samples, of size 10 and 100 respectively, are taken from the same set of values, and the mean of each sample is computed. We expect those means to differ, and the standard error describes the variability in those estimates of the true population mean. My assumption was that if you took 100 samples of variable size, you'd get 100 estimates of the mean, which would reduce your standard error. But the number of samples never appears in the standard error formula: $\sigma/\sqrt{n}$.

In terms of intuition, the standard error of the mean describes the variability of the estimated mean. But if you compute the mean over the whole sample instead of over several sub-samples, as I understand it, you only have one estimate. What does the variability of a single value represent?

In terms of maths, let's call $m$ the number of samples (2 in the example above) and $n_i$ the size of sample $i$ (10 and 100 in the example), with $i \in \{1, \dots, m\}$. Let's also call the total sample size $n$, with $n = \sum_{i=1}^m n_i$. Using the formula for the standard deviation, I would have written the standard error, i.e., the standard deviation of the mean, as:

$$SE = \sqrt{ \frac{\sum_{i=1}^m (\bar{x_i} - \bar{x})^2}{m}} $$

with $\bar{x_i} = \frac{\sum_{j = 1}^{n_i} x_{ij}}{n_i}$ the mean of sample $i$.

But I can't figure out how to go from this version to the equation for the SE, nor why $m$ does not appear in the final formula. This answer clarifies some of the math, but it does not go far enough to show whether $\frac{1}{N}(\frac{\sigma^2}{n} + \sigma_G^2)$ is actually equal to $\sigma / \sqrt{n}$. Considering that the mean of the whole sample is not necessarily equal to the mean of the means of the sub-samples, I'm thinking maybe the two aren't actually equal. But at that point I'm a bit lost haha.

Thank you for your help.

$\endgroup$
2
  • 1
    $\begingroup$ I'm not sure what your question is. Are you asking if you have made a math error? Or about the definition? Or what? $\endgroup$ Commented Nov 1 at 10:38
  • $\begingroup$ Hey, thank you for taking the time to read my question. I am asking about the definition. Why does $m$ not appear in the formula? If the standard error is the standard deviation of multiple means, why doesn't the number of means ($m$) appear anywhere? $\endgroup$ Commented Nov 14 at 20:37

2 Answers

2
$\begingroup$

You seem to be a little confused here, so I will try to delineate what I think you're asking and then answer:

  1. Standard error of the mean: this term is usually reserved for the standard deviation of the sampling distribution of the sample mean. The first answer you link to demonstrates how $\sqrt{\frac{\sigma^{2}}{n}} = \frac{\sigma}{\sqrt{n}}$ results from the $\text{i.i.d.}$ assumption and the variance of a weighted sum. The important point here is that $n$ in the denominator is fixed, since each sample size has its own sampling distribution: we are computing an estimate of the variation of sample means around the true mean for a fixed $n$.
  2. Estimating the global mean across a set of subsamples: this will vary based on the approach you take (design-based vs. model-based) as well as, in a model-based situation, how you actually want to set up your model. The second answer you linked to uses an ANOVA decomposition to separate the variance within the groups from the variance between the groups. These are usually referred to as Mean Squares, and their square roots as root mean square errors (though not always). The computation is greatly simplified if all groups share the same number of observations, which, in that answer, is little $n$. We're still describing variation of sample means around a common mean, but in this second case we have two sources of variation: the variation within each group, $\sigma^{2}_{i}$, and the between-groups variation, $\sigma^{2}_{G}$. So there is no single standard error computation in this case.
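
As a sketch of the first point, here is the variance-of-a-weighted-sum step written out. The notation is my own: $x_1, \dots, x_n$ are assumed to be i.i.d. draws with common variance $\sigma^2$, and $\bar{x}$ is the mean of one sample of size $n$.

$$\operatorname{Var}(\bar{x}) = \operatorname{Var}\left(\frac{1}{n}\sum_{j=1}^{n} x_j\right) = \frac{1}{n^2}\sum_{j=1}^{n}\operatorname{Var}(x_j) = \frac{n\,\sigma^2}{n^2} = \frac{\sigma^2}{n}, \qquad SE = \sqrt{\operatorname{Var}(\bar{x})} = \frac{\sigma}{\sqrt{n}}$$

The second equality uses independence (the covariance terms vanish). Nothing in this derivation refers to how many samples of size $n$ are drawn: the number of repeated samples $m$ only determines how precisely an empirical standard deviation of $m$ sample means estimates $\sigma/\sqrt{n}$, not the value being estimated.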
$\endgroup$
0
$\begingroup$

There may be some confusion in my understanding of the question. Bottom line: if you sample one independent item at a time from a population and take the standard deviation over a number of such one-item samples, you obtain an estimate, say $S$, of the population's standard deviation. If you instead take $N$ independent items from the population, compute their mean, and repeat the process many times, you will find that the standard deviation of these means quickly approaches $S/\sqrt{N}$. In other words, as a mean is calculated over a larger number of independent items from a population, the standard deviation (error) of the mean becomes smaller, and the mean becomes a better estimate of the population mean.
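
A minimal simulation sketch of this, assuming a normal population; the values of `sigma`, `n`, the seed, and the helper name `sd_of_sample_means` are arbitrary choices of mine for illustration. It repeats the "draw a sample, take its mean" process $m$ times for several values of $m$ and compares the standard deviation of the resulting means with the population standard deviation divided by the square root of the sample size.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for reproducibility
sigma = 2.0                     # assumed population standard deviation
n = 25                          # number of items in each sample


def sd_of_sample_means(m, n, sigma, rng):
    """Draw m independent samples of size n and return the SD of their m means."""
    samples = rng.normal(loc=0.0, scale=sigma, size=(m, n))
    return samples.mean(axis=1).std(ddof=1)


theoretical_se = sigma / np.sqrt(n)

# Vary the number of repetitions m while keeping the sample size n fixed.
results = {m: sd_of_sample_means(m, n, sigma, rng) for m in (100, 10_000, 100_000)}

for m, sd in results.items():
    print(f"m = {m:>6}: SD of means = {sd:.4f}  (sigma / sqrt(n) = {theoretical_se:.4f})")
```

Increasing the number of repetitions $m$ should only make the empirical standard deviation of the means a more stable estimate of the same target; the target itself changes only when the sample size `n` changes.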

$\endgroup$
