5
$\begingroup$

Let's say I have two independent random samples $X_1, X_2, \dots, X_n$ and $Y_1, Y_2, \dots, Y_n$ from normal distributions with real, unknown means $\mu_x$ and $\mu_y$ and known standard deviations $\sigma_x$ and $\sigma_y$.

How would I go about deriving a $100(1 - \alpha)$% confidence interval for $\mu_x - \mu_y$? This is straightforward (in my mind) assuming the standard deviations are equal, but what if they are unequal?

$\endgroup$
3
  • 1
    $\begingroup$ Do you know the term 'pivotal quantity'? $\dfrac {\bar X - \bar Y -\mu_x + \mu_y}{\sqrt{\sigma_x^2/n + \sigma_y^2/m}}$ is one, and you can get a confidence interval from that. $\endgroup$ Commented Aug 16, 2012 at 15:42
  • $\begingroup$ @mike : What you suggest is workable only when the two population variances are known. The bounds of the confidence interval will depend on them. $\endgroup$ Commented Aug 16, 2012 at 16:14
  • $\begingroup$ I see: He did say they're known. $\endgroup$ Commented Aug 16, 2012 at 16:49

2 Answers

7
$\begingroup$

Alright, you say known variances. So it's an exercise on a point of theory, not a realistic problem.

And you actually assume the two sample sizes are equal.

Start by recalling something from the one-sample problem: $$ \bar{X} = \frac{X_1+\cdots+X_n}{n} \sim N\left(\mu_X,\frac{\sigma^2_X}{n} \right) $$ $$ \bar{Y} = \frac{Y_1+\cdots+Y_n}{n} \sim N\left(\mu_Y,\frac{\sigma^2_Y}{n} \right) $$ You don't explicitly state that the two samples are independent. If they are, then we have $$ \bar X - \bar Y \sim N\left(\mu_X-\mu_Y,\frac{\sigma^2_X+\sigma^2_Y}{n}\right) $$ (If we had unequal sample sizes $n$ and $m$, then the variance would be $\dfrac{\sigma^2_X}{n}+\dfrac{\sigma^2_Y}{m}$.)

So $$ \frac{((\bar X-\mu_X) - (\bar Y-\mu_Y))\sqrt{n}}{\sqrt{\sigma^2_X+\sigma^2_Y}} \sim N(0,1). $$ So the probability that $$ -A < \frac{(\bar X-\mu_X) - (\bar Y-\mu_Y)}{\sqrt{ \frac{\sigma^2_X+\sigma^2_Y}{n} }} <A \tag{1} $$ is the desired confidence $1-\alpha$ when the number $A$ is suitably chosen, namely as the upper $\alpha/2$ quantile of the standard normal distribution (for example, $A \approx 1.96$ for 95% confidence). Now do a bit of algebra to rearrange the inequalities $(1)$: $$ \bar X - \bar Y - A\sqrt{\frac{\sigma^2_X+\sigma^2_Y}{n}} < \mu_X-\mu_Y < \bar X - \bar Y + A\sqrt{\frac{\sigma^2_X+\sigma^2_Y}{n}} $$ That's the confidence interval.
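
For anyone who wants to check the formula numerically, here is a minimal Python sketch of the interval derived in this answer. The function name `diff_means_ci` and its parameters are illustrative assumptions, not part of the question; it uses the unequal-sample-size variance $\sigma_X^2/n + \sigma_Y^2/m$, which reduces to the expression in this answer when $m = n$.

```python
from math import sqrt
from statistics import NormalDist, fmean


def diff_means_ci(x, y, sigma_x, sigma_y, alpha=0.05):
    """Confidence interval for mu_x - mu_y when both standard deviations are known.

    Hypothetical helper for illustration; assumes independent normal samples.
    """
    n, m = len(x), len(y)
    # A is the upper alpha/2 quantile of the standard normal distribution.
    A = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard deviation of X-bar minus Y-bar under independence.
    se = sqrt(sigma_x**2 / n + sigma_y**2 / m)
    d = fmean(x) - fmean(y)
    return d - A * se, d + A * se


lo, hi = diff_means_ci([1, 2, 3, 4], [0, 1, 2, 3], sigma_x=1.0, sigma_y=1.0)
print(lo, hi)
```

The interval is centred on $\bar X - \bar Y$ and its half-width is $A$ times the known standard deviation of the difference, so nothing here is estimated from the data except the two sample means.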

$\endgroup$
2
  • $\begingroup$ This was here for a few minutes without the factor of $A$ in two places in the last line. Now I hope it's correct. $\endgroup$ Commented Aug 16, 2012 at 17:02
  • $\begingroup$ I have a question about this answer. I am interested in calculating the confidence interval from combining the coefficients of two separate curve fits. Say I have two curve fits which give $R^{\alpha\pm A\sigma_\alpha}$ and $R^{\beta\pm A\sigma_\beta}$, where $R$ is the independent variable in the fit. In my case, I am unsure what the sample size $n$ should be. $\endgroup$ Commented Jan 18, 2020 at 0:27
2
$\begingroup$

When the standard deviations are unequal and unknown, the inference problem of comparing two means is often called the Behrens-Fisher problem. The pivotal quantity for testing or constructing confidence intervals is a "t-like" statistic obtained by taking the difference of the two sample means and dividing by the sample estimate of the standard error of the mean difference. The standard error is a function of the two unknown standard deviations and the sample sizes used. The estimate involves replacing the unknown variances with their sample estimates. The distribution of the test statistic under the null hypothesis that the means are equal is sometimes called Welch's distribution. It can be approximated by a t distribution whose degrees of freedom are not necessarily an integer. This approximation is called the Satterthwaite approximation. This Wikipedia link provides the detailed information:

Welch-Satterthwaite Approximation.
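
As a rough illustration of that approximation, the sketch below computes the estimated standard error and the Welch-Satterthwaite degrees of freedom, $\nu \approx \dfrac{(s_x^2/n + s_y^2/m)^2}{\frac{(s_x^2/n)^2}{n-1} + \frac{(s_y^2/m)^2}{m-1}}$, from two samples. The function name `welch_se_and_df` is a hypothetical helper of mine, and the code stops short of the interval itself because the t quantile is not available in the Python standard library.

```python
from math import sqrt
from statistics import variance


def welch_se_and_df(x, y):
    """Estimated standard error of the mean difference and Welch-Satterthwaite df.

    Hypothetical helper for illustration; each sample needs at least two points.
    """
    n, m = len(x), len(y)
    # Sample variances (n - 1 denominator) divided by the sample sizes.
    vx, vy = variance(x) / n, variance(y) / m
    se = sqrt(vx + vy)
    # Approximate degrees of freedom; generally not an integer.
    df = (vx + vy) ** 2 / (vx**2 / (n - 1) + vy**2 / (m - 1))
    return se, df


se, df = welch_se_and_df([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(se, df)
```

The approximate interval would then be $\bar X - \bar Y \pm t_{\nu,\,\alpha/2}\cdot \mathrm{se}$, with the t quantile taken from tables or a statistics package.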

If the variances are unequal and known, then the pivotal quantity to use is the one given by Mike, and it will have a standard normal distribution. In practice the variances are not known unless you have knowledge that they are the same as variances that have been previously estimated based on very large samples.

$\endgroup$
3
  • $\begingroup$ note that his $\sigma$s are known $\endgroup$ Commented Aug 16, 2012 at 15:45
  • $\begingroup$ I see that the OP states that, but if the variances are known, how can you not know whether or not they are equal? In the last sentence the OP seems to be asking what to do if you do not know that the variances are equal. $\endgroup$ Commented Aug 16, 2012 at 16:04
  • $\begingroup$ Sorry guys -- poorly worded on my part. Edited. $\endgroup$ Commented Aug 16, 2012 at 16:07
