Two-Sample Confidence Interval for Normal Distributions

Question

Let's say I have two independent random samples $X_1, X_2, \dots, X_n$ and $Y_1, Y_2, \dots, Y_n$ from normal distributions with real, unknown means $\mu_x$ and $\mu_y$ and known standard deviations $\sigma_x$ and $\sigma_y$.

How would I go about deriving a $100(1 - \alpha)$% confidence interval for $\mu_x - \mu_y$? This is straightforward (in my mind) assuming the standard deviations are equal, but what if they are unequal?

Do you know the term 'pivotal quantity'? $\dfrac {\bar X - \bar Y -\mu_x + \mu_y}{\sqrt{\sigma_x^2/n + \sigma_y^2/m}}$ is one, and you can get a confidence interval from that.
– mike, Commented Aug 16, 2012 at 15:42
@mike: What you suggest is workable only when the two population variances are known. The bounds of the confidence interval will depend on them.
– Michael Hardy, Commented Aug 16, 2012 at 16:14

Michael Hardy · Accepted Answer · 2021-06-27 20:22:01Z

Alright, you say known variances. So it's an exercise on a point of theory, not a realistic problem.

And you actually assume the two sample sizes are equal.

Start by recalling something from the one-sample problem: $$ \bar{X} = \frac{X_1+\cdots+X_n}{n} \sim N\left(\mu_X,\frac{\sigma^2_X}{n} \right) $$ $$ \bar{Y} = \frac{Y_1+\cdots+Y_n}{n} \sim N\left(\mu_Y,\frac{\sigma^2_Y}{n} \right) $$ You don't explicitly state that the two samples are independent. If they are, then we have $$ \bar X - \bar Y \sim N\left(\mu_X-\mu_Y,\frac{\sigma^2_X+\sigma^2_Y}{n}\right) $$ (If we had unequal sample sizes $n$ and $m$, then the variance would be $\dfrac{\sigma^2_X}{n}+\dfrac{\sigma^2_Y}{m}$.)

So $$ \frac{((\bar X-\mu_X) - (\bar Y-\mu_Y))\sqrt{n}}{\sqrt{\sigma^2_X+\sigma^2_Y}} \sim N(0,1). $$ So the probability that $$ -A < \frac{(\bar X-\mu_X) - (\bar Y-\mu_Y)}{\sqrt{ \frac{\sigma^2_X+\sigma^2_Y}{n} }} <A \tag{1} $$ is the desired confidence $1-\alpha$ when the number $A$ is chosen as the upper $\alpha/2$ point of the standard normal distribution, that is, $\Phi(A) = 1-\alpha/2$ (for example, $A \approx 1.96$ for 95% confidence). Now do a bit of algebra to rearrange the inequalities $(1)$: $$ \bar X - \bar Y - A\sqrt{\frac{\sigma^2_X+\sigma^2_Y}{n}} < \mu_X-\mu_Y < \bar X - \bar Y + A\sqrt{\frac{\sigma^2_X+\sigma^2_Y}{n}} $$ That's the confidence interval.
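As a rough numerical illustration of the interval above, here is a minimal Python sketch. The function name `z_interval_diff`, the sample data, and the values of $\sigma_x$ and $\sigma_y$ are all made up for the example. It uses the general variance $\sigma_X^2/n + \sigma_Y^2/m$, so it also covers unequal sample sizes.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval_diff(x, y, sigma_x, sigma_y, alpha=0.05):
    """Interval for mu_x - mu_y, assuming sigma_x and sigma_y are known."""
    n, m = len(x), len(y)
    # A is the upper alpha/2 point of the standard normal distribution.
    a = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard deviation of X-bar minus Y-bar for independent samples.
    se = sqrt(sigma_x**2 / n + sigma_y**2 / m)
    diff = mean(x) - mean(y)
    return diff - a * se, diff + a * se

# Hypothetical data and hypothetical known standard deviations.
x = [5.1, 4.9, 5.6, 5.3, 4.8]
y = [4.2, 4.7, 4.4, 4.9, 4.3]
lo, hi = z_interval_diff(x, y, sigma_x=0.3, sigma_y=0.25)
print(f"95% interval for mu_x - mu_y: ({lo:.3f}, {hi:.3f})")
```

The interval is centred at $\bar X - \bar Y$, and its half-width depends only on $A$, the known standard deviations, and the sample sizes, not on the data.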

This was here for a few minutes without the factor of $A$ in two places in the last line. Now I hope it's correct.
– Michael Hardy, Commented Aug 16, 2012 at 17:02
I have a question about this answer. I am interested in calculating the confidence interval from combining the coefficients of two separate curve fits. Say I have two curve fits which give $R^{\alpha\pm A\sigma_\alpha}$ and $R^{\beta\pm A\sigma_\beta}$, where $R$ is the independent variable in the fit. In my case, I am unsure what the sample size $n$ should be.
– WnGatRC456, Commented Jan 18, 2020 at 0:27

Michael R. Chernick · Accepted Answer · 2012-08-16 16:07:20Z

When the standard deviations are unequal, the inference problem of comparing two means is often called the Behrens-Fisher problem. The pivotal quantity for testing or constructing confidence intervals is a "t-like" statistic obtained by taking the difference of the two sample means and dividing by the sample estimate of the standard error of the mean difference. The standard error is a function of the two unknown standard deviations and the sample sizes used. The estimate involves replacing the unknown variances with their sample estimates. The distribution of the test statistic under the null hypothesis that the means are equal is sometimes called Welch's distribution. It can be approximated by a t distribution whose degrees of freedom are fractional (not necessarily an integer). This approximation is called the Satterthwaite approximation. This Wikipedia link provides the detailed information:

Welch-Satterthwaite Approximation.
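To make the unknown-variance case concrete, here is a minimal Python sketch of the estimated standard error and the Welch-Satterthwaite degrees of freedom. The function name `welch_se_and_df` and the data are made up for illustration. The t quantile itself is not computed here; it would come from a table or a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_se_and_df(x, y):
    """Estimated standard error of X-bar minus Y-bar and approximate df."""
    n, m = len(x), len(y)
    # Sample variances (divisor n - 1) stand in for the unknown variances.
    vx = variance(x) / n
    vy = variance(y) / m
    se = sqrt(vx + vy)
    # Welch-Satterthwaite approximation; usually not an integer.
    df = (vx + vy) ** 2 / (vx**2 / (n - 1) + vy**2 / (m - 1))
    return se, df

# Hypothetical data with visibly different spreads.
x = [5.1, 4.9, 5.6, 5.3, 4.8, 5.0]
y = [4.2, 6.7, 3.4, 5.9, 4.3]
se, df = welch_se_and_df(x, y)
t_stat = (mean(x) - mean(y)) / se  # "t-like" statistic under mu_x = mu_y
print(f"se = {se:.4f}, df = {df:.2f}, t = {t_stat:.3f}")
```

The approximate interval is then $\bar X - \bar Y \pm t_{\alpha/2,\,\mathrm{df}} \cdot \mathrm{se}$, with the quantile taken from a t distribution with the fractional degrees of freedom computed above.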

If the variances are unequal and known, then the pivotal quantity to use is the one given by Mike, and it will have a standard normal distribution. In practice the variances are not known unless you have knowledge that they are the same as variances that have been previously estimated based on very large samples.
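Written out for sample sizes $n$ and $m$, and using $z_{\alpha/2}$ for the upper $\alpha/2$ point of the standard normal distribution (a symbol introduced here for convenience, not taken from the question), the known-variance case looks like this: $$ Z = \frac{(\bar X - \bar Y) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n} + \dfrac{\sigma_y^2}{m}}} \sim N(0,1), $$ so that a $100(1-\alpha)$% confidence interval for $\mu_x - \mu_y$ is $$ \bar X - \bar Y \pm z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}. $$ Nothing in this requires $\sigma_x = \sigma_y$; the two known variances simply enter the standard error separately.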

I see that the OP states that, but if the variances are known, how can you not know whether or not they are equal? In the last sentence the OP seems to be asking what to do if you do not know that the variances are equal.
– Michael R. Chernick, Commented Aug 16, 2012 at 16:04
