
I was going through the derivation of the unbiasedness of the OLS slope estimator:

$$E(\hat{\beta_1}) = \beta_1 + (1/SST_x) \sum_{i=1}^n d_i E(u_i) = \beta_1 + (1/SST_x) \sum_{i=1}^n d_i\cdot 0 = \beta_1$$

My doubt is this: if $$ d_i = x_i - \bar{x} $$ and assuming the Gauss–Markov assumption holds that $x_i$ is independent of $u_i$,

then how does $$ (1/SST_x) E(\sum_{i=1}^n d_i u_i) = (1/SST_x) \sum_{i=1}^n d_i E(u_i) $$ hold?

How can we treat $d_i$ as a constant, and why didn't we take the expectation of $d_i$? What is the intuition behind this?

  • Your equations seem to be for simple linear regression, instead of the more general ordinary least squares method. Commented Jul 30, 2020 at 8:14
  • Yes. I just want to understand from what intuition $d_i = x_i - \bar{x}$ is considered constant in this equation. Commented Jul 30, 2020 at 8:19

1 Answer


You can see this as the estimator being unbiased conditional on $x$, in which case the $d_i$ can be treated as constants.

(And since the conditional expectation of the error term is zero for every value of $x$, the unconditional expectation is zero as well.)
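Spelled out in the notation of the question, the conditional argument is:

$$E(\hat{\beta_1} \mid x) = \beta_1 + (1/SST_x) \sum_{i=1}^n d_i\, E(u_i \mid x) = \beta_1 + (1/SST_x) \sum_{i=1}^n d_i \cdot 0 = \beta_1$$

Conditional on $x$, the $d_i = x_i - \bar{x}$ are fixed numbers, so they pass through the expectation; the law of iterated expectations then gives $E(\hat{\beta_1}) = E\big(E(\hat{\beta_1} \mid x)\big) = \beta_1$.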


The derivation in your question is, strictly speaking, only correct for constant $x$.

But the same result holds for random $x$, because the expectation conditional on $x$ is zero regardless of its value. (Without this extra step, from constant to conditional, the derivation is indeed incomplete.)
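To see this concretely, here is a small Monte Carlo sketch in Python (with hypothetical values $\beta_0 = 1$, $\beta_1 = 2$, assuming numpy is available): the $x_i$ are drawn once and then held fixed, and only the errors $u_i$ are redrawn, so the $d_i$ really are the same constants in every replication.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 1.0, 2.0   # hypothetical true coefficients
n, reps = 50, 20_000

# Draw x once and hold it fixed: this is the "conditional on x" view,
# so d_i = x_i - xbar are the same constants in every replication.
x = rng.normal(10, 1, n)
d = x - x.mean()
sst_x = np.sum(d**2)

# Redraw only the errors u each time and recompute beta1-hat = sum(d*y)/SST_x.
estimates = np.empty(reps)
for r in range(reps):
    u = rng.normal(0, 2, n)
    y = beta0 + beta1 * x + u
    estimates[r] = np.sum(d * y) / sst_x

print(estimates.mean())  # close to beta1 = 2
```

Averaging over many error draws while $x$ (and hence each $d_i$) stays fixed recovers $\beta_1$, which is exactly the conditional-unbiasedness statement.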


The plot below gives an intuition for how the marginal distribution of $y$ can be considered as a sum of the distributions of $y$ conditional on $x_1$.

[figure: scatter plot of $(x_1, y)$ with points coloured by $x_1$, alongside the conditional distributions of $y$ for narrow slices of $x_1$]

The data here has been generated according to some silly/fancy model:

$$\begin{array}{} x_1 &\sim& N(\mu = 10, \sigma = 1) \\ y \vert x_1 &\sim & N(\mu = 0, \sigma = 2+ \sin(5x_1)) \end{array}$$

You can consider the distribution of $y$ as a sum of the distributions/components of $y$ conditional on $x_1$. When each of these components has mean zero, the marginal distribution will have mean zero as well.

    ### settings for layout and computation
    set.seed(1)
    layout(matrix(1:2, 1), widths = c(2, 1))
    par(mar = c(4, 4, 2, 1), mgp = c(2.5, 0.8, 0))

    ### generate data
    x <- rnorm(10^4, 10, 1)
    y <- rnorm(10^4, 0, 2 + sin(x*5))

    ### conditional colouring
    col <- hsv(x/14, 1, 1, 0.1)

    ### scatterplot
    plot(x, y, col = col, bg = col, pch = 21, cex = 0.7, ylim = c(-10, 10),
         main = "scatter plot \n", cex.main = 1, xlab = expression(x[1]))

    ### density plots for different conditions x
    plot(-100, -100, ylim = c(-10, 10), xlim = c(0, 220), xlab = "", ylab = "",
         main = "conditional \n distribution", cex.main = 1)
    for (xs in seq(7, 13, 0.4)) {
      sel <- ((x > xs - 0.2) * (x < xs + 0.2))
      h <- hist(y[sel == 1], breaks = seq(-15, 15, 0.4), plot = FALSE)
      lines(h$counts, h$mids, col = hsv(xs/14, 1, 1, 0.5), lwd = 2)
    }
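As a quick numerical check of this "sum of conditional components" idea, here is a Python sketch mirroring the R model above: every conditional slice of $y$ has mean near zero, and so does the marginal distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same generating model as the R example: y | x1 ~ N(0, 2 + sin(5 * x1)).
x1 = rng.normal(10, 1, 10**5)
y = rng.normal(0, 2 + np.sin(5 * x1))

# Each conditional slice of y has mean ~0 ...
for lo in (8.0, 9.0, 10.0, 11.0):
    sel = (x1 > lo) & (x1 < lo + 1)
    print("slice", lo, "mean:", y[sel].mean())

# ... so the marginal mean is ~0 as well (law of total expectation).
print("marginal mean:", y.mean())
```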
  • How is $d_i$ constant for given $x$? I know that $\sum_{i=1}^n d_i$ is zero, but here it is $\sum_{i=1}^n d_i u_i$. So how does $E[\sum_{i=1}^n d_i u_i \mid X]$ make $d_i$ constant? Is there any proof or explanation that makes $d_i$ constant for given $x$? Commented Jul 30, 2020 at 7:41
  • How is $d_i$ constant for given $x$? Because $d_i$ is a function of $x$: if $x$ is constant, then $d_i$ is constant. Commented Jul 30, 2020 at 8:11
  • $d_i = x_i - \bar{x}$, where $\bar{x}$ is the mean of the $x_i$. Thus $\sum_{i=1}^n (x_i - \bar{x}) = 0$; all $x_i$ are independent and identically distributed. Commented Jul 30, 2020 at 8:15
  • When I say $x$ is constant, I mean that all $x_i$ are constant. It might be that you consider a case with varying $x_i$, but for a conditional expectation you can treat the $x_i$ and $d_i$ as constant. To go from constant $x_i$ to non-constant $x_i$ you can use: $$E(a) = \sum_{\forall b} E(a \mid b) p(b)$$ So if the conditional expectations are zero (for constant $x$), then so is the marginal expectation (for varying $x$). Commented Jul 30, 2020 at 8:18
  • So the trick for varying $x$ is to compute the expectation of $\hat{\beta_1}$ for each individual (constant) $x$, and then get the overall expectation by summing/integrating over all the possible states of $x$ (weighted by their probability). This sum is easy because each conditional term is zero. Commented Jul 30, 2020 at 8:24
