
I am thinking of using a weighted least squares method and checking what this approach actually does. In particular, I am checking how lm(., weights = w) matches the theoretical equations. The estimate can be calculated as follows:

$$ \hat{\beta} = (X'WX)^{-1}X'WY$$

I can reproduce this estimate with the code below, but I cannot reproduce the standard error. In particular, I use

\begin{equation} \begin{split} Var[\hat\beta|X] &= E[(X'WX)^{-1}X'We\cdot ((X'WX)^{-1}X'We)'\mid X]\\ &= E[(X'WX)^{-1}X'We\cdot e' W' X (X'WX)^{-1}\mid X]\\ &= (X'WX)^{-1}X'W\,E[e\cdot e'\mid X]\, W' X (X'WX)^{-1} \end{split} \end{equation} If $E[e\cdot e'\mid X] = \sigma^2 I$, I get Case 1 in the code below. If $E[e\cdot e'\mid X] = \operatorname{diag}(\sigma_i^2)$, I get Case 2. Neither case matches the result from the R function. How do I match this value?

x0 = c(1,1,1,1,1,1,1,1,1,1)
x1 = c(1,1,1,1,1,0,0,0,0,0)
x = cbind(x0, x1)
y = c(10,9,8,7,6,5,4,3,2,1)
w = c(0.5, 0.5, 0.5, 0.5, 0.5, 1, 1, 1, 1, 1)

summary(lm(y ~ x1, weights = w))
# x1 Estimate: 5.0000 Std.Error: 1.0607

## By hand
n = length(x1)
k = ncol(x)
W = matrix(diag(w), ncol = n)
beta = solve(t(x) %*% W %*% x) %*% t(x) %*% W %*% y
mse = sum((y - t(t(beta) %*% t(x)))^2) / (n - k)
mse2 = c((y - t(t(beta) %*% t(x)))^2)
e = matrix(diag(mse2), ncol = n)

## Case 1: Std.Error = 1
sqrt(mse * (solve(t(x) %*% W %*% x) %*% t(x) %*% W %*% t(W) %*% x %*% solve(t(x) %*% W %*% x))[2, 2])

## Case 2: Std.Error = 0.89
sqrt((solve(t(x) %*% W %*% x) %*% t(x) %*% W %*% e %*% t(W) %*% x %*% solve(t(x) %*% W %*% x))[2, 2])
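For reference, the quantities that need to be reproduced can also be pulled straight from the fitted object (a small check added here, using base R's vcov() and summary()):

fit <- lm(y ~ x1, weights = w)
sqrt(diag(vcov(fit)))   # reported standard errors; the x1 entry is 1.0607
summary(fit)$sigma^2    # residual variance estimate that lm() uses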

1 Answer


The first step is to apply the fact $\text{var}(AX) = A\,\text{var}(X)A^T$ for $A$ a constant matrix and $X$ a vector of random variables. Using this you get:

$$ \text{var}(\hat{\beta}) = (X^TWX)^{-1}X^TW \,\text{var}(Y)\, WX(X^TWX)^{-1} $$

And, if $W$ is selected so that $\text{var}(Y) = W^{-1} \sigma^2$, the sandwich collapses:

$$ \text{var}(\hat{\beta}) = (X^TWX)^{-1}X^TW\,W^{-1}\sigma^2\,WX(X^TWX)^{-1} = \sigma^2(X^TWX)^{-1}, $$

and the resulting $\hat{\beta}$ can be shown to be the best linear unbiased estimator (BLUE) by the Gauss-Markov theorem.
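To see what that condition buys you, here is a small simulation sketch (the data-generating setup below is illustrative, not from the original question): each $y_i$ is drawn with variance $\sigma^2/w_i$, so $\text{var}(Y) = W^{-1}\sigma^2$ holds by construction, and the empirical spread of the WLS slope matches $\sigma^2(X^TWX)^{-1}$.

## Illustrative simulation: weights equal to inverse relative variances
set.seed(42)
sigma <- 2
x1s <- rep(c(1, 0), each = 5)
ws  <- rep(c(0.5, 1), each = 5)          # so that var(y_i) = sigma^2 / w_i
slopes <- replicate(2000, {
  ys <- 3 + 5 * x1s + rnorm(10, sd = sigma / sqrt(ws))
  coef(lm(ys ~ x1s, weights = ws))["x1s"]
})
sd(slopes)                               # empirical sd of the WLS slope
X <- cbind(1, x1s)
sigma * sqrt(solve(t(X) %*% diag(ws) %*% X)[2, 2])   # sqrt of sigma^2 (X'WX)^{-1} [2,2]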

You can replicate the output from lm() by modifying your code as below:

fit <- lm(y ~ x1, weights = w)
summary(fit)
# x1 Estimate: 5.0000 Std.Error: 1.0607

## Residual variance from the *weighted* residual sum of squares, df = n - 2
sigma2 <- sum(w * fit$residuals^2) / (length(x0) - 2)
xtwx <- t(x) %*% diag(w) %*% x
diag(solve(xtwx) * sigma2)^0.5   # matches the standard errors reported above
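As a quick sanity check (reusing sigma2, xtwx, and fit from the snippet above), the hand-computed covariance matrix agrees with the one lm() stores:

## Compare the full covariance matrices, ignoring dimnames
all.equal(solve(xtwx) * sigma2, vcov(fit), check.attributes = FALSE)   # TRUE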
  • I know this fact. I can't understand the meaning of "if $W$ is selected so that $var(Y) = W^{-1}\sigma^2$", because $W$ is defined. And I think the first equation is the same as my equation. This yields 1 or 0.89, which does not match 1.0607. Commented Dec 14, 2021 at 19:54
  • @hrkshr It is not the same as your equation; I'm afraid your derivation is too wrong to correct. For 1., there should be no "expectation" operator unless you're applying var(x) = e(x^2) - e(x)^2. Also, $X$ is not random, and even if it were, you already conditioned on $X$ in the LHS of your first display. Lastly, the variance of $Y$ - conditional on $X$ - is not $e$, it is $\sigma^2$. Commented Dec 14, 2021 at 21:21
  • I see your point. But how do I match the value? That is the main question. Commented Dec 14, 2021 at 21:26
  • @hrkshr I provided the code snippet to address the problem. Commented Dec 14, 2021 at 21:35
  • Thanks. So, if $W$ is not selected so that $var(Y) = W^{-1}\sigma^2$, we should not use lm(., weights), right? And to calculate sigma2, we have to use w? Commented Dec 14, 2021 at 21:50
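On the last comment's question about sigma2: the residual variance that summary() reports for a weighted fit is computed from the weighted residuals, which a quick check confirms (reusing fit and w from above):

## sigma reported by summary() uses the weighted residual sum of squares
summary(fit)$sigma^2
sum(w * fit$residuals^2) / fit$df.residual   # same value
sum(fit$residuals^2) / fit$df.residual       # the unweighted version differs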
