I am working on statistical inference with instrumental variables (IV) following Wooldridge (2016) Introductory Econometrics, Ch. 15. I am using the Card data set (like the book), with wages as outcome ($y$), education as a endogenous continuous treatment ($x$) and distance to college as a binary IV ($z$).
I want to calculate the standard errors manually, and preferably additionally in matrix form using Mata. So far, I am able to calculate coefficients but I can't seem to obtain the correct standard errors and would be happy for input on this.
I obtain the point estimate for $\beta_{IV}$ with the Wald-estimator:
$\beta_{IV}=\frac{\mathbb{E}[y | z = 1]-\mathbb{E}[y | z = 0]}{\mathbb{E}[x | z = 1]-\mathbb{E}[x | z = 0]}$,
$\beta_{IV}=\frac{6.311401-6.155494}{13.52703-12.69801}=.18806$
Cross-checked with Stata's -ivregress-:
. ivregress 2sls y (x=z), nohe ------------------------------------------------------------------------------ y | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x | .1880626 .0262826 7.16 0.000 .1365496 .2395756 _cons | 3.767472 .3487458 10.80 0.000 3.083943 4.451001 ------------------------------------------------------------------------------ I now want to proceed by calculating the standard errors. Wooldridge (2016, p. 466) writes that standard errors for $\beta_{IV}$ is obtained by using the square root of the estimated asymptotic variance, where the latter is obtained by
$Var(\beta_{IV})=\frac{\sigma^{2}}{SST_{x} \cdot R^{2}_{x,z}}$
First, $SST_{x}$ is the total sum of squares for $x_{i}$, calculated by
. use http://pped.org/card.dta, clear // Load Card data set . rename nearc4 z . rename educ x . rename lwage y . * SSTx . egen x_bar = mean(x) . gen SSTx = (x-x_bar)^2 . quiet sum SSTx . di r(sum) 21562.08 Second, $R^{2}_{x,z}$ is obtained from the regression output,
. reg x z, nohe ------------------------------------------------------------------------------ x | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- z | .829019 .1036988 7.99 0.000 .6256912 1.032347 _cons | 12.69801 .0856416 148.27 0.000 12.53009 12.86594 ------------------------------------------------------------------------------ . di .829^2 .687241 Finally, $\sigma^{2}$ is the error variance given by $SSE/(n-k-1)$ where the squared estimate of errors (SSE) is obtained by $SSE = \sum{(y_{i}-\hat{y_{i}})^{2}}$. Wooldridge says to use the IV residuals $\hat{u_{i}}$ in calculating the error variance,
$\sigma^{2}=\frac{1}{(n-2)} \sum{\hat{u_{i}}^2}$
Which I calculate in Stata as,
. quiet reg x z . predict x_hat (option xb assumed; fitted values) . quiet reg y x_hat, nohe . predict iv_resid (option xb assumed; fitted values) . quiet sum iv_resid . di r(sum) 18848.115 . di (18848.114)^2 3.553e+08 . gen sigma_squared = 3.553e+08 . tabstat sigma_squared, format(%20.2f) variable | mean -------------+---------- sigma_squa~d | 355300000.00 ------------------------ . di (1/(3010-2))*355300000 118118.35 Thus, when finally I substitute the values into the equation for the variance of $\beta_{IV}$, I get
$Var(\beta_{IV})=\frac{118118.35}{21562.08 \cdot .687241}=7.9711$
I then calculate the standard error by following the formula for standard error (e.g. Wooldridge 2016, p. 50):
$\hat{\sigma} = \sqrt{\hat{\sigma}^{2}} \implies \sqrt{7.9711}=2.8233$
$se(\beta_{IV})=\frac{\sigma}{\sqrt{SST_{x}}} \implies \frac{2.8233}{\sqrt{21562.08}}=0.01922 $
I have used quite some time on this and it would really be helpful with some input on what I am doing wrong.
EDIT: Based on the formula provided by Drunk Deriving, I've tried to calculate SE in Mata
. use http://pped.org/card.dta, clear . keep nearc4 educ lwage id . rename nearc4 Z . rename educ X . rename lwage y . mata : y=st_data(.,"y") : X=st_data(.,"X") : Z=st_data(.,"Z") : X = X, J(rows(X),1,1) // Add constant : Z = Z, J(rows(Z),1,1) // Add constant : b_iv = luinv(Z'*X)*Z'*y : e=y-X*b_iv : v=luinv(Z'*X)*Z'e*e'*Z*luinv(Z'*X) : xmean = mean(X) : tss_x = sum((X :- xmean) :^ 2) : se=sqrt(v)/tss_x : t=b_iv:/se : p=2*ttail(rows(X)-cols(X),abs(t)) : b_iv,se,t,p 1 2 3 4 5 6 7 +---------------------------------------------------------------------------------------------------+ 1 | .1880626042 . 1.69178e-17 . 1.11162e+16 . 0 | 2 | 3.767472015 4.17102e-17 . 9.03251e+16 . 0 . | +---------------------------------------------------------------------------------------------------+ : end