3
$\begingroup$

I have a question with regard to the implementation of Bang & Robins (2005) double robust estimator of a treatment effect (formula at the bottom of page 964). The idea of their estimator is to estimate the propensity score and then to use the inverse of this propensity score in a regression model.

If I implement the formula as described in the paper, I do not get the correct causal effect (in a simulated example), even if I specify the treatment model (the propensity score model) and the outcome model correctly. Here is a simulated example in R:

 # simulate a example: set.seed(123) n <- 10000 s <- matrix( 0.3, 3, 3 ); diag(s) <- 1 X <- mvtnorm::rmvnorm( n, rep(0,3), s ) # the covariates dfs <- data.frame( X ) # treatment model: pr <- 1/ ( 1 + exp( - (-1 + 0.3*dfs$X1 + 0.2*dfs$X2 + 0.1*dfs$X3 ) ) ) dfs$D <- rbinom( n, 1, pr ) # outcome model: dfs$Y <- with( dfs, 0.3*D + 0.2*X1 + 0.2*X2 + 0.2*X3 + rnorm( n, 0, 1 ) ) 

and here is my code for the formula (I tried to stick to the notation in the paper; D = Delta, V = X = the covariates, pi = p = propensity score):

 # propensity scores: fitPS <- glm( D ~ X1 + X2 + X3, data = dfs, family = binomial("logit") ) dfs$p <- predict( fitPS, data = dfs, type = "response" ) # outcome regression model with weights: dfs$w <- ifelse( dfs$D == 1, 1/dfs$p, 1/(1-dfs$p) ) # D = 1 = Treatment group, 0 = controls fitOR <- lm( Y ~ D + X1 + X2 + X3 + w, data = dfs ) # get predicted values (called e in the paper): D <- dfs$D # save D for later dfs$D <- 0 E0 <- predict( fitOR, newdata = dfs ) dfs$D <- 1 E1 <- predict( fitOR, newdata = dfs ) # the formula: dfs$D <- D dfs$e1 <- ifelse( dfs$D == 1, E1, 0) dfs$e0 <- ifelse( dfs$D == 0, E0, 0) mu <- mean(dfs$e1) - mean(dfs$e0) mu 

I get a mu of 0.15 which is half of the true effect.

A side note: In their book Hernan and Robins also suggest this estimator but they say that one has to use 1/prop_score in the treatment group and (-1)*1/(1-prop_score) in the control group for the weights in the regression model above. If I implement this, the estimator is the same (about 0.15).

To be honest, I am confused. I don't understand what I'm doing wrong. Perhaps anyone of you has an idea?

Thanks Stefan

$\endgroup$

1 Answer 1

3
$\begingroup$

The last few lines are incorrect. The ATE is estimated by mean(E1) - mean(E0). With this change, the answer you'll get is 0.3411855, which is closer to the truth. Using a larger sample size or repeating the experiment in a Monte Carlo simulation yields the correct answer, indicating the method is indeed unbiased and consistent.

Note that because you programmed the data so that there is no effect modification by the covariates, the coefficient on D in fitOR is equal to the ATE.

$\endgroup$
2
  • 1
    $\begingroup$ Thanks!!! The last lines are rubbish from programming the Davidian & Lunceford estimator; so I guess I was too attached to it. Thanks again! $\endgroup$ Commented Apr 30, 2020 at 19:07
  • $\begingroup$ No problem. You can mark this question as solved if you're satisfied with the solution. $\endgroup$ Commented Apr 30, 2020 at 19:22

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.