The answer depends on the covariance between $\beta_2$ and $\beta_3$. You want to test the null hypothesis that $\beta_2 + \beta_3 = 0$.
To do this, we need a vector $\delta$ whose dot product with the model's coefficient vector yields $\beta_2 + \beta_3$. With $\beta = [\beta_0, \beta_1, \beta_2, \beta_3]$, the choice $\delta = [0, 0, 1, 1]$ works, since $\delta^T \beta = 0\cdot\beta_0 + 0\cdot\beta_1 + 1\cdot\beta_2 + 1\cdot\beta_3 = \beta_2 + \beta_3$. We then need to determine the standard error of $\beta_2 + \beta_3$, which is
$$ \sqrt{\delta^T \Sigma \delta} $$
where $\Sigma$ is the covariance matrix from the model. This can be done as follows in R:
```r
# Simulate some data to fit a model
set.seed(2)
n <- 1000
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, 0.5)
eta <- -1 + 0.5*x1 + 0.25*x2 + 0.15*x1*x2
mu <- exp(eta)
theta <- 1/2
y <- MASS::rnegbin(n, mu, theta)
d <- data.frame(x1, x2)

fit <- MASS::glm.nb(y ~ x1*x2, data = d)

Sigma <- vcov(fit)
delta <- c(0, 0, 1, 1)
beta <- coef(fit)

# Get the variance of b2 + b3
vr <- delta %*% Sigma %*% delta
se <- sqrt(vr)
z <- (beta %*% delta) / se
p <- 2*(1 - pnorm(abs(z)))
# 0.01221841
```
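For this particular $\delta$, the quadratic form expands into the familiar variance formula for a sum, which makes the role of the covariance explicit:

$$ \delta^T \Sigma \delta = \operatorname{Var}(\beta_2) + \operatorname{Var}(\beta_3) + 2\operatorname{Cov}(\beta_2, \beta_3) $$

If the covariance is large and negative, the sum can be significant even when neither coefficient is significant on its own (and vice versa).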
You should also be able to do this with the {marginaleffects} package, as follows:
```r
library(marginaleffects)

pred <- predictions(
  model = fit,
  newdata = datagrid(x1 = 1, x2 = c(1, 0)),
  type = 'link'
)
pred
#  x1 x2 Estimate Std. Error     z Pr(>|z|)    S  2.5 %  97.5 %
#   1  1   -0.158      0.116 -1.36    0.175  2.5 -0.385  0.0703
#   1  0   -0.587      0.126 -4.66   <0.001 18.3 -0.834 -0.3401
# Columns: rowid, estimate, std.error, statistic, p.value, s.value,
# conf.low, conf.high, x1, x2
# Type: link

hypotheses(pred, hypothesis = 'pairwise')
#           Term Estimate Std. Error    z Pr(>|z|)   S  2.5 % 97.5 %
#  Row 1 - Row 2    0.429      0.171 2.51   0.0122 6.4 0.0935  0.765
# Columns: term, estimate, std.error, statistic, p.value, s.value,
# conf.low, conf.high
# Type: link
```
Explanation of {marginaleffects}
The `predictions` function first computes the predicted values for the cases where $x_1=1, x_2=1$ and $x_1=1, x_2=0$. The values are returned on the scale of the link function thanks to the `type = 'link'` argument.
The first row is equivalent to
$$ \beta_0 + \beta_1 + \beta_2 + \beta_3 $$
The second row is equivalent to
$$ \beta_0 + \beta_1 $$
Using the `hypotheses` function, I can test the difference between the first and second row, which is equivalent to
$$ \beta_2 + \beta_3 $$
which is what we want. The package will handle the standard error computation for us, but it is more or less what I have shown above.
Edit:
The easiest way to get the answer with {marginaleffects} is by referencing the coefficients in the model as follows (h/t robertspierre)
```r
hypotheses(
  model = fit,
  hypothesis = 'x2 + `x1:x2` = 0'
)
#>
#>            Term Estimate Std. Error    z Pr(>|z|)   S  2.5 % 97.5 %
#>  x2 + `x1:x2`=0    0.429      0.171 2.51   0.0122 6.4 0.0935  0.765
#>
#> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
```