4
$\begingroup$

Let's imagine the quantile regression (qgam) of the tensor product below.

library("mgcv") library("qgam") library("gratia") library("ggplot2") library("dplyr") library("geomtextpath") # qgam qg0 <- qgam(z0 ~ te(x0, y0, k=c(5,5), bs=c("cr","cr")), data=dat0, qu=0.5) # plot using gratia sm0 <- smooth_estimates(qg0); sm0 dw0 <- draw(qg0) dw0$layers[[2]] <- NULL dw0 + ggtitle(label = "dw0 (partial effect)", subtitle = "te(x0,y0)") + geom_point(data=dat0, aes(x=x0, y=y0), size=3, shape=21, fill="black", alpha=0.3) + geom_textcontour(data=sm0, aes(z = .estimate), fontface = 'bold', linetype = NA) + geom_textcontour(data=sm0, aes(z = .estimate), fontface = 'bold', textcolour = NA) 

enter image description here

Questions:

  1. To be sure, is the 0 line represents the overall median?
  2. In order to model the response instead of the partial effect, and waiting for the announced response argument, is it appropriate to add the constant in the following way? The answer here partially answers the question, but may not be applicable to a bivariate smoothing model (?), and does not provide the detailed code.

enter image description here

# constant b0 <- coef(qg0)[1] # plot using gratia, adding the constant b0 dw1 <- draw(qg0, constant = b0) dw1$layers[[2]] <- NULL dw1 + ggtitle(label = "dw1 (response)", subtitle = "te(x0,y0)") + geom_point(data=dat0, aes(x=x0, y=y0, col=z0), size=3, shape=21, fill="black", alpha=0.3, show.legend = FALSE) + geom_textcontour(data=sm0, aes(z = .estimate + b0), fontface = 'bold', linetype = NA) + geom_textcontour(data=sm0, aes(z = .estimate + b0), fontface = 'bold', textcolour = NA) + theme(legend.title = element_blank()) 

Data:

dat0 <- structure(list(x0 = c(2.44, 2.76, 2.53, 2.31, 2.9, 2.51, 2.82, 2.98, 2.53, 2.42, 2.66, 2.36, 2.78, 2.44, 2.24, 2.31, 2.65, 2.93, 2.71, 2.62, 2.66, 2.37, 2.99, 2.9, 2.7, 2.9, 2.98, 2.55, 2.28, 2.36, 2.19, 2.85, 2.75, 2.65, 2.69, 2.77, 2.71, 2.2, 2.54, 2.52, 2.49, 2.7, 2.83, 2.99, 2.87, 2.66, 2.65, 2.6, 2.16, 2.04, 2.66, 2.11, 2.81, 2.76, 2.88, 2.49, 2.56, 2.87, 2.98, 2.59, 2.91, 2.11, 2.58, 2.72, 2.47, 2.87, 2.51, 2.07, 2.95, 2.66, 2.88, 2.61, 2.62, 2.5, 3, 2.76, 2.91, 2.67, 2.29, 2.83, 2.43, 2.37, 2.78, 3, 2.1, 2.34, 2.73, 2.95, 2.69, 2.88, 2.95, 2.95, 2.83, 2.66), y0 = c(0.7, 1.48, 5.85, 3.47, 0.48, 6.38, 0.64, 2.24, 4.9, 0.29, 4.54, 2.08, 0.9, 0.53, 1.2, 3.95, 2.65, 1.32, 5.59, 1.14, 1.96, 1.06, 5.9, 2.2, 0.54, 3.49, 1.51, 2.5, 0.67, 3.05, 1.47, 2.05, 1.25, 1.01, 1.98, 1.1, 2.72, 1.75, 2.92, 2.83, 0.89, 0.43, 2.25, 0.96, 1.57, 2.92, 3.15, 0.72, 0.41, 3.64, 2.9, 7.48, 0.71, 1.91, 3.19, 0.8, 2.74, 5.05, 1.35, 0.9, 1.15, 1.12, 2.65, 1.76, 3.12, 5.38, 3.47, 1.92, 1.31, 2.29, 0.89, 0.92, 0.8, 2.1, 1.5, 2.69, 2.03, 0.52, 1.79, 3.06, 4.28, 5.13, 0.55, 2.61, 0.67, 2.8, 0.77, 0.61, 4.7, 0.77, 0.54, 4.44, 2.25, 0.46), z0 = c(1.7, 4.09, 14.79, 8.01, 1.4, 16.02, 1.81, 6.66, 12.39, 0.7, 12.04, 4.91, 2.51, 1.29, 2.69, 9.12, 7.02, 3.86, 15.14, 2.98, 5.2, 2.51, 17.66, 6.37, 1.46, 10.12, 4.5, 6.37, 1.52, 7.19, 3.22, 5.84, 3.45, 2.69, 5.32, 3.04, 7.37, 3.86, 7.42, 7.13, 2.22, 1.17, 6.37, 2.87, 4.5, 7.78, 8.36, 1.87, 0.88, 7.42, 7.72, 15.79, 1.99, 5.26, 9.18, 1.99, 7.02, 14.5, 4.03, 2.34, 3.33, 2.37, 6.84, 4.79, 7.72, 15.44, 8.71, 3.98, 3.86, 6.08, 2.57, 2.4, 2.1, 5.26, 4.5, 7.42, 5.9, 1.4, 4.09, 8.65, 10.41, 12.16, 1.52, 7.83, 1.4, 6.55, 2.1, 1.81, 12.63, 2.22, 1.58, 13.1, 6.37, 1.23)), row.names = c(NA, -94L ), class = "data.frame") 

Edit
Representation of the mean (blue) and median (red) lines of x0 and y0 to illustrate Lukas Lohse's answer:
enter image description here

 dw0 + ggtitle(label = "dw0 (partial effect)\nwith means (blue) and medians (red) lines of x0 and y0", subtitle = "te(x0,y0)") + geom_point(data=dat0, aes(x=x0, y=y0), size=3, shape=21, fill="black", alpha=0.3) + geom_textcontour(data=sm0, aes(z = .estimate), fontface = 'bold', linetype = NA) + geom_textcontour(data=sm0, aes(z = .estimate), fontface = 'bold', textcolour = NA) + geom_vline(xintercept = mean(dat0$x0), color="blue", linewidth=1) + geom_hline(yintercept = mean(dat0$y0), color="blue", linewidth=1) + geom_vline(xintercept = median(dat0$x0), color="red", linewidth=1) + geom_hline(yintercept = median(dat0$y0), color="red", linewidth=1) 

Thanks for help.

$\endgroup$
3
  • 2
    $\begingroup$ Side note: with GAMs you need separate model fits for means and for quantiles. With semiparametric ordinal regression you get quantiles, means, and exceedance probabilities all from one fit, and results are $Y$-transformation-invariant except for estimated means. $\endgroup$ Commented Aug 31 at 13:11
  • $\begingroup$ Thank you Frank, but I'm having trouble translating this into practice. Does Lukas's answer illustrate or relate to your note? $\endgroup$ Commented Aug 31 at 20:41
  • $\begingroup$ No. See instead hbiostat.org/rmsc/cony $\endgroup$ Commented Aug 31 at 21:32

1 Answer 1

5
$\begingroup$

Question 1: No, 0 in the partial effect corresponds to the intercept which is 5.8 while median(dat0$z0) is 4.85! What the intercept means depends on the other variables in the regression. Usually it would represent the estimate for all predictor-variables being 0 but for regularized regressions, like in te, mean centering is generally applied and so it correspond to the estimate at the mean values. Therefore the 0-line is the contour line that goes through the point mean(x0), mean(y0).

gratia doesn't install properly on my machine so here it is in more base-R:

library(qgam) qg0 <- qgam(z0 ~ te(x0, y0, k=c(5,5), bs=c("cr","cr")), data=dat0, qu=0.5) summary(qg0) median(dat0$z0) plot(qg0) points(y0~x0, data = dat0) abline(v = c(0, mean(dat0$x0)), h = c(mean(dat0$y0), 0)) 

Simpler version of the partial effect plot with the mean,mean-point marked by lines.

Question 2: Yes, it is appropriate for the point estimates but you have to also add the variance of the intercept estimate if you want confidence intervals or anything of the sort. See for example my answer here: https://stats.stackexchange.com/a/613385/341520

$\endgroup$
6
  • $\begingroup$ Thanks Lukas. Indeed, line 0 passes through the means of x0 and y0 and not through their medians (see Edit). Does this mean that it is not possible to smooth the interaction quantiles (te and ti) with qgam? If so, is there another way to do it? $\endgroup$ Commented Aug 31 at 20:36
  • $\begingroup$ I'm not sure what you are trying to say with "interaction quantile". te, ti work perfectly for estimating conditional quantiles (for you: of z) conditional on x,y with or without interaction between x,y. I can explain this in more detail if you can tell me more about your background but basically the appearance of the means of x,y are purely technical and have no* impact unless you are specifically looking at the partial effect instead of the overall prediction. $\endgroup$ Commented Aug 31 at 21:29
  • $\begingroup$ @denis if you don't have concrete follow up questions I'd really appreciate if you could accept the answer. It's good for my ego :) $\endgroup$ Commented Sep 2 at 19:03
  • $\begingroup$ So regarding question 1 and to summarize, I understand that here the surface of the response z0 generated by 'te' is indeed a median surface (not a mean surface) centered in z-coordinate on the intercept 5.8, but that this surface is x- and y-centered on the means of x0 and y0, is that correct? $\endgroup$ Commented Sep 4 at 4:42
  • 1
    $\begingroup$ Lukas, thank you for your time and these very clear and informative explanations. $\endgroup$ Commented Sep 4 at 21:20

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.