I have a continuous variable on the y axis and a categorical variable on the x axis. The order of the categorical variable's levels is meaningful, so it would make sense to fit a regression against the level index: instead of c('a', 'b', 'c'), use the indices (order(c('a', 'b', 'c')), which is c(1, 2, 3)) and fit the model against those. However, ggplot refuses to fit a geom_smooth(method = lm) if one of the variables is not numeric. OK, so I can tell it to use the order:

    geom_smooth(aes(x = order(hgcc), y = rtmean), method = lm)

But then it takes the indices from the whole column of the data frame, which does not work when faceting with scales = 'free', where only a subset of the levels of the x variable appears in each panel. The indices over the whole data frame are much higher on average, so the regression line is plotted far to the right.
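
To make the index mismatch concrete, here is a minimal base R sketch on toy data. The data frame `df` and the column names `idx_global` and `idx_local` are made up for illustration; only `hgcc` and `uhgroup` mirror columns from my real data. It shows that order() over the whole column yields indices that keep growing across groups, while an index computed within each group restarts at 1, which is what a per-panel fit would need.

```r
# Toy data (hypothetical): two facet groups with three ordered levels each
df <- data.frame(
  uhgroup = c('A', 'A', 'A', 'B', 'B', 'B'),
  hgcc    = c('a1', 'a2', 'a3', 'b1', 'b2', 'b3'),
  stringsAsFactors = FALSE
)

# order() works on the whole column, so group B gets 4..6 instead of 1..3.
# (order() is a sorting permutation; it matches the rank here only because
# the toy column is already sorted.)
df$idx_global <- order(df$hgcc)

# Index of each value among the levels present in its own group only
df$idx_local <- ave(
  seq_along(df$hgcc), df$uhgroup,
  FUN = function(i) as.integer(factor(df$hgcc[i]))
)

print(df)
```

Computing something like `idx_local` by hand before plotting is one possible route, but it assumes the within-group level order is the alphabetical one that factor() produces by default.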
Here is a minimal working example:

    require(ggplot2)
    load(url('http://www.ebi.ac.uk/~denes/54b510889336eb2591d8beff/sample_data.RData'))

    ggplot(adata12cc, aes(x = hgcc, y = rtmean, color = cls, size = log10(intensity))) +
        geom_point(stat = 'sum', alpha = 0.33) +
        # this is the problematic layer: order() indexes the whole column
        geom_smooth(aes(x = order(hgcc), y = rtmean), method = 'glm') +
        facet_wrap(~ uhgroup, scales = 'free') +
        scale_radius(guide = guide_legend(title = 'Intensity (log)')) +
        scale_color_discrete(guide = guide_legend(title = 'Class')) +
        xlab('Carbon count unsaturation') +
        ylab('Mean RT [min]') +
        ggtitle('RT vs. carbon count & unsaturation by headgroup') +
        theme(axis.title = element_text(size = 24),
              axis.text.x = element_text(angle = 90, vjust = 0.5, size = 9, hjust = 1),
              axis.text.y = element_text(size = 11),
              plot.title = element_text(size = 21),
              strip.text = element_text(size = 18),
              panel.grid.minor.x = element_blank())

I know this is not a nice way of doing things, but ggplot could make life so much easier if I could refer to the variables as they are subsetted by faceting anyway, and do something with them.

