I'd like to plot a histogram and a density curve on the same plot. To the code below I would like to add a custom y-axis label, something like sprintf("[%s] %s", ..density.., ..count..), i.e. two numbers at each tick value. Is this possible with scale_y_continuous, or do I need a workaround?

Below is my current progress using scales::trans_new and sec_axis. The sec_axis version is acceptable, but the output I want most is the one in the image below.

    library(ggplot2)
    library(scales)  # trans_new(), percent

    set.seed(1)
    var <- rnorm(4000)
    binwidth <- 2 * IQR(var) / length(var) ^ (1 / 3)

    count_and_proportion_label <- function(x) {
      sprintf("%s [%.2f%%]", x, x / sum(x) * 100)
    }

    ggplot(data = data.frame(var = var), aes(x = var, y = ..count..)) +
      geom_histogram(binwidth = binwidth) +
      geom_density(aes(y = ..count.. * binwidth)) +
      scale_y_continuous(
        # this way
        trans = trans_new(name = "count_and_proportion",
                          format = count_and_proportion_label,
                          transform = function(x) x,
                          inverse = function(x) x),
        # or this way
        sec.axis = sec_axis(trans = ~./sum(.), labels = percent, name = "proportion (in %)")
      )

I tried building the breaks beforehand from the output of graphics::hist, but the two histograms differ.

    bins <- (max(var) - min(var)) / binwidth
    hdata <- hist(var, breaks = bins, right = FALSE)  # hist generates different bins than `ggplot2`
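One likely reason for the mismatch: a single number passed as `breaks` to hist() is documented as a suggestion only, and the actual break points are set to pretty values. A minimal sketch, assuming it is acceptable to choose the bin edges yourself, is to build one explicit vector of edges and hand it to both functions. The names `lo`, `n_bins` and `bin_edges` are made up for this example, and aligning the edges to multiples of `binwidth` is an arbitrary choice, not ggplot2's default placement.

```r
set.seed(1)
var <- rnorm(4000)
binwidth <- 2 * IQR(var) / length(var)^(1 / 3)

# One explicit vector of bin edges covering the data.
# Alignment to multiples of binwidth is an arbitrary choice for this sketch.
lo <- floor(min(var) / binwidth) * binwidth
n_bins <- ceiling((max(var) - lo) / binwidth)
bin_edges <- lo + binwidth * (0:n_bins)

# Base R uses a vector of breaks exactly as given.
hdata <- hist(var, breaks = bin_edges, plot = FALSE)

# geom_histogram() accepts the same vector through its `breaks` argument.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(var = var), aes(x = var)) +
    geom_histogram(breaks = bin_edges)
  # Counts as computed by ggplot2, for comparison with hist().
  gg_counts <- ggplot_build(p)$data[[1]]$count
  print(all.equal(hdata$counts, gg_counts))
}
```

With shared edges the two sets of counts should agree. Note that the sketch leaves hist() at its default right = TRUE, since right-closed bins are also the default in ggplot2.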

At the end I would like to get something like this:

[image: desired output]

2 Answers

Would it be acceptable to add percentage as a secondary axis? E.g.

your_plot + scale_y_continuous(sec.axis = sec_axis(~.*2, name = "[%]")) 

Perhaps it would be possible to overlay the secondary axis on the primary one, but I'm not sure how you would go about doing that.

[image: example plot]

3 Comments

The above doesn't solve the issue: the second scale still lacks information about density, it still shows ..count.. values.
Sorry, you are right - you would have to change the scaling factor to convert from count to density. The relevant change would be ... sec_axis(~.*SCALE, ....
Yes, it is still an acceptable solution. I'm trying to do this with trans_new, but the formatter is applied at the end and only receives the final breaks.
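A minimal sketch of the conversion discussed in these comments, assuming the density of a bin is count / (n * binwidth), so that SCALE would be 1 / (n * binwidth). Since density is then a fixed multiple of count, a `labels` function that only receives the final breaks is enough to print both numbers at each tick. `density_count_label` is a hypothetical helper, not part of ggplot2, and `after_stat()` assumes ggplot2 3.3.0 or later (`..count..` is the older spelling).

```r
set.seed(1)
var <- rnorm(4000)
n <- length(var)
binwidth <- 2 * IQR(var) / n^(1 / 3)

# Hypothetical helper: returns a labeller that turns count breaks into
# "[density] count" strings, using density = count / (n * binwidth).
density_count_label <- function(n, binwidth) {
  function(breaks) {
    sprintf("[%.2f] %s", breaks / (n * binwidth), breaks)
  }
}

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(var = var), aes(x = var)) +
    geom_histogram(aes(y = after_stat(count)), binwidth = binwidth) +
    geom_density(aes(y = after_stat(count) * binwidth)) +
    scale_y_continuous(
      name = "[density] count",
      # both numbers on the primary axis
      labels = density_count_label(n, binwidth),
      # or density on a secondary axis, with SCALE = 1 / (n * binwidth)
      sec.axis = sec_axis(~ . / (n * binwidth), name = "density")
    )
  print(p)
}
```

Because the labeller is applied to whatever breaks ggplot2 picks, the breaks and bin counts do not need to be known in advance.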
You can achieve your desired output by creating a custom set of labels, and adding it to the plot:

    library(tidyverse)
    library(ggplot2)

    set.seed(1)
    var <- rnorm(400)
    bins <- .1

    df <- data.frame(yvals = seq(0, 20, 5),
                     labels = c("[0%]", "[10%]", "[20%]", "[30%]", "[40%]"))
    df <- df %>% tidyr::unite("custom_labels", labels, yvals, sep = " ", remove = TRUE)

    ggplot(data = data.frame(var = var), aes(x = var, y = ..count..)) +
      geom_histogram(aes(y = ..count..), binwidth = bins) +
      geom_density(aes(y = ..count.. * bins), color = "black", alpha = 0.7) +
      ylab("[density] count") +
      scale_y_continuous(breaks = seq(0, 20, 5), labels = df$custom_labels)

1 Comment

I can't assume the y breaks and y values in advance. I tried using graphics::hist beforehand, but I can't reproduce the same output as in ggplot2.
