I am using R to create size frequency histograms for diseased and healthy individuals with fitted normal distribution lines. I have 2 issues that I'm seeking advice on.
1. How do I create a histogram from aggregated data? The example table below has the summarized number of diseased and healthy individuals at each size.
dput(data)
structure(list(Size = c(25L, 28L, 31L, 45L, 60L), diseased = c(0L, 22L, 10L, 5L, 2L), healthy = c(55L, 40L, 15L, 7L, 2L)), .Names = c("Size", "diseased", "healthy"), class = "data.frame", row.names = c(NA, -5L))
2. How do I overlay both histograms in one figure with fitted normal distribution lines?
I have tried the following code for the aggregated data: ggplot(data, aes(x = Size, y = diseased)) + geom_bar(stat = 'identity'). It works well, but I can't figure out how to add the histogram for the healthy individuals.
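One possible way to get both groups into a single figure is to stack the two count columns into a long table and map the group to the fill colour. This is only a sketch, not a definitive answer: the names `long`, `status`, and `count` are made up for illustration, and the plot is drawn only if ggplot2 is installed.

```r
# Aggregated counts, copied from the dput() output in the question.
data <- data.frame(
  Size     = c(25L, 28L, 31L, 45L, 60L),
  diseased = c(0L, 22L, 10L, 5L, 2L),
  healthy  = c(55L, 40L, 15L, 7L, 2L)
)

# Stack the two count columns: one row per Size/status pair.
# `long`, `status`, and `count` are illustrative names, not from the question.
long <- data.frame(
  Size   = rep(data$Size, times = 2),
  status = rep(c("diseased", "healthy"), each = nrow(data)),
  count  = c(data$diseased, data$healthy)
)

# Draw both groups as semi-transparent bars on the same axes.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = Size, y = count, fill = status)) +
    geom_col(position = "identity", alpha = 0.5)
  print(p)
}
```

With position = "identity" the bars are drawn on top of each other rather than stacked, so the transparency is what keeps both groups visible.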
I have also tried using the following code to revert the summarized data (called "data") to the original raw format: raw <- data[rep(1:data, times=data$diseased), "Size", drop=FALSE]
I get the following error message: Error in rep(1:data, times=data$diseased) : invalid 'times' argument. From previous comments, it appears that the rep function can't handle "0".
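For what it is worth, rep() does accept zeros in times (a zero just drops that element), so the zero count may not be the culprit. A more likely cause, as far as I can tell, is 1:data, which is not a sequence of row indices. A minimal sketch of the expansion, assuming the row indices should come from seq_len(nrow(data)); the helper name `expand_counts` is hypothetical:

```r
# Aggregated counts, copied from the dput() output in the question.
data <- data.frame(
  Size     = c(25L, 28L, 31L, 45L, 60L),
  diseased = c(0L, 22L, 10L, 5L, 2L),
  healthy  = c(55L, 40L, 15L, 7L, 2L)
)

# Hypothetical helper: repeat each row index as many times as its count.
# A count of 0 simply leaves that size out of the result.
expand_counts <- function(df, count_col) {
  idx <- rep(seq_len(nrow(df)), times = df[[count_col]])
  df[idx, "Size", drop = FALSE]
}

raw_diseased <- expand_counts(data, "diseased")
raw_healthy  <- expand_counts(data, "healthy")

# Mean and standard deviation of the expanded sizes, i.e. the parameters
# a fitted normal curve for each group would use.
fit_diseased <- c(mean = mean(raw_diseased$Size), sd = sd(raw_diseased$Size))
fit_healthy  <- c(mean = mean(raw_healthy$Size),  sd = sd(raw_healthy$Size))
```

The expanded tables can be passed to a regular histogram, and dnorm() evaluated with these parameters gives the corresponding normal density; it would need to be scaled by the group size and bin width to sit on a count axis.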

dput(data) or dput(head(data))? Also, how can your columns have different numbers of rows? Size doesn't match that of disease and healthy. And the OP has given a data.frame as input in ggplot but taken the trouble to provide separate data...