I would like to generate a correlation plot that pairs my "true" variable (Salary) with each of the rest (the People variables). I am pretty sure this has been brought up somewhere, but the solutions I have found do not work for me.

    library(ggplot2)

    set.seed(0)
    dt = data.frame(matrix(rnorm(120, 100, 5), ncol = 6))
    colnames(dt) = c('Salary', paste0('People', 1:5))

    ggplot(dt, aes(x=Salary, y=value)) +
      geom_point() +
      facet_grid(.~Salary)

This gives the error: Error: Column y must be a 1d atomic vector or a list.

I know one of the solutions is writing out all of the variables in y - which I am trying to avoid because my true data has 15 columns.

Also, I am not entirely sure what "value" and "variable" refer to in the ggplot call. I have seen them a lot in example code.

Any suggestion is appreciated!

  • y = value has no meaning as there is no value column in your dt. Are you trying to plot salary amounts against number of people in different groups? Commented Mar 1, 2019 at 4:03
  • What I mean is: would you like Salary vs. People1 | Salary vs. People2 ... and so on? Commented Mar 1, 2019 at 4:08

2 Answers

You want to convert your data from wide to long format, using tidyr::gather() for example. Here is a solution using packages from the tidyverse:

    library(tidyr)
    library(ggplot2)
    theme_set(theme_bw(base_size = 14))

    set.seed(0)
    dt = data.frame(matrix(rnorm(120, 100, 5), ncol = 6))
    colnames(dt) = c('Salary', paste0('People', 1:5))

    ### convert data frame from wide to long format
    dt_long <- gather(dt, key, value, -Salary)
    head(dt_long)
    #>      Salary     key     value
    #> 1 106.31477 People1  98.87866
    #> 2  98.36883 People1 101.88698
    #> 3 106.64900 People1 100.66668
    #> 4 106.36215 People1 104.02095
    #> 5 102.07321 People1  99.71447
    #> 6  92.30025 People1 102.51804

    ### plot
    ggplot(dt_long, aes(x = Salary, y = value)) +
      geom_point() +
      facet_grid(. ~ key)

    ### if you want to add regression lines
    library(ggpmisc)

    # define regression formula
    formula1 <- y ~ x

    ggplot(dt_long, aes(x = Salary, y = value)) +
      geom_point() +
      facet_grid(. ~ key) +
      geom_smooth(method = 'lm', se = TRUE) +
      stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~")),
                   label.x.npc = "left", label.y.npc = "top",
                   formula = formula1, parse = TRUE, size = 3) +
      coord_equal()

    ### if you also want ggpairs() from the GGally package
    library(GGally)
    ggpairs(dt)

Created on 2019-02-28 by the reprex package (v0.2.1.9000)
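As a side note, tidyr 1.0.0 and later mark gather() as superseded in favour of pivot_longer(). The sketch below shows what the same reshaping might look like with that function, assuming a recent tidyr is installed. It also speaks to the question about "value" and "variable": they are not special to ggplot2, just column names of the long data frame (reshape2::melt(), for instance, names its output columns `variable` and `value` by default). Here the names `key` and `value` are chosen explicitly through names_to and values_to.

```r
library(tidyr)
library(ggplot2)

set.seed(0)
dt <- data.frame(matrix(rnorm(120, 100, 5), ncol = 6))
colnames(dt) <- c("Salary", paste0("People", 1:5))

# Keep Salary as an identifier column; fold every other column into
# a name column ("key") and a value column ("value").
dt_long <- pivot_longer(dt, cols = -Salary,
                        names_to = "key", values_to = "value")

# `value` and `key` now exist as columns, so aes() and facet_grid() can use them.
p <- ggplot(dt_long, aes(x = Salary, y = value)) +
  geom_point() +
  facet_grid(. ~ key)
p
```

Because the selection is written as -Salary, the same call should work unchanged for a data frame with 15 People columns.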


You need to stack() your data first; that is probably what you have seen in other examples.

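    # stack() returns a long data frame with columns `values` and `ind`;
    # after renaming, `Salary` holds the original column names
    dt <- setNames(stack(dt), c("value", "Salary"))

    library(ggplot2)
    ggplot(dt, aes(x=Salary, y=value)) +
      geom_point() +
      facet_grid(.~Salary)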

This yields a faceted scatter plot (image not included).
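Note that stack(dt) stacks every column, Salary included, so the x-axis in the code above shows column names rather than salaries. If the goal is Salary on the x-axis against each People column, a base-R variant might look like the following sketch. The name `long` is only illustrative, and `values` and `ind` are the default column names that stack() produces.

```r
library(ggplot2)

set.seed(0)
dt <- data.frame(matrix(rnorm(120, 100, 5), ncol = 6))
colnames(dt) <- c("Salary", paste0("People", 1:5))

# Stack only the People columns (dt[-1] drops Salary) and repeat
# Salary once per stacked column so the rows stay aligned.
long <- data.frame(
  Salary = rep(dt$Salary, times = ncol(dt) - 1),
  stack(dt[-1])
)

# `values` holds the People measurements, `ind` the originating column name.
p <- ggplot(long, aes(x = Salary, y = values)) +
  geom_point() +
  facet_grid(. ~ ind)
p
```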

2 Comments

Thanks Jay, that's really helpful. I have another quick question: is it possible to arrange the graphs over several rows? For instance, for the graph above, display only two graphs per row: two in the first row, two in the next, and the last graph on its own.
Yes, you could take a look at grid.arrange, as described in this answer.
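Besides grid.arrange, ggplot2's own facet_wrap() can lay panels out over several rows. The sketch below assumes the long-format data from the tidyr approach (columns `Salary`, `key`, `value`) and asks for two panels per row, which with five panels should give rows of two, two, and one.

```r
library(tidyr)
library(ggplot2)

set.seed(0)
dt <- data.frame(matrix(rnorm(120, 100, 5), ncol = 6))
colnames(dt) <- c("Salary", paste0("People", 1:5))

# Long format: one row per (Salary, People column) pair.
dt_long <- pivot_longer(dt, cols = -Salary,
                        names_to = "key", values_to = "value")

# facet_wrap() wraps the panels; ncol = 2 puts two panels in each row.
p <- ggplot(dt_long, aes(x = Salary, y = value)) +
  geom_point() +
  facet_wrap(~ key, ncol = 2)
p
```

This keeps everything in a single plot object with shared axes, whereas grid.arrange combines separately built plots.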
