
I have two data frames, each of dimension 24,523 × 3,468, and I want to make a scatter plot of them (data frame 1 on the x-axis and data frame 2 on the y-axis). Then I want to add a Loess line.

I can simply use plot() to get the scatter plot, but I do not know how to add a Loess line to it. Furthermore, I found that if the data for each axis is a single vector rather than a data frame, this can be done directly with stat_smooth() from the ggplot2 package.

My questions are: 1) How do I get a scatter plot of two data frames using ggplot()? 2) How do I add a Loess line to a scatter plot generated from two data frames?

This is the scatter plot I get using:

plot(as.matrix(spatial_data_glio_df_intersection_genes), as.matrix(estimated_all_gene_read_counts_spatial), xlab = "true_gene_read_counts", ylab = "estimated_gene_read_counts") 

The data can be accessed using the link data.

5 Comments
  • @csgroen It does not, unfortunately. Your suggestion deals with plotting the data of two single vectors, right? While my data are two data frames. Commented Jul 12, 2022 at 10:18
  • Please post some minimal data: stackoverflow.com/questions/5963269/… Commented Jul 12, 2022 at 11:42
  • @harre I provided the link to access the data. Can't you access it? Commented Jul 12, 2022 at 11:46
  • Each file is 40MB. A more minimal example would be appreciated Commented Jul 12, 2022 at 11:53
  • @harre Ow, that is what you meant. I will try to provide one. Thank you. Commented Jul 12, 2022 at 12:11

1 Answer

Just linearize the two data frames with as.vector(). I've made a minimal reproducible example using random data. The first plot corresponds more or less to what you have currently; the second one hopefully corresponds to the desired output:

library(ggplot2)

df1 <- matrix(rnorm(1000), nrow = 100)
df2 <- matrix(rnorm(1000), nrow = 100)

plot(df1, df2, xlab = "true_gene_read_counts", ylab = "estimated_gene_read_counts")

joint_df <- data.frame(df1 = as.vector(df1), df2 = as.vector(df2))

ggplot(joint_df, aes(df1, df2)) +
  geom_point() +
  geom_smooth(method = "loess") +
  labs(x = "true_gene_read_counts", y = "estimated_gene_read_counts") +
  theme_linedraw()
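One caveat when moving from random matrices to real data frames: as.vector() on its own does not turn a data frame into a numeric vector, so it helps to go through as.matrix() first, as the plot() call in the question already does. Also, as far as I know loess needs a lot of memory on large inputs, so with tens of millions of points it is probably more practical to fit and draw on a random subsample. The sketch below shows one way to do both. The names true_df, est_df, true_counts, estimated_counts and the subsample size of 2000 are made up for illustration and are not taken from the question's data.

```r
library(ggplot2)

set.seed(1)  # reproducible toy data

# Hypothetical stand-ins for the two real data frames (same shape, all numeric)
true_df <- as.data.frame(matrix(rnorm(5000), nrow = 100))
est_df  <- as.data.frame(matrix(rnorm(5000), nrow = 100))

# Convert each data frame to a matrix first, then flatten it column by column
joint_df <- data.frame(
  true_counts      = as.vector(as.matrix(true_df)),
  estimated_counts = as.vector(as.matrix(est_df))
)

# Assumption: a random subsample is representative enough for the smooth.
# The size 2000 is arbitrary; adjust it to what your machine can handle.
n_keep <- min(2000, nrow(joint_df))
sub_df <- joint_df[sample(nrow(joint_df), n_keep), ]

p <- ggplot(sub_df, aes(true_counts, estimated_counts)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "loess", formula = y ~ x) +
  labs(x = "true_gene_read_counts", y = "estimated_gene_read_counts")
print(p)
```

Both data frames must have the same dimensions and the same row and column order, otherwise the flattened vectors will pair up values from different genes or spots.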


1 Comment

I see, so the idea is to change the data frame into a vector. I got it. Yes, this is what I expected. I will try it with my data to see how it turns out. Thank you very much.
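For the base plot() route mentioned in the question, a Loess curve can also be drawn without ggplot2 by fitting loess() and adding the predictions with lines(). This is only a rough sketch on made-up vectors: x and y stand in for the two flattened data frames, and the grid size of 200 is arbitrary.

```r
set.seed(1)  # reproducible toy data

# Hypothetical toy vectors standing in for the two flattened data frames
x <- rnorm(500)
y <- 0.5 * x + rnorm(500, sd = 0.5)

plot(x, y, xlab = "true_gene_read_counts", ylab = "estimated_gene_read_counts")

# Fit a loess curve, then predict on an evenly spaced grid inside the range of x
fit    <- loess(y ~ x)
x_grid <- seq(min(x), max(x), length.out = 200)
y_hat  <- predict(fit, newdata = data.frame(x = x_grid))

# Overlay the fitted curve on the existing scatter plot
lines(x_grid, y_hat, col = "red", lwd = 2)
```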
