I am trying to create a linear correlation graph between two variables using ggplot2:

dput(sum)
structure(list(Date = structure(c(15218, 15248, 15279, 15309, 15340, 15371, 15400, 15431, 15461, 15492, 15522, 15553), class = "Date"),
    Teams = c(87, 142, 173, 85, 76, 76, 93, 49, 169, 139, 60, 120),
    Scores = c("67101651", "62214988", "63183320", "66750198", "61483322", "67546775", "75290893", "60713372", "77879142", "70290302", "83201853", "83837301")),
    .Names = c("Date", "Teams", "Scores"), row.names = c(NA, 12L), class = "data.frame")

This is my command:

ggplot(sum, aes(x = Scores, y = Teams, group=1)) + geom_smooth(method = "lm", se=TRUE, size=2, formula = lm(Teams ~ Scores)) 

I get this error:

Error in eval(expr, envir, enclos) : object 'Teams' not found 

Any ideas?

  • For any future reader of this thread: naming a data frame after one of the most-used base R functions is not a good idea. Note that df is also a base R function (although used less often than sum). Commented Jan 24, 2021 at 11:54
  • Does this answer your question? Adding a regression line on a ggplot Commented Jan 24, 2021 at 12:02

2 Answers

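The formula argument of geom_smooth() takes a formula written in terms of the aesthetics x and y, not a call to lm() on the original column names, which is why Teams is not found. If you want to specify the formula for a linear model explicitly, use y ~ poly(x, 1) (equivalent to y ~ x). You don't need to set the formula argument at all if you want a simple linear regression (it's the default for method = "lm"):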

ggplot(sum, aes(x = Scores, y = Teams, group = 1)) + geom_smooth(method = "lm", formula = y ~ poly(x, 1), se = TRUE, size = 2) 

I would also recommend converting Scores to numeric values (as.numeric(Scores)) if you don't want this variable to be treated as categorical; in your data it is stored as character strings. This changes the regression line.

Scores as a categorical variable:

[image: categorical]

Scores as a numeric variable:

[image: numeric]
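As a minimal sketch of the numeric variant, the conversion can be done once in the data frame before plotting. The name scores_df is an assumption (chosen to avoid masking base sum()), the Date column is omitted for brevity, and the size argument is left at its default:

```r
library(ggplot2)

# Data from the question, stored under a hypothetical name (scores_df)
# so that the base function sum() is not masked. Date is left out here.
scores_df <- data.frame(
  Teams  = c(87, 142, 173, 85, 76, 76, 93, 49, 169, 139, 60, 120),
  Scores = c("67101651", "62214988", "63183320", "66750198", "61483322",
             "67546775", "75290893", "60713372", "77879142", "70290302",
             "83201853", "83837301"),
  stringsAsFactors = FALSE
)

# Convert the character column to numeric so ggplot2 uses a continuous x scale.
scores_df$Scores <- as.numeric(scores_df$Scores)

# The smoothing formula refers to the aesthetics x and y, not to column names.
p <- ggplot(scores_df, aes(x = Scores, y = Teams)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)

print(p)
```

With Scores numeric, group = 1 is no longer needed, because a continuous x axis does not split the data into one group per value.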

2 Comments

Is there an easy way to print the R-squared value as a legend?
An easy way would be to print it in the title. Just add opts(title = bquote(R^2 ~ ":" ~ .(summary(lm(Teams ~ as.numeric(Scores), sum))$r.squared))) to the plot. (opts() has since been removed from ggplot2; labs(title = ...) is the current equivalent.)
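A hedged sketch of the title approach for current ggplot2, using labs() rather than opts(). The names scores_df, fit and r2 are assumptions for illustration, Scores is entered directly as numeric, and rounding to three digits is an arbitrary choice:

```r
library(ggplot2)

# Data from the question under a hypothetical name; Scores entered as numeric.
scores_df <- data.frame(
  Teams  = c(87, 142, 173, 85, 76, 76, 93, 49, 169, 139, 60, 120),
  Scores = c(67101651, 62214988, 63183320, 66750198, 61483322, 67546775,
             75290893, 60713372, 77879142, 70290302, 83201853, 83837301)
)

# Fit the same simple linear model that geom_smooth(method = "lm") draws,
# then extract its R-squared.
fit <- lm(Teams ~ Scores, data = scores_df)
r2  <- summary(fit)$r.squared

# bquote() substitutes the rounded value into a plotmath expression.
p <- ggplot(scores_df, aes(x = Scores, y = Teams)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(title = bquote(R^2 ~ ":" ~ .(round(r2, 3))))

print(p)
```

Fitting the model once outside the plot keeps the R-squared value available for reuse, for example in a caption or an annotation.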

Here's another option using stat_cor from the ggpubr package (load it with library(ggpubr)). This code will plot your points and display the correlation coefficient and p-value. You can change "pearson" to "spearman" if you have non-normal data.

ggplot(sum, aes(x = as.numeric(Scores), y = Teams, group = 1)) + geom_point() + geom_smooth(method = "lm", se = TRUE, size = 2) + stat_cor(method = "pearson", cor.coef.name = "r", vjust = 1, size = 4) 
