8

In Excel, it's pretty easy to fit a logarithmic trend line to a given set of data. Just click "Add Trendline" and then select "Logarithmic." Switching to R for more power, I am a bit lost as to which function one should use to generate this.

To generate the graph, I used ggplot2 with the following code.

ggplot(data, aes(horizon, success)) + geom_line() + geom_area(alpha=0.3)+ stat_smooth(method='loess') 

But that code does local polynomial regression (loess) fitting, which is based on averaging out numerous small linear regressions. My question is whether there is a log trend line in R similar to the one used in Excel.

An alternative I am looking for is to get a log equation of the form y = c*ln(x) + b; is there a coef() function to get 'c' and 'b'?

Let my data be:

c(0.599885189,0.588404133,0.577784156,0.567164179,0.556257176, 0.545350172,0.535112897,0.52449292,0.51540375,0.507271336,0.499904325, 0.498851894,0.498851894,0.497321087,0.4964600,0.495885955,0.494068121, 0.492154612,0.490145427,0.486892461,0.482395714,0.477229238,0.471010333) 

The above data are the y-values, while the x-values are simply the integers 1:length(y). In Excel, I can simply plot this and add a logarithmic trend line, and the result looks like this:

enter image description here

With black being the log. In R, how would one do this with the above dataset?

2
  • 1
    You can take the log of data using the log() function, and fit smoothers using the loess() function. Make a plot of the data using plot() and add (smoothed) lines to it using lines(). Commented Oct 14, 2012 at 4:23
  • Is it possible to just get the equation instead? The equation in Excel is y = c*ln(x) + b. Commented Oct 14, 2012 at 4:44

4 Answers

11

I prefer to use base graphics instead of ggplot2:

    # some data with a linear model
    x <- 1:20
    set.seed(1)
    y <- 3*log(x)+5+rnorm(20)

    # plot data
    plot(y~x)

    # fit log model
    fit <- lm(y~log(x))

    # look at result and statistics
    summary(fit)

    # extract coefficients only
    coef(fit)

    # plot fit with confidence band
    matlines(x=seq(from=1,to=20,length.out=1000),
             y=predict(fit,newdata=list(x=seq(from=1,to=20,length.out=1000)),
                       interval="confidence"))

enter image description here

    # some data with a non-linear model
    set.seed(1)
    y <- log(0.1*x)+rnorm(20,sd=0.1)

    # plot data
    plot(y~x)

    # fit log model
    fit <- nls(y~log(a*x),start=list(a=0.2))

    # look at result and statistics
    summary(fit)

    # plot fit
    lines(seq(from=1,to=20,length.out=1000),
          predict(fit,newdata=list(x=seq(from=1,to=20,length.out=1000))))
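A side note on the second model, offered as a sketch rather than as part of the original answer: since log(a*x) = log(a) + log(x), it is a log curve whose slope on log(x) is fixed at 1, with log(a) playing the role of the intercept b. Under that reading, the nls() estimate of a should agree with exp() of the intercept from an lm() fit that uses log(x) as an offset. The names fit_nls, fit_off and a_from_lm are introduced here for illustration only.

```r
# Same simulated data as in the nls() example above
x <- 1:20
set.seed(1)
y <- log(0.1 * x) + rnorm(20, sd = 0.1)

# Non-linear fit: one free parameter a
fit_nls <- nls(y ~ log(a * x), start = list(a = 0.2))

# Equivalent linear fit: intercept only, with log(x) entering at slope 1
fit_off <- lm(y ~ 1 + offset(log(x)))
a_from_lm <- exp(unname(coef(fit_off)[1]))

# The two estimates of a should match up to nls() convergence tolerance
print(c(nls = unname(coef(fit_nls)["a"]), lm_offset = a_from_lm))
```

If the slope on log(x) should be estimated as well, as in Excel's y = c*ln(x) + b, the lm(y ~ log(x)) fit from the first example is the one to use.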

8

You can easily specify an alternative smoothing method (such as lm(), linear least-squares fitting) and an alternative formula:

    library(ggplot2)
    g0 <- ggplot(dat, aes(horizon, success)) + geom_line() + geom_area(alpha=0.3)
    g0 + stat_smooth(method="lm",formula=y~log(x),fill="red")

The confidence bands are automatically included: I changed the color to make them visible since they're very narrow. You can use se=FALSE in stat_smooth to turn them off.

enter image description here

The other answer shows you how to get the coefficients:

coef(lm(success~log(horizon),data=dat)) 
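As a minimal sketch of how this maps onto the y = c*ln(x) + b form from the question: the intercept is b and the coefficient on log(horizon) is c. The data frame dat is assumed here to be built from the y-values in the question with horizon = 1, 2, ..., n; the names b and c_slope are illustrative.

```r
# y-values copied from the question; x is simply 1..n
success <- c(0.599885189, 0.588404133, 0.577784156, 0.567164179, 0.556257176,
             0.545350172, 0.535112897, 0.52449292, 0.51540375, 0.507271336,
             0.499904325, 0.498851894, 0.498851894, 0.497321087, 0.4964600,
             0.495885955, 0.494068121, 0.492154612, 0.490145427, 0.486892461,
             0.482395714, 0.477229238, 0.471010333)
dat <- data.frame(horizon = seq_along(success), success = success)

# Least-squares fit of success = c * ln(horizon) + b
fit <- lm(success ~ log(horizon), data = dat)
cf <- coef(fit)

b <- unname(cf["(Intercept)"])         # constant term
c_slope <- unname(cf["log(horizon)"])  # multiplier on ln(x)

# Print the fitted equation in the same form Excel displays
cat(sprintf("y = %.4f * ln(x) + %.4f\n", c_slope, b))
```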

I can imagine you might next want to add the equation to the graph: see Adding Regression Line Equation and R2 on graph
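One possible way to do that, sketched here under assumptions rather than taken from the linked question: build the label from the fitted coefficients with sprintf() and place it with annotate(). The data below are made up for illustration, eq_label is a hypothetical name, and the plotting part is skipped if ggplot2 is not installed.

```r
# Made-up data that roughly follow a log trend (illustration only)
dat <- data.frame(horizon = 1:20)
set.seed(1)
dat$success <- 0.6 - 0.04 * log(dat$horizon) + rnorm(20, sd = 0.005)

# Fit the log model and build an Excel-style label
fit <- lm(success ~ log(horizon), data = dat)
cf <- coef(fit)
eq_label <- sprintf("y = %.3f * ln(x) + %.3f,  R^2 = %.3f",
                    cf[2], cf[1], summary(fit)$r.squared)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(horizon, success)) +
    geom_point() +
    stat_smooth(method = "lm", formula = y ~ log(x), se = FALSE) +
    # x = Inf, y = Inf pins the label to the top-right corner of the panel
    annotate("text", x = Inf, y = Inf, label = eq_label, hjust = 1.1, vjust = 2)
  print(p)
}
```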

2 Comments

Is there a reason you used geom_area in addition to geom_line? I'm not sure that it makes the plot easy to understand.
I reproduced it from the OP's example: I assumed they had it in there because they wanted it, but I guess it's possible that they just copied it from an example themselves ...
1

I'm pretty sure a simple + scale_y_log10() would get you what you want. ggplot2 stats are calculated after scale transformations, so the loess() fit would then be computed on the log-transformed data.

1

I've just written a blog post here that describes how to match Excel's logarithmic curve fitting exactly. The nub of the approach centers around the lm() function:

    # Set x and data.to.fit to the independent and dependent variables
    data.to.fit <- c(0.5998,0.5884,0.5777,0.5671,0.5562,0.5453,0.5351,0.524,
                     0.515,0.5072,0.4999,0.4988,0.4988,0.4973,0.49,0.4958,
                     0.4940,0.4921,0.4901,0.4868,0.4823,0.4772,0.4710)
    x <- c(seq(1, length(data.to.fit)))
    data.set <- data.frame(x, data.to.fit)

    # Perform a logarithmic fit to the data set
    log.fit <- lm(data.to.fit~log(x), data=data.set)

    # Print out the intercept, log(x) parameters, R-squared values, etc.
    summary(log.fit)

    # Plot the original data set
    plot(data.set)

    # Add the log.fit line with confidence intervals
    matlines(predict(log.fit, data.frame(x=x), interval="confidence"))

Hope that helps.
