Vectorising a for loop

Question

Is it possible to vectorise the following function, (f)?

I have a vector x for which I want to maximise the output value of the function f by altering p.

But the function is quite slow as it is not vectorised in anyway and was wondering if there was a good way to do so. The idea is to parallelise this in the future, and also potentially use data.table to speed it up

my real data is significantly larger...so I'm providing a mock example....

# My mock data x <- data.frame(x=rep(c(rep(c(0.2,-0.2),4),0.2,0.2,-0.2,0.2),20)) # The function to optimise for f <- function(p,x){ # Generate columns before filling x$multiplier <- NA x$cumulative <- NA for(i in 1:nrow(x)){ # Going through each row systematically if(i==1){ # If first row do a slightly different set of commands x[i,'multiplier'] <- 1 * p x[i,'cumulative'] <- (x[i,'multiplier'] * x[i,'x']) + 1 } else { # For the rest of the rows carry out these commands x[i,'multiplier'] <- x[i-1,'cumulative'] * p x[i,'cumulative'] <- (x[i,'multiplier'] * x[i,'x']) + x[i-1,'cumulative'] } } # output the final row's output for the cumulative column as.numeric(x[nrow(x),'cumulative']) } # Checking the function works by putting in a test value of p = 0.5 f(0.5,x) # Now optimise the function between the interval of p between 0 and 1 optim.p <- optimise(f=f, interval=c(0,1),x, maximum=TRUE) # Viewing the output of optim.p optim.p

mathematical.coffee · Accepted Answer · 2012-11-05 09:02:41Z

(Edit - forgot the first part of the post I had written, putting it in now).

Your problem can be simplified by examining what your function f actually does. Since I'm lazy, I'm going to write x[i, 'multiplier'] as m_i, x[i, 'cumulative'] as y_i, and x[i, 'x'] as x_i.

Let's look at your equation in f. We'll look at the case i > 1 first:

m_i = y_i-1 * p
y_i = m_i * x_i + y_i-1

Substitute m_i above:

y_i = (y_i-1 * p) * x_i + y_i-1 // let's factorise..
y_i = y_i-1 * (p * x_i + 1)

This dispenses with the need to calculate the multipler column.

Now looking a little closer at your i == 1 case, we see that if we put y₀ to 1, the following works out for all i = 1, ..., nrow(x):

y_i = y_i-1(px_i + 1) ---------- (1)

Looking at your function f, what you want to calculate is y_n:

y_n = y_n-1(px_n + 1)

What happens if we substitute the formula for y_n-1 in the above using (1)?

y_n = y_n-2(px_n-1 + 1)(px_n + 1)

Now we substitute in y_n-2's formula in the above:

y_n = y_n-3(px_n-2 + 1)(px_n-1 + 1)(px_n + 1)

You get the pattern, right? We substitute all the way down to y₁:

y_n = y₀(px₁ + 1)(px₂ + 1)...(px_n-1 + 1)(px_n + 1)

But remember, y₀ is just 1. So, to calculate the value of f(x, p), we just do:

f(x, p) = (px₁ + 1)(px₂ + 1)...(px_n-1 + 1)(px_n + 1)

where n is nrow(x). That is, calculate p * x[i, 'x'] + 1 for each i and multiply them all together.

To multiply a vector of numbers together in R, you use prod. So, if x was just a vector:

f_version2 <- function(p, x) { return(prod(p * x + 1)) }

Let's test it on a few things:

x <- rep(c(rep(c(0.2,-0.2),4),0.2,0.2,-0.2,0.2),20) > f(0.5, x) [1] 16.56635 > f_version2(0.5, x) [1] 16.56635

In summary, sometimes you can achieve speedups simply by analysing the maths of the problem, as well as/opposed to the numerical implementation.

Collectives™ on Stack Overflow

Vectorising a for loop

1 Answer 1

1 Comment

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Related