1

Is it possible to vectorise the following function, (f)?

I have a vector x for which I want to maximise the output value of the function f by altering p.

But the function is quite slow as it is not vectorised in anyway and was wondering if there was a good way to do so. The idea is to parallelise this in the future, and also potentially use data.table to speed it up

my real data is significantly larger...so I'm providing a mock example....

# My mock data x <- data.frame(x=rep(c(rep(c(0.2,-0.2),4),0.2,0.2,-0.2,0.2),20)) # The function to optimise for f <- function(p,x){ # Generate columns before filling x$multiplier <- NA x$cumulative <- NA for(i in 1:nrow(x)){ # Going through each row systematically if(i==1){ # If first row do a slightly different set of commands x[i,'multiplier'] <- 1 * p x[i,'cumulative'] <- (x[i,'multiplier'] * x[i,'x']) + 1 } else { # For the rest of the rows carry out these commands x[i,'multiplier'] <- x[i-1,'cumulative'] * p x[i,'cumulative'] <- (x[i,'multiplier'] * x[i,'x']) + x[i-1,'cumulative'] } } # output the final row's output for the cumulative column as.numeric(x[nrow(x),'cumulative']) } # Checking the function works by putting in a test value of p = 0.5 f(0.5,x) # Now optimise the function between the interval of p between 0 and 1 optim.p <- optimise(f=f, interval=c(0,1),x, maximum=TRUE) # Viewing the output of optim.p optim.p 

1 Answer 1

10

(Edit - forgot the first part of the post I had written, putting it in now).

Your problem can be simplified by examining what your function f actually does. Since I'm lazy, I'm going to write x[i, 'multiplier'] as mi, x[i, 'cumulative'] as yi, and x[i, 'x'] as xi.

Let's look at your equation in f. We'll look at the case i > 1 first:

mi = yi-1 * p
yi = mi * xi + yi-1

Substitute m_i above:

yi = (yi-1 * p) * xi + yi-1 // let's factorise..
yi = yi-1 * (p * xi + 1)

This dispenses with the need to calculate the multipler column.

Now looking a little closer at your i == 1 case, we see that if we put y0 to 1, the following works out for all i = 1, ..., nrow(x):

yi = yi-1(pxi + 1) ---------- (1)

Looking at your function f, what you want to calculate is yn:

yn = yn-1(pxn + 1)

What happens if we substitute the formula for yn-1 in the above using (1)?

yn = yn-2(pxn-1 + 1)(pxn + 1)

Now we substitute in yn-2's formula in the above:

yn = yn-3(pxn-2 + 1)(pxn-1 + 1)(pxn + 1)

You get the pattern, right? We substitute all the way down to y1:

yn = y0(px1 + 1)(px2 + 1)...(pxn-1 + 1)(pxn + 1)

But remember, y0 is just 1. So, to calculate the value of f(x, p), we just do:

f(x, p) = (px1 + 1)(px2 + 1)...(pxn-1 + 1)(pxn + 1)

where n is nrow(x). That is, calculate p * x[i, 'x'] + 1 for each i and multiply them all together.


To multiply a vector of numbers together in R, you use prod. So, if x was just a vector:

f_version2 <- function(p, x) { return(prod(p * x + 1)) } 

Let's test it on a few things:

x <- rep(c(rep(c(0.2,-0.2),4),0.2,0.2,-0.2,0.2),20) > f(0.5, x) [1] 16.56635 > f_version2(0.5, x) [1] 16.56635 

In summary, sometimes you can achieve speedups simply by analysing the maths of the problem, as well as/opposed to the numerical implementation.

Sign up to request clarification or add additional context in comments.

1 Comment

+6.02*10^23 rating for that last sentence alone!

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.