I have written a (fairly naive) function to randomly select a date/time between two specified days:

    # set start and end dates to sample between
    day.start <- "2012/01/01"
    day.end   <- "2012/12/31"

    # define a random date/time selection function
    rand.day.time <- function(day.start, day.end, size) {
      # every calendar day in the range, inclusive
      dayseq <- seq.Date(as.Date(day.start), as.Date(day.end), by = "day")
      dayselect  <- sample(dayseq, size, replace = TRUE)
      # hours 0-23, the valid range for a 24-hour clock
      hourselect <- sample(0:23, size, replace = TRUE)
      minselect  <- sample(0:59, size, replace = TRUE)
      # build "YYYY-MM-DD H:M" strings and parse them
      as.POSIXlt(paste(dayselect, " ", hourselect, ":", minselect, sep = ""))
    }

Which results in:

    > rand.day.time(day.start,day.end,size=3)
    [1] "2012-02-07 21:42:00" "2012-09-02 07:27:00" "2012-06-15 01:13:00"

But this seems to slow down considerably as the sample size ramps up.

    # some benchmarking
    > system.time(rand.day.time(day.start,day.end,size=100000))
       user  system elapsed
       4.68    0.03    4.70
    > system.time(rand.day.time(day.start,day.end,size=200000))
       user  system elapsed
       9.42    0.06    9.49

Is anyone able to suggest how to do something like this in a more efficient manner?
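
One direction that might help, as a rough sketch rather than a definitive answer: sample whole-minute offsets numerically and convert them to date/times in one step, which avoids building and parsing a character string per sample. The function name `rand.day.time.fast` is hypothetical, and the sketch assumes that times in UTC and a `POSIXct` result (rather than local-time `POSIXlt`) are acceptable.

```r
# Hypothetical alternative (name is illustrative): draw random minute offsets
# instead of pasting and parsing strings.
# Assumptions: times are treated as UTC, and a POSIXct result is acceptable.
rand.day.time.fast <- function(day.start, day.end, size) {
  start <- as.Date(day.start)
  end   <- as.Date(day.end)
  # number of whole minutes in the inclusive range of days
  n.min <- (as.numeric(end - start) + 1) * 24 * 60
  # 0-based minute offsets from midnight of the first day, with replacement
  offset <- sample.int(n.min, size, replace = TRUE) - 1
  # seconds since the epoch at midnight UTC of the first day
  origin.sec <- as.numeric(start) * 86400
  as.POSIXct(origin.sec + offset * 60, origin = "1970-01-01", tz = "UTC")
}

x <- rand.day.time.fast("2012/01/01", "2012/12/31", size = 3)
```

Each minute in the range is equally likely, which should match sampling a day, an hour and a minute independently. If a `POSIXlt` object is needed, the result can be wrapped in `as.POSIXlt()`.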