I have a data.frame:

set.seed(1)
short.df <- data.frame(id = letters[1:10], name = LETTERS[1:10])

I want to replicate each row a number of times given by a vector whose length equals nrow(short.df):

lengths <- c(sample(10000,10,replace=F)) 

This takes too long for my real data size:

long.df <- do.call(rbind, lapply(seq_along(lengths), function(x) {
  # index both columns by x so that row x is repeated lengths[x] times
  data.frame(id = rep(short.df$id[x], lengths[x]),
             name = rep(short.df$name[x], lengths[x]))
}))

Any way to do it faster?

1 Answer

You can replicate the rows by using rep() in the i argument of [.data.frame.

long.df <- short.df[rep(1:nrow(short.df), lengths), ] 

Check:

identical(nrow(long.df), sum(lengths)) # [1] TRUE 
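
To see why this works, here is a small sketch on toy data. The names `toy`, `times`, `idx`, and `toy_long` are made up for illustration and are not from the question. When rep() is given a vector of counts, it repeats each element by the matching count, so the resulting row indices select each row that many times.

```r
# Toy data for illustration only; `toy` and `times` are hypothetical names.
toy <- data.frame(id = c("a", "b", "c"), name = c("A", "B", "C"))
times <- c(2L, 0L, 3L)

# rep() with a vector of counts repeats each index by the matching count.
idx <- rep(seq_len(nrow(toy)), times)
idx
# [1] 1 1 3 3 3

# Indexing rows with the repeated indices duplicates whole rows.
toy_long <- toy[idx, ]
nrow(toy_long) == sum(times)
# [1] TRUE
```

A count of 0 simply drops that row, which is the same behaviour you would get with the full-size lengths vector.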

The new row names may not be desirable, but those are easy to change.
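
One way to reset them, as a minimal sketch on toy data (`toy` and `toy_long` are hypothetical stand-ins for short.df and long.df): duplicated rows get suffixed row names, and assigning NULL restores the default sequence.

```r
# Self-contained toy example; `toy` is a hypothetical stand-in for short.df.
toy <- data.frame(id = c("a", "b"), name = c("A", "B"))
toy_long <- toy[rep(seq_len(nrow(toy)), c(2L, 1L)), ]
rownames(toy_long)
# [1] "1"   "1.1" "2"

# Assigning NULL resets row names to the default 1..n sequence.
rownames(toy_long) <- NULL
rownames(toy_long)
# [1] "1" "2" "3"
```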