Replicate a procedure: generate m dataframes, each one with a variable that includes random values, and append them, in R

Question

What is the most efficient way to generate m dataframes, each with a variable holding random values, and append them into one?

Here's an example:

    df <- data.frame(id = 1:10, var = sample(1:500, 10, replace = TRUE))

    id var
     1  65
     2 123
     3  42
     4  16
     5 463
     6 129
     7 367
     8  99
     9 489
    10  63

If m = 2, two dataframes should be generated and appended, having:

    id var
     1  65
     2 123
     3  42
     4  16
     5 463
     6 129
     7 367
     8  99
     9 489
    10  63
     1 321
     2 410
     3  78
     4 166
     5 320
     6 478
     7 231
     8 100
     9 105
    10 206

Ronak Shah · Accepted Answer · 2024-05-10 11:35:15Z

Put the dataframe to be generated in a function:

    fun <- function() {
      df <- data.frame(id = 1:10, var = sample(1:500, 10, replace = TRUE))
      df
    }

Then there are multiple ways to call this function m times and bind.

Base R using replicate

    m <- 2
    do.call(rbind, replicate(m, fun(), simplify = FALSE))

Base R using lapply

    do.call(rbind, lapply(seq_len(m), \(x) fun()))

purrr::map_df

purrr::map_df(seq_len(m), ~fun())
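For reference, here is a minimal end-to-end sketch of the base R route. The helper name `make_df()` and its parameters `n` and `max_val` are hypothetical (they are not part of the answer above), and the seed is set only so the illustration is reproducible:

```r
# Hypothetical helper: one data frame with n ids and n random draws from 1..max_val
make_df <- function(n = 10, max_val = 500) {
  data.frame(id = seq_len(n), var = sample(max_val, n, replace = TRUE))
}

set.seed(42)  # only for reproducibility of this illustration
m <- 3

# Generate m independent data frames, then stack them row-wise
pieces <- replicate(m, make_df(), simplify = FALSE)
out <- do.call(rbind, pieces)

dim(out)        # 30 rows, 2 columns
table(out$id)   # each id appears m times
```

Passing `simplify = FALSE` keeps the result a list of data frames, which is what `do.call(rbind, ...)` expects.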

jblood94 · Accepted Answer · 2024-05-10 12:51:56Z

The most efficient way to create a data.frame with those characteristics for an arbitrary m is to avoid appending altogether and build a single data.frame from the start:

    df <- data.frame(id = 1:10, var = sample(500, 10 * m, replace = TRUE))

Here `id = 1:10` is recycled to match the 10 * m values drawn for `var`.
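If it helps to see the single-call idea spelled out, the sketch below writes the recycling explicitly and adds a column recording which block of rows each value belongs to. The `batch` column and the variable `n` are assumptions added for illustration; the answer itself hard-codes 10 rows and has no such column.

```r
m <- 3
n <- 10  # rows per block (assumed; the original hard-codes 10)

set.seed(1)  # only for reproducibility of this illustration
df <- data.frame(
  batch = rep(seq_len(m), each = n),   # illustrative: which block a row belongs to
  id    = rep(seq_len(n), times = m),  # ids 1..n repeated m times
  var   = sample(500, n * m, replace = TRUE)
)

nrow(df)  # n * m rows
```

All random values come from one `sample()` call, so no intermediate data frames are created or bound together.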
