Convert list of vectors to data frame

Question

I'm trying to convert a list of vectors (a multidimensional array essentially) into a data frame, but every time I try I'm getting unexpected results.

My aim is to instantiate a blank list, populate it in a for loop with vectors containing information about that iteration of the loop, then convert it into a data frame after it's finished.

> vectorList <- list()
> for(i in 1:5){
+   vectorList[[i]] <- c("number" = i, "square root" = sqrt(i))
+ }
> vectorList

Outputs:

[[1]]
     number square root
          1           1

[[2]]
     number square root
   2.000000    1.414214

[[3]]
     number square root
   3.000000    1.732051

[[4]]
     number square root
          4           2

[[5]]
     number square root
   5.000000    2.236068

Now I want this to become a data frame with 5 observations of 2 variables, but trying to create a data frame from 'vectorList'

numbers <- data.frame(vectorList)

results in 2 observations of 5 variables.

Weirdly, it won't even be coerced with reshape2 (which I know would be an awful workaround, but I tried).

Anyone got any insight?

Just a general note about your approach: you should not grow lists like this inside a for loop, if you can avoid it. When you add something to the end of a list, R has to copy the whole list. This is fine for small cases, but if your list is big (and it's getting bigger and bigger, in your case) this can be quite inefficient.
– Taylor H, Commented Apr 27, 2017 at 15:55
For your data construction, you could have used lapply like this: vectorList <- lapply(1:5, function(x) c(x, sqrt(x))).
– lmo, Commented Jul 6, 2017 at 16:55
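
The two comments above can be combined into a minimal sketch: either preallocate the list to its final length before the loop, or let lapply build it in one call. Unlike the one-liner in the comment, this version keeps the element names so the result matches the question's vectorList. The names n, preallocated and viaLapply are placeholders chosen for illustration only.

```r
n <- 5

# Option 1: preallocate the list so the loop fills existing slots
preallocated <- vector("list", n)
for (i in seq_len(n)) {
  preallocated[[i]] <- c("number" = i, "square root" = sqrt(i))
}

# Option 2: let lapply() build the list in one call
viaLapply <- lapply(seq_len(n), function(i) c("number" = i, "square root" = sqrt(i)))

# Both approaches should produce the same list
identical(preallocated, viaLapply)
```

Either result can then be passed to any of the conversions suggested in the answers.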

h3rm4n · Accepted Answer · 2017-11-13 15:37:20Z

69

You can use:

as.data.frame(do.call(rbind, vectorList))

Or:

library(data.table)
rbindlist(lapply(vectorList, as.data.frame.list))

Or:

library(dplyr)
bind_rows(lapply(vectorList, as.data.frame.list))
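
To see why the rbind approach gives the intended shape, it may help to compare it with the call from the question. This is a minimal base R sketch that rebuilds vectorList as in the question; wide and long are illustrative names, not part of the original answer.

```r
# Rebuild the list from the question
vectorList <- lapply(1:5, function(i) c("number" = i, "square root" = sqrt(i)))

# data.frame() treats each list element as a column
wide <- data.frame(vectorList)
dim(wide)   # 2 rows, 5 columns, as reported in the question

# rbind() stacks the vectors as rows of a matrix before conversion
long <- as.data.frame(do.call(rbind, vectorList))
dim(long)   # 5 rows, 2 columns
```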

edited Nov 13, 2017 at 15:37

answered Apr 27, 2017 at 15:52


2 Comments

PM0087 Over a year ago

The first one returns a warning:

Warning message: In (function (..., deparse.level = 1) : number of columns of result is not a multiple of vector length (arg 3)

The second one and the third return the error:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0

h3rm4n Over a year ago

@PM0087 It works perfectly fine for me. Did you use the data as in the question?

Giuseppe · Accepted Answer · 2017-12-15 07:21:04Z

The fastest approach I know of is the data.table::transpose function (provided each vector is short, i.e. the result has only a few columns):

as.data.frame(data.table::transpose(vectorList), col.names = names(vectorList[[1]]))

However, you will need to set the column names manually, as data.table::transpose removes them. There is also a purrr::transpose function that does not remove the column names, but it seems to be slower. Below is a small benchmark including the suggestions of the other users:

vectorList = lapply(1:1000, function(i) (c("number" = i, "square root" = sqrt(i))))
bench = microbenchmark::microbenchmark(
  dplyr = dplyr::bind_rows(lapply(vectorList, as.data.frame.list)),
  rbindlist = data.table::rbindlist(lapply(vectorList, as.data.frame.list)),
  Reduce = Reduce(rbind, vectorList),
  transpose_datatable = as.data.frame(data.table::transpose(vectorList), col.names = names(vectorList[[1]])),
  transpose_purrr = data.table::as.data.table(purrr::transpose(vectorList)),
  do.call = as.data.frame(do.call(rbind, vectorList)),
  times = 10)

bench
# Unit: microseconds
#                 expr        min         lq        mean      median         uq        max neval cld
#                dplyr 286963.036 292850.136 320345.1137 310159.7380 341654.619 385399.851    10   b
#            rbindlist 285830.750 289935.336 306120.7257 309581.1895 318131.031 324217.413    10   b
#               Reduce   8573.474   9073.649  12114.5559   9632.1120  11153.511  33446.353    10  a
#  transpose_datatable    372.572    424.165    500.8845    479.4990    532.076    701.822    10  a
#      transpose_purrr    539.953    590.365    672.9531    671.1025    718.757    911.343    10  a
#              do.call    452.915    537.591    562.9144    570.0825    592.334    641.958    10  a

# now use bigger list and disregard the slowest
vectorList = lapply(1:100000, function(i) (c("number" = i, "square root" = sqrt(i))))
bench.big = microbenchmark::microbenchmark(
  transpose_datatable = as.data.frame(data.table::transpose(vectorList), col.names = names(vectorList[[1]])),
  transpose_purrr = data.table::as.data.table(purrr::transpose(vectorList)),
  do.call = as.data.frame(do.call(rbind, vectorList)),
  times = 10)

bench.big
# Unit: milliseconds
#                 expr       min        lq       mean     median         uq       max neval cld
#  transpose_datatable  3.470901   4.59531   4.551515   4.708932   4.873755   4.91235    10 a
#      transpose_purrr 61.007574  62.06936  68.634732  65.949067  67.477948  97.39748    10  b
#              do.call 97.680252 102.04674 115.669540 104.983596 138.193644 151.30886    10   c
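
For readers who want to see what the transposition step amounts to without installing data.table, here is a rough base R sketch of the same idea: collect the k-th entry of every vector into one column, then re-attach the names by hand. It is only an illustration of the concept, not data.table's implementation, and it was not part of the benchmark above. The names columnNames, columns and transposed are made up for this example, and it assumes every vector has the same length and name order as the first.

```r
vectorList <- lapply(1:5, function(i) c("number" = i, "square root" = sqrt(i)))

# Take the column names from the first vector (assumes all vectors agree)
columnNames <- names(vectorList[[1]])

# Collect the k-th entry of every vector into one numeric column
columns <- lapply(seq_along(columnNames), function(k) {
  vapply(vectorList, function(v) v[[k]], numeric(1))
})

# Re-attach the names manually, then convert the list of columns
names(columns) <- columnNames
transposed <- as.data.frame(columns, check.names = FALSE)
```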

989 · Accepted Answer · 2017-04-27 16:07:32Z

14

Also Reduce:

Reduce(rbind, vectorList)
#      number square root
# init      1    1.000000
#           2    1.414214
#           3    1.732051
#           4    2.000000
#           5    2.236068

answered Apr 27, 2017 at 16:07


1 Comment

lmo Over a year ago

Note that Reduce(rbind, vectorList) returns a matrix, so you'd want to wrap it in data.frame to return a data.frame object.
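
A minimal sketch of the point made in this comment, assuming the vectorList from the question. Clearing the row names first is an extra step added here to drop the leftover "init" label; m and result are illustrative names.

```r
vectorList <- lapply(1:5, function(i) c("number" = i, "square root" = sqrt(i)))

# Reduce() with rbind yields a numeric matrix, not a data frame
m <- Reduce(rbind, vectorList)
is.matrix(m)

# Drop the leftover row labels, then convert to a data frame
rownames(m) <- NULL
result <- as.data.frame(m)
is.data.frame(result)
```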

Artem Sokolov · Accepted Answer · 2018-08-28 17:06:27Z

An alternative solution using purrr:

purrr::map_dfr( vectorList, as.list )
# # A tibble: 5 x 2
#   number `square root`
#    <dbl>         <dbl>
# 1      1          1
# 2      2          1.41
# 3      3          1.73
# 4      4          2
# 5      5          2.24

The code effectively converts each vector to a list and concatenates the results row-wise into a common data frame.

The great thing about the tidyverse methods (both dplyr::bind_rows() and purrr::map_dfr()) is that they can deal with list elements of different lengths and with named vectors whose order varies from element to element. This is very useful, for example, when converting the output of xml2::xml_attrs() into rectangular data.
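
A small sketch of that behaviour, assuming dplyr is installed. The list ragged and its values are invented for illustration; the conversion step reuses the as.data.frame.list pattern from the first answer.

```r
library(dplyr)

# Named vectors with different lengths and a different name order
ragged <- list(
  c(a = 1, b = 2),
  c(b = 4, a = 3, c = 5)
)

# Convert each vector to a one-row data frame, then bind rows by column name
combined <- bind_rows(lapply(ragged, as.data.frame.list))
combined
```

Columns are matched by name rather than by position, and entries missing from a vector are filled with NA.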
