Rename dynamically using a dataframe name with dplyr

Question

In this example, I am using the iris dataset and I would like to rename Petal.Length as iris:

    library(dplyr)

    some_fun <- function(x){
      head(x) %>%
        rename(!!quo_name(x) := "Petal.Length")
    }

    some_fun(iris)

But this gives the following error:

Error: `expr` must quote a symbol, scalar, or call

If I use enquo instead of quo_name, I have this error:

Error: The LHS of `:=` must be a string or a symbol

I guess the problem comes from the fact that I call some_fun(iris) and not some_fun("iris"), but I have to call some_fun(iris).

How can I do that, while using some_fun(iris)?

Edit: I need this function to run through a list using purrr::map(). Updated example:

    library(dplyr)
    library(purrr)

    list_df <- list(mtcars2 = mtcars %>% mutate(Petal.Length = 1),
                    iris2 = iris)

    some_fun <- function(x){
      df_name <- deparse(substitute(x))
      head(x) %>%
        rename("{df_name}" := "Petal.Length")
    }

    test <- map(list_df, some_fun)
    list2env(test, .GlobalEnv)
    mtcars2
    iris2

Is there a reason you want to rename a variable to the same name as the dataset? That's just going to cause a lot of problems in this case.
– user63230, Commented Jul 1, 2020 at 15:19
@user63230 I have a list of datasets that each contain a column country and a column value. After I clean these datasets with a function (call it clean() for example), I merge them into a single dataset. Since all these datasets contain a column value, this may cause problems when I merge them. Therefore, I would like to rename the column value with the name of each dataset, so that I can distinguish the columns once I have a single merged dataset. My plan was to include the rename step in the function clean().
– bretauv, Commented Jul 1, 2020 at 15:25
I get you now. I think you can skip your approach; see the post below.
– user63230, Commented Jul 1, 2020 at 16:08

R me matey · Accepted Answer · 2020-07-01 17:47:15Z

Try getting the data set's name as a string using deparse(substitute()), then use the glue-style curly brackets that dplyr now supports on the left-hand side of := to build the new column name:

    library(dplyr)

    some_fun <- function(x){
      # Comes out as a string holding the df's name
      df_name <- deparse(substitute(x))
      # df_name is evaluated, THEN becomes the new variable name for Petal.Length
      head(x) %>%
        rename("{df_name}" := "Petal.Length")
    }

    some_fun(iris)

Basically, everything within the curly brackets is evaluated first, and the result is inserted into the string that becomes the new column name.
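As a minimal sketch of that evaluation order, the example below renames Petal.Length in two ways: with the glue-style string, and with the older approach of unquoting a string using !!. The variable new_name is a hypothetical name introduced only for illustration; both calls are expected to produce the same column names.

```r
library(dplyr)

new_name <- "iris"  # hypothetical variable, used only for this illustration

# Glue-style name: "{new_name}" is interpolated to "iris" before renaming
renamed_glue <- head(iris) %>%
  rename("{new_name}" := Petal.Length)

# Older equivalent: unquote the string on the left-hand side with !!
renamed_bang <- head(iris) %>%
  rename(!!new_name := Petal.Length)

names(renamed_glue)
identical(names(renamed_glue), names(renamed_bang))
```

Either form needs a string (or symbol) on the left of :=, which is why the data set's name has to be turned into a string first.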

EDIT: Here's an update based on OP's update. Just extract the names beforehand, then pass them to the (slightly updated) function with map2():

    library(dplyr)
    library(purrr)

    list_df <- list(mtcars2 = mtcars %>% mutate(Petal.Length = 1),
                    iris2 = iris)

    df_names <- names(list_df)

    some_fun <- function(x, x_name){
      df_name <- x_name
      head(x) %>%
        rename("{df_name}" := "Petal.Length")
    }

    test <- map2(list_df, df_names, some_fun)
    list2env(test, .GlobalEnv)

    mtcars2
    #    mpg cyl disp  hp drat    wt  qsec vs am gear carb mtcars2
    #1  21.0   6  160 110 3.90 2.620 16.46  0  1    4    4       1
    #2  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4       1
    #3  22.8   4  108  93 3.85 2.320 18.61  1  1    4    1       1
    #4  21.4   6  258 110 3.08 3.215 19.44  1  0    3    1       1
    #5  18.7   8  360 175 3.15 3.440 17.02  0  0    3    2       1
    #6  18.1   6  225 105 2.76 3.460 20.22  1  0    3    1       1

    iris2
    #  Sepal.Length Sepal.Width iris2 Petal.Width Species
    #1          5.1         3.5   1.4         0.2  setosa
    #2          4.9         3.0   1.4         0.2  setosa
    #3          4.7         3.2   1.3         0.2  setosa
    #4          4.6         3.1   1.5         0.2  setosa
    #5          5.0         3.6   1.4         0.2  setosa
    #6          5.4         3.9   1.7         0.4  setosa
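For context on why the names are passed explicitly, here is a small sketch of what deparse(substitute()) sees in each situation. The helper arg_label and the list dfs are hypothetical names made up for this illustration. Called directly, the helper captures the symbol the caller typed; called through map(), it captures whatever expression purrr uses internally to index the list, so the list names are never seen.

```r
library(purrr)

# Hypothetical helper: reports the expression supplied for its argument
arg_label <- function(x) deparse(substitute(x))

dfs <- list(iris2 = head(iris), mtcars2 = head(mtcars))

# Called directly: substitute() sees the symbol `iris`
direct <- arg_label(iris)

# Called through map(): substitute() sees purrr's internal indexing
# expression rather than the names "iris2" / "mtcars2"
via_map <- map_chr(dfs, arg_label)

direct
unname(via_map)
names(dfs)
```

The exact text returned inside map() may depend on the purrr version, so treating the list names as data (as map2() does above) is the more robust route.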

This works in this example, but if I want to run this function through a list using purrr::map(), it produces .x[[i]] as the column name. See the update in my post.

RyanFrost · Answer · 2020-07-01 16:06:56Z

Here are a few other methods that I think could be useful to you, based on the information added in your comments.

Starting with a named list:

    library(purrr)
    library(dplyr)

    countries <- c("ABC", "DEF", "GHI", "JKL", "MNO")
    df1 <- data.frame(country = countries, value = 1:5)
    df2 <- data.frame(country = countries, value = 6:10)

    df_list <- list(df1 = df1, df2 = df2)
    df_list
    #> $df1
    #>   country value
    #> 1     ABC     1
    #> 2     DEF     2
    #> 3     GHI     3
    #> 4     JKL     4
    #> 5     MNO     5
    #>
    #> $df2
    #>   country value
    #> 1     ABC     6
    #> 2     DEF     7
    #> 3     GHI     8
    #> 4     JKL     9
    #> 5     MNO    10

We can use purrr's imap to use the names of each element to rename that element's 'value' column:

    df_list %>%
      imap(~ .x %>% rename("{.y}" := value))
    #> $df1
    #>   country df1
    #> 1     ABC   1
    #> 2     DEF   2
    #> 3     GHI   3
    #> 4     JKL   4
    #> 5     MNO   5
    #>
    #> $df2
    #>   country df2
    #> 1     ABC   6
    #> 2     DEF   7
    #> 3     GHI   8
    #> 4     JKL   9
    #> 5     MNO  10
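If the goal is the wide merge described in the comments (one column per source data set), the renamed data frames can then be joined on country. The sketch below is one possible way to do that, assuming every data frame shares a country key; the use of reduce() with full_join() and the name merged are choices made for this illustration, not something taken from the question.

```r
library(dplyr)
library(purrr)

countries <- c("ABC", "DEF", "GHI", "JKL", "MNO")
df_list <- list(
  df1 = data.frame(country = countries, value = 1:5),
  df2 = data.frame(country = countries, value = 6:10)
)

merged <- df_list %>%
  # Rename each 'value' column after the list element's name
  imap(~ rename(.x, "{.y}" := value)) %>%
  # Join the data frames pairwise on the shared key (assumed: 'country')
  reduce(function(a, b) full_join(a, b, by = "country"))

merged
```

Because each value column already carries its data set's name, the join does not need suffixes to disambiguate columns.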

However, there's another way to merge these datasets that may be preferable if all of the 'value' columns are the same type.

In this case, we can use dplyr's bind_rows with the .id parameter to add an identifier column in the merged dataset. This way all of the values are in the same column, but we can still tell which source they came from.

    df_list %>%
      bind_rows(.id = "df")
    #>     df country value
    #> 1  df1     ABC     1
    #> 2  df1     DEF     2
    #> 3  df1     GHI     3
    #> 4  df1     JKL     4
    #> 5  df1     MNO     5
    #> 6  df2     ABC     6
    #> 7  df2     DEF     7
    #> 8  df2     GHI     8
    #> 9  df2     JKL     9
    #> 10 df2     MNO    10

Created on 2020-07-01 by the reprex package (v0.3.0)

Using imap is perfect. I upvoted this, but I can't mark it as the solution since it doesn't really answer the question and instead offers an alternative.

user63230 · Answer · 2020-07-01 16:06:43Z

I think you can skip this by using bind_rows with .id which adds the df name as a column in your merge:

    library(tidyverse)

    df1 <- data.frame(a = c(1, 2), b = c(1, 2))
    df2 <- data.frame(a = c(1, 2), b = c(1, 2))
    df_list <- lst(df1, df2)

    dplyr::bind_rows(df_list, .id = "df_name")
    #   df_name a b
    # 1     df1 1 1
    # 2     df1 2 2
    # 3     df2 1 1
    # 4     df2 2 2
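The reason lst() is used here rather than list() is that it names each element after the object passed in, and bind_rows() takes its .id labels from those names. A minimal sketch of the difference follows; the variable names plain and auto are hypothetical and exist only for this comparison.

```r
library(tibble)
library(dplyr)

df1 <- data.frame(a = c(1, 2), b = c(1, 2))
df2 <- data.frame(a = c(1, 2), b = c(1, 2))

plain <- list(df1, df2)  # base list(): elements are left unnamed
auto  <- lst(df1, df2)   # tibble::lst(): elements are named "df1" and "df2"

names(plain)
names(auto)

# With named elements, the .id column carries the data frame names
bind_rows(auto, .id = "df_name")
```

With an unnamed list, the dplyr documentation says a numeric sequence is used for the identifier instead, so naming the list is what makes the .id column informative.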
