Get file name and assign it as a column name in a function

Question

I've got several csv files that all have this format:

    xyz     site
    2.1     tex1
    15.67   tex2
    32.111  ny31

I want to import these files into R, like this:

    df_list <- list.files('./usa_data')
    dfs <- lapply(df_list, import_function)

What I want the import_function to do is to take a part of the name of the csv file and paste it instead of the first column name (xyz). My csv names have this format:

    usa_low_dollars_270_1.csv
    usa_high_euros_250_2.csv
    usa_low_gbp_240_1.csv

I want to extract the currency (the third component of the name), combine it with the word 'median' and rename the first column like this:

    dollars_median  site
    2.1             tex1
    15.67           tex2
    32.111          ny31

    # or

    euros_median    site
    2.1             tex1
    15.67           tex2
    32.111          ny31

    # etc

Which of the two exactly is the problem: (1) getting the file names, or (2) extracting the currency substring?
– Jan, Commented Jan 3, 2021 at 16:47

Ian Campbell · Accepted Answer · 2021-01-03 17:07:59Z

You can use lapply along the indices of df_list using seq_along instead of on the vector itself.

From there you can subset df_list with the index, split the file name with strsplit, and use its third component to rename the first column of the imported data.

    library(stringr)  # provides str_c()

    df_list <- list.files()
    df_list
    [1] "usa_high_euros_250_2.csv"  "usa_low_dollars_270_1.csv" "usa_low_gbp_240_1.csv"

    lapply(seq_along(df_list), function(x){
      # read the x-th file
      data <- read.csv(df_list[x])
      # third "_"-separated part of the file name is the currency
      names(data)[1] <- str_c(strsplit(df_list[x], "_")[[1]][3], "_median")
      data
    })
    [[1]]
      euros_median site
    1        2.100 tex1
    2       15.670 tex2
    3       32.111 ny31

    [[2]]
      dollars_median site
    1          2.100 tex1
    2         15.670 tex2
    3         32.111 ny31

    [[3]]
      gbp_median site
    1      2.100 tex1
    2     15.670 tex2
    3     32.111 ny31
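The same idea can also be written as the import_function the question asks for, so that lapply runs directly over the file paths. This is only a minimal sketch: the temporary directory and the sample data are made up so that the example runs on its own, and it swaps str_c for base R's paste0 to avoid the stringr dependency. It also assumes every file name follows the usa_<level>_<currency>_<number>_<number>.csv pattern from the question.

```r
# Sketch of an import function: read one CSV and rename its first column
# to "<currency>_median", where <currency> is the third "_"-separated
# part of the file name.
import_function <- function(path) {
  data <- read.csv(path)
  # basename() drops the directory so only the file name is split
  parts <- strsplit(basename(path), "_", fixed = TRUE)[[1]]
  names(data)[1] <- paste0(parts[3], "_median")
  data
}

# Made-up sample files in a temporary directory (assumption, for illustration only)
demo_dir <- file.path(tempdir(), "usa_data_demo")
dir.create(demo_dir, showWarnings = FALSE)
sample_data <- data.frame(
  xyz  = c(2.1, 15.67, 32.111),
  site = c("tex1", "tex2", "ny31")
)
demo_files <- c("usa_low_dollars_270_1.csv",
                "usa_high_euros_250_2.csv",
                "usa_low_gbp_240_1.csv")
for (f in demo_files) {
  write.csv(sample_data, file.path(demo_dir, f), row.names = FALSE)
}

# full.names = TRUE returns paths that read.csv() can open
# regardless of the current working directory
df_list <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
dfs <- lapply(df_list, import_function)
```

With the real data, replacing demo_dir with './usa_data' should be the only change needed. Without full.names = TRUE, list.files('./usa_data') returns bare file names, and read.csv would only find them if the working directory were that folder.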

MarBlo · Accepted Answer · 2021-01-03 16:53:03Z

You can extract the currency, which is the third underscore-separated component of the file name, like so:

    sp <- stringr::str_split("usa_low_dollars_270_1.csv", '_')[[1]][3]
    sp
    [1] "dollars"

You can then build the new name like so:

    new <- paste0(sp, '_median')
    new
    [1] "dollars_median"

With this you can replace the column name.

It is also possible to avoid the dependency on the stringr package by using the base R function strsplit().
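As a rough sketch of that base R route, strsplit() accepts a whole vector of file names, so the new column names for all files can be built in one go without typing each name. The files vector below is just the three example names from the question; in practice it would come from list.files().

```r
# Example file names from the question (in practice: files <- list.files(...))
files <- c("usa_low_dollars_270_1.csv",
           "usa_high_euros_250_2.csv",
           "usa_low_gbp_240_1.csv")

# strsplit() is vectorised: it returns one character vector per file name
parts <- strsplit(files, "_", fixed = TRUE)

# Take the third component (the currency) from each split name
currency <- vapply(parts, function(p) p[3], character(1))

# Append "_median" to get the new first-column names
new_names <- paste0(currency, "_median")
new_names
```

Each element of new_names lines up with the file at the same position in files, so it can be assigned to names(data)[1] after reading that file.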
Thank you. How can I do this so I don't have to type in the name of each file (I have 30+ files), but rather that the function recognises the name of the csv? In the first line of your code, str_split
