1

I've got several CSV files that all have this format:

xyz     site
2.1     tex1
15.67   tex2
32.111  ny31

I want to import these files into R like this:

df_list <- list.files('./usa_data')
dfs <- lapply(df_list, import_function)

What I want import_function to do is take part of the CSV file's name and use it in place of the first column name (xyz). My CSV file names have this format:

usa_low_dollars_270_1.csv
usa_high_euros_250_2.csv
usa_low_gbp_240_1.csv

I want to extract the currency (the third component of the name), combine it with the word 'median' and rename the first column like this:

dollars_median  site
2.1             tex1
15.67           tex2
32.111          ny31
# or
euros_median  site
2.1           tex1
15.67         tex2
32.111        ny31
# etc
2 Comments
  • Which of the two is exactly the problem? (1) getting the file names. (2) extracting the currency substring. Commented Jan 3, 2021 at 16:47
  • Hi Jan! The second one. I edited the question accordingly. Commented Jan 3, 2021 at 16:50

2 Answers

2

You can call lapply over the indices of df_list (via seq_along) rather than over the vector itself.

Inside the function, use the index to look up the file name in df_list, read that file, and build the new first column name from the third component of the file name with strsplit.

library(stringr)  # needed for str_c()

df_list <- list.files()
df_list
[1] "usa_high_euros_250_2.csv"  "usa_low_dollars_270_1.csv" "usa_low_gbp_240_1.csv"

lapply(seq_along(df_list), function(x) {
  data <- read.csv(df_list[x])
  # the third "_"-separated part of the file name is the currency
  names(data)[1] <- str_c(strsplit(df_list[x], "_")[[1]][3], "_median")
  data
})
[[1]]
  euros_median site
1        2.100 tex1
2       15.670 tex2
3       32.111 ny31

[[2]]
  dollars_median site
1          2.100 tex1
2         15.670 tex2
3         32.111 ny31

[[3]]
  gbp_median site
1      2.100 tex1
2     15.670 tex2
3     32.111 ny31
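The same idea can also be written as a standalone import_function, which is the shape the question asks for. This is only a sketch: it assumes the currency is always the third underscore-separated part of the file name, swaps str_c for base R paste0, and passes full.names = TRUE and a pattern to list.files (neither is in the original code) so the files can be read from outside ./usa_data.

```r
# Hypothetical helper: reads one CSV and renames its first column to
# "<currency>_median", where <currency> is assumed to be the third
# "_"-separated part of the file name.
import_function <- function(path) {
  data <- read.csv(path)
  # basename() drops any directory prefix such as "./usa_data/"
  parts <- strsplit(basename(path), "_", fixed = TRUE)[[1]]
  names(data)[1] <- paste0(parts[3], "_median")
  data
}

# full.names = TRUE returns paths that read.csv() can open directly
df_list <- list.files("./usa_data", pattern = "\\.csv$", full.names = TRUE)
dfs <- lapply(df_list, import_function)
```

Because the function receives the path itself, no index bookkeeping is needed and the call matches lapply(df_list, import_function) from the question.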

1

You can extract the currency, which is the third underscore-separated component of the file name, like this:

sp <- stringr::str_split("usa_low_dollars_270_1.csv", '_')[[1]][3]
sp
[1] "dollars"

You can then build the new name like so:

new <- paste0(sp, '_median')
new
[1] "dollars_median"

With this you can replace the first column name of the imported data.
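As a rough illustration of that last step, the sketch below renames the first column of a small data frame. The data frame df is a made-up stand-in for one imported file (its values are taken from the question), and base R strsplit is used here in place of stringr::str_split so the snippet has no package dependency.

```r
# Toy data frame standing in for one imported file (values from the question)
df <- data.frame(xyz  = c(2.1, 15.67, 32.111),
                 site = c("tex1", "tex2", "ny31"))

# Third "_"-separated part of the file name, using base R strsplit()
sp  <- strsplit("usa_low_dollars_270_1.csv", "_", fixed = TRUE)[[1]][3]
new <- paste0(sp, "_median")

# Replace only the first column name; the other names are left untouched
names(df)[1] <- new
names(df)
# [1] "dollars_median" "site"
```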

2 Comments

It is also possible to avoid the dependency on the stringr package by using the base R function strsplit().
Thank you. How can I do this so I don't have to type in the name of each file (I have 30+ files), but rather that the function recognises the name of the csv? In the first line of your code, str_split
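Both comments can be addressed in one go, since strsplit accepts a whole vector of file names. The following is a minimal sketch in base R; the files vector is hard-coded with the names from the question for illustration, and in practice it would presumably come from list.files().

```r
# File names from the question; normally the result of list.files()
files <- c("usa_low_dollars_270_1.csv",
           "usa_high_euros_250_2.csv",
           "usa_low_gbp_240_1.csv")

# strsplit() is vectorised: it returns one character vector per file name
parts <- strsplit(basename(files), "_", fixed = TRUE)

# Take the third piece of each name; vapply() enforces a character result
currency  <- vapply(parts, function(p) p[3], character(1))
new_names <- paste0(currency, "_median")
new_names
# [1] "dollars_median" "euros_median"   "gbp_median"
```

Each element of new_names lines up with the file at the same position in files, so it can be assigned to the first column name of the corresponding data frame.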
