
I have thousands of files to process, so I am trying to avoid doing this manually for each one. The only identifying characteristic of each file is its name, so I need to create columns based on the file name so that the rows can be identified when I combine the files later. The file name contains a placeholder, then the boat name, then the net number, separated by underscores. My data looks like this:

file name = 3_Whip_1.1.csv (Boat = Whip, Net = 1.1)

Time   Pred
11:00  10.2
12:00   8.4
13:00   9.6

I am trying to get the data to look like this:

Boat  Net  Time   Pred
Whip  1.1  11:00  10.2
Whip  1.1  12:00   8.4
Whip  1.1  13:00   9.6

Any help would be greatly appreciated.

2 Answers


We can use gsub to extract a substring of the file name, split it into two columns with read.table, and cbind the result with the original data:

    d1 <- read.table(text = gsub("^\\d+_|\\.[^.]+$", "", filename), sep = "_",
                     col.names = c("Boat", "Net"))
    cbind(d1, dat1)
    #  Boat Net  Time Pred
    #1 Whip 1.1 11:00 10.2
    #2 Whip 1.1 12:00  8.4
    #3 Whip 1.1 13:00  9.6

data

    dat1 <- structure(list(Time = c("11:00", "12:00", "13:00"),
                           Pred = c(10.2, 8.4, 9.6)),
                      .Names = c("Time", "Pred"),
                      class = "data.frame",
                      row.names = c(NA, -3L))
    filename <- "3_Whip_1.1.csv"
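To make the pattern easier to follow, here is a minimal sketch of the same gsub and read.table steps applied to a vector of names. The second file name is made up for illustration, and colClasses = "character" is an addition that is not in the answer above; it is there on the assumption that a net label such as "1.10" should not be collapsed to the number 1.1.

```r
# Hypothetical file names following the placeholder_Boat_Net.csv pattern
filenames <- c("3_Whip_1.1.csv", "7_Gale_2.3.csv")

# "^\\d+_"     removes the leading digits and the first underscore
# "\\.[^.]+$"  removes the final dot and everything after it (the extension)
stems <- gsub("^\\d+_|\\.[^.]+$", "", filenames)
print(stems)

# Each element of stems is read as one line and split on "_";
# colClasses = "character" keeps the Net column as text
meta <- read.table(text = stems, sep = "_",
                   col.names = c("Boat", "Net"),
                   colClasses = "character")
print(meta)
```

Because both functions are vectorised over the input, the metadata for every file can be parsed in one call before any data is read.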

3 Comments

Thanks @akrun. This works great for splitting up the file name once I have extracted it. The part I am really stuck on is getting the actual file name into a data frame, rather than creating each data frame individually with filename <- "3_Whip_1.1.csv".
@D.Chamberlin You can get the file names from list.files() if the files are in the working directory and read them all into a list, i.e. filenames <- list.files(pattern = "\\.csv$"); lst <- lapply(filenames, function(x) read.csv(x, stringsAsFactors=FALSE)); dat1 <- do.call(rbind, Map(cbind, filename = filenames, lst)) and then split up the columns
@D.Chamberlin Fwiw, making a table to contain the file names and their metadata is another way to go: fileDF = data.frame(list.files(patt="csv$")); fileDF$id = 1:nrow(fileDF); fileDF$Boat = sub(...) etc. (An example with data.table franknarf1.github.io/r-tutorial/_book/… )

The following code works for one data frame. You can wrap these operations in a function and loop over (or use an apply-family function on) a vector or list of your file names. The list.files function returns all file names in a directory, which could be useful for your work.

    # Create the example filename
    filename <- "3_Whip_1.1.csv"

    # Create example data frame
    dat1 <- data.frame(Time = c("11:00", "12:00", "13:00"),
                       Pred = c(10.2, 8.4, 9.6),
                       stringsAsFactors = FALSE)

    # Remove the ".csv" extension (dot escaped and anchored to the end)
    filename2 <- sub("\\.csv$", "", filename)

    # Split the string by "_"
    filename_vec <- strsplit(filename2, split = "_")[[1]]

    # Create columns to store the information
    dat1$Boat <- filename_vec[2]
    dat1$Net <- filename_vec[3]

    # Change column order
    dat1 <- dat1[, c("Boat", "Net", "Time", "Pred")]
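As a rough sketch of the "wrap it in a function and loop" suggestion, the steps can be applied to every file in a directory like this. The helper name read_tagged, the temporary demo directory, and the second file name are all hypothetical and used only so the example runs on its own; it also assumes every file follows the placeholder_Boat_Net.csv pattern and has the same columns.

```r
# Hypothetical helper: read one file and tag its rows with the Boat and Net
# parsed from the file name
read_tagged <- function(path) {
  dat <- read.csv(path, stringsAsFactors = FALSE)
  # basename() drops the directory; sub() drops the ".csv" extension
  stem <- sub("\\.csv$", "", basename(path))
  parts <- strsplit(stem, split = "_", fixed = TRUE)[[1]]
  # The single Boat and Net values are recycled to every row of dat
  data.frame(Boat = parts[2], Net = parts[3], dat, stringsAsFactors = FALSE)
}

# Demo setup: write two small example files to a temporary directory
demo_dir <- file.path(tempdir(), "nets_demo")
dir.create(demo_dir, showWarnings = FALSE)
example <- data.frame(Time = c("11:00", "12:00", "13:00"),
                      Pred = c(10.2, 8.4, 9.6))
write.csv(example, file.path(demo_dir, "3_Whip_1.1.csv"), row.names = FALSE)
write.csv(example, file.path(demo_dir, "5_Gale_2.1.csv"), row.names = FALSE)

# List the files with full paths, read and tag each one, then stack them
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, read_tagged))
print(combined)
```

With thousands of files, do.call(rbind, ...) over a list is usually much faster than growing a data frame inside a for loop, since all the pieces are bound in one step.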
