3

Given a (simplified) dataframe with format

df <- data.frame(a = c(1,2,3,4), b = c(4,3,2,1), temp1 = c("-","-","-","foo: 3"), temp2 = c("-","bar: 10","-","bar: 4") ) a b temp1 temp2 1 4 - - 2 3 - bar: 10 3 2 - - 4 1 foo: 3 bar: 4 

I need to rename all temp columns with the names contained within the column, My end goal is to end up with this:

a b foo bar 1 4 - - 2 3 - 10 3 2 - - 4 1 3 4 

the df column names and the data contained within them will be unknown, however the columns that need changing will contain temp and the delimiter will always be a ":"

As such I can easily remove the name from within the columns using dplyr like this:

df <- df %>% mutate_at(vars(contains("temp")), ~(substr(., str_locate(., ":")+1,str_length(.)))) 

but first I need to rename the columns based on some function method, that scans the column and returns the value(s) within it, ie.

rename_at(vars(contains("temp")), ~(...some function.....)) 

As per the example given there's no guarantee that specific rows will have data so I can't simply grab value from row 1

Any ideas welcome. Thanks in advance

4
  • Do you want your resulting foo/bar columns to be numeric, or stay as character columns? Commented Oct 19, 2019 at 23:07
  • How does your data get in that format? Commented Oct 20, 2019 at 0:36
  • The data is scraped and originally has 1 column called notes that contains within it multiple (repeated) entries (ie foo:x, foo: y, bar, z, etc). I actually split the possible categories into their respective individual temp columns by using the method supplied by Yannis here : <stackoverflow.com/questions/4350440/…>. This is the next and final step to wrangle the data into a usable format. Commented Oct 20, 2019 at 12:10
  • I want numeric, but I guess it doesn't matter as I can easily apply those changes myself. Thanks. Commented Oct 20, 2019 at 12:11

2 Answers 2

2

One possibility involving dplyr and tidyr could be:

df %>% pivot_longer(names_to = "variables", values_to = "values", -c(a:b)) %>% mutate(values = replace(values, values == "-", NA_character_)) %>% separate(values, into = c("variables2", "values"), sep = ": ") %>% group_by(variables) %>% fill(variables2, .direction = "downup") %>% ungroup() %>% select(-variables) %>% pivot_wider(names_from = "variables2", values_from = "values") a b foo bar <dbl> <dbl> <chr> <chr> 1 1 4 <NA> <NA> 2 2 3 <NA> 10 3 3 2 <NA> <NA> 4 4 1 3 4 

If you want to further replace the NAs with -:

df %>% pivot_longer(names_to = "variables", values_to = "values", -c(a:b)) %>% mutate(values = replace(values, values == "-", NA_character_)) %>% separate(values, into = c("variables2", "values"), sep = ": ") %>% group_by(variables) %>% fill(variables2, .direction = "downup") %>% ungroup() %>% select(-variables) %>% pivot_wider(names_from = "variables2", values_from = "values") %>% mutate_at(vars(-a, -b), ~ replace_na(., "-")) a b foo bar <dbl> <dbl> <chr> <chr> 1 1 4 - - 2 2 3 - 10 3 3 2 - - 4 4 1 3 4 
Sign up to request clarification or add additional context in comments.

Comments

0

This will do the job:

colnames(df)[which(grepl("temp", colnames(df)))] <- unique(unlist(sapply(df[,grepl("temp", colnames(df))], function(x){gsub("[:].*", "", grep("\\w+", x, value = TRUE))}))) 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.