
For each row, I'm trying to build a string that lists the column names in order of their values.

    library(dplyr)
    library(tibble)

    set.seed(100)
    df <- tibble(id = 1:5,
                 col1 = sample(1:50, 5),
                 col2 = sample(1:50, 5),
                 col3 = sample(1:50, 5)) %>%
      mutate_at(vars(-id), ~if_else(. <= 20, NA_integer_, .))

    # A tibble: 5 x 4
    #      id  col1  col2  col3
    #   <int> <int> <int> <int>
    # 1     1    NA    44    NA
    # 2     2    38    23    34
    # 3     3    48    22    NA
    # 4     4    25    NA    48
    # 5     5    NA    NA    43

    # Desired result: the order column is written by hand here
    res <- df %>%
      add_column(order = c('col2', 'col2_col3_col1', 'col2_col1', 'col1_col3', 'col3'))

    # A tibble: 5 x 5
    #      id  col1  col2  col3 order
    #   <int> <int> <int> <int> <chr>
    # 1     1    NA    44    NA col2
    # 2     2    38    23    34 col2_col3_col1
    # 3     3    48    22    NA col2_col1
    # 4     4    25    NA    48 col1_col3
    # 5     5    NA    NA    43 col3

My current data is in the form of df, and the column I'm trying to add is the order column in res. The ordering of the elements in the string is determined by the value in each column, from smallest to largest, and it needs to skip over NAs. The values are times in days, so the string identifies the sequence in which each ID populated its columns. Not all IDs have a value in every column, so there are missing values throughout. I usually work within the tidyverse, but any solution or thoughts would be much appreciated.

1 Answer


An easier option is apply: loop over the rows (MARGIN = 1), remove the NA elements, order the remaining non-NA values, use that index to get the column names, and paste them together.

    df$order <- apply(df[-1], 1, function(x) {
      x1 <- x[!is.na(x)]                          # drop the NA elements
      paste(names(x1)[order(x1)], collapse = "_") # names sorted by value
    })

    df$order
    #[1] "col2"           "col2_col3_col1" "col2_col1"      "col1_col3"      "col3"

Or using tidyverse:

    library(dplyr)
    library(tidyr)
    library(stringr)

    df %>%
      pivot_longer(cols = -id, values_drop_na = TRUE) %>%
      arrange(id, value) %>%
      group_by(id) %>%
      summarise(order = str_c(name, collapse = "_")) %>%
      right_join(df) %>%
      select(names(df), order)

    # A tibble: 5 x 5
    #     id  col1  col2  col3 order
    #  <int> <int> <int> <int> <chr>
    #1     1    NA    44    NA col2
    #2     2    38    23    34 col2_col3_col1
    #3     3    48    22    NA col2_col1
    #4     4    25    NA    48 col1_col3
    #5     5    NA    NA    43 col3

Or using pmap from purrr:

    library(purrr)

    df %>%
      mutate(order = pmap_chr(select(., starts_with('col')), ~ {
        x <- c(...)
        x1 <- x[!is.na(x)]
        str_c(names(x1)[order(x1)], collapse = "_")
      }))

3 Comments

This doesn't handle NA correctly; it returns 1 for the first row. Try names(x)[order(x, na.last = NA)]
@astrofunkswag yes, thanks, I had a typo. corrected
Thank you for the multiple solutions. The apply approach is certainly more straightforward which isn't surprising as rowwise operations tend to seem more complicated than they should be with the tidyverse. I figured that purrr::pmap would be able to accomplish this well, so thank you for the example.
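
The `na.last = NA` suggestion from the first comment can be sketched as follows. This is a minimal base R illustration, not the answer's own code: the data frame is typed in by hand from the question's printed output rather than regenerated with `set.seed`, and the helper name `order_string` is hypothetical.

```r
# Toy data copied from the question's printed output (an assumption:
# sample() results can differ between R versions, so nothing is regenerated).
df <- data.frame(
  id   = 1:5,
  col1 = c(NA, 38L, 48L, 25L, NA),
  col2 = c(44L, 23L, 22L, NA, NA),
  col3 = c(NA, 34L, NA, 48L, 43L)
)

# order(x, na.last = NA) drops the NA positions and returns the indices of
# the remaining values in increasing order, so no separate is.na() filter
# is needed before looking up the names.
order_string <- function(x) {
  paste(names(x)[order(x, na.last = NA)], collapse = "_")
}

# Apply the helper to every row, skipping the id column.
df$order <- apply(df[-1], 1, order_string)
df$order
```

With this variant the NA filtering and the ordering happen in a single `order()` call, which is what the comment was pointing at.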
