I have a dataframe that contains a rating column.

I would like to sort it using a custom order, e.g. AAA, AA, A, BBB, BB, B.

However, default R sorting (using dplyr::arrange) results in A, AA, AAA, B, BB, BBB:

data.frame(Rating = c('AAA','AA','A','B','BB','BBB'),
           Value1 = c(1,2,3,4,5,6),
           Value2 = c(2,3,4,5,3,2)) %>%
  arrange(Rating)

I found many links about the same problem, but they do not deal with data frames, e.g. "customize the sort function in R".

How can I sort my data using a custom order?

3 Answers

If you want to keep the column as character rather than factor, you can arrange by each value's position in the order vector, using match():

rating_order <- c("AAA", "AA", "A", "BBB", "BB", "B")

df <- data.frame(
  Rating = c("A", "AA", "AAA", "B", "BB", "BBB"),
  Value1 = c(1, 2, 3, 4, 5, 6),
  Value2 = c(2, 3, 4, 5, 3, 2)
)

library(dplyr, warn.conflicts = FALSE)

df %>%
  arrange(match(Rating, rating_order))
#>   Rating Value1 Value2
#> 1    AAA      3      4
#> 2     AA      2      3
#> 3      A      1      2
#> 4    BBB      6      2
#> 5     BB      5      3
#> 6      B      4      5

Created on 2022-01-20 by the reprex package (v2.0.1)
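
For readers who want to see what happens to ratings that are missing from the order vector, here is a minimal base R sketch of the same match() idea. The data frame and the extra rating "CCC" are made up for illustration and are not part of the question. match() returns NA for a value it cannot find, and order() places NA last by default (dplyr::arrange also sorts missing values to the end).

```r
# Hypothetical example data; "CCC" is deliberately absent from rating_order.
rating_order <- c("AAA", "AA", "A", "BBB", "BB", "B")

df <- data.frame(
  Rating = c("A", "CCC", "AAA", "B", "BB", "AA"),
  Value1 = 1:6,
  stringsAsFactors = FALSE
)

# match() gives each rating's position in rating_order (NA when absent).
position <- match(df$Rating, rating_order)

# order() sorts by that position; NA positions go to the end by default.
sorted <- df[order(position), ]

# The column stays character, and the unknown rating ends up last.
print(sorted$Rating)
```

If unknown ratings should be rejected instead of being pushed to the end, checking `anyNA(position)` before sorting is one option.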

Here is one approach using dplyr. In short, sort by the letter grade first, and then by the number of letters. This would not work for mixed ratings such as AAB, but judging from your example, those do not occur.

library(dplyr)

data.frame(Rating = c('AAA','AA','A','B','BB','BBB'),
           Value1 = c(1,2,3,4,5,6),
           Value2 = c(2,3,4,5,3,2)) %>%
  mutate(grade = substr(Rating, 1, 1),  # Create a column with the letter grade
         count = nchar(Rating)) %>%     # Create a column with the number of letters
  arrange(grade, desc(count)) %>%       # Sort by grade, then by count (longest first, so AAA precedes AA)
  select(-grade, -count)                # Optional, removes the intermediary columns
#>   Rating Value1 Value2
#> 1    AAA      1      2
#> 2     AA      2      3
#> 3      A      3      4
#> 4    BBB      6      2
#> 5     BB      5      3
#> 6      B      4      5

Created on 2022-01-20 by the reprex package (v0.3.0)
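
The grade-then-length idea can also be sketched in base R without helper columns. This is only an illustration under the same assumption as above (every rating repeats a single letter); the variable names `grade` and `n_letters` are made up here. The length is negated so that longer ratings come first, which gives the order requested in the question.

```r
df <- data.frame(
  Rating = c("AAA", "AA", "A", "B", "BB", "BBB"),
  Value1 = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

grade     <- substr(df$Rating, 1, 1)  # first letter of each rating
n_letters <- nchar(df$Rating)         # number of letters in each rating

# Sort by grade ascending, then by number of letters descending.
sorted <- df[order(grade, -n_letters), ]

print(sorted$Rating)
```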

If you know the complete list of possible ratings, the easiest way is to make it a factor with the values in order, e.g.

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union

data.frame(Rating = factor(c('A','AA','AAA','B','BB','BBB'),
                           levels = c('AAA','AA','A','BBB','BB','B')),
           Value1 = c(1,2,3,4,5,6),
           Value2 = c(2,3,4,5,3,2)) %>%
  arrange(Rating)
#>   Rating Value1 Value2
#> 1    AAA      3      4
#> 2     AA      2      3
#> 3      A      1      2
#> 4    BBB      6      2
#> 5     BB      5      3
#> 6      B      4      5

Created on 2022-01-20 by the reprex package (v2.0.1)
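
When the data frame already exists with a character column, the same factor idea can be applied by converting the column in place. The following is a minimal base R sketch with made-up data, not the code from the answer. One thing to keep in mind is that factor() turns any value that is not listed in `levels` into NA.

```r
rating_order <- c("AAA", "AA", "A", "BBB", "BB", "B")

df <- data.frame(
  Rating = c("A", "AA", "AAA", "B", "BB", "BBB"),
  Value1 = 1:6,
  stringsAsFactors = FALSE
)

# Convert the existing character column to a factor whose levels
# are listed in the desired sort order.
df$Rating <- factor(df$Rating, levels = rating_order)

# Sorting a factor follows the level order, not alphabetical order.
sorted <- df[order(df$Rating), ]

print(as.character(sorted$Rating))
```

Because the level order is stored with the column, later sorts of the same column keep following the custom order without passing the order vector again.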
