Perhaps you can help me: for each ID, I want to extract the largest "a" value among the rows that have the largest "b" value. In other words, I want to scan the "b" values and identify the highest (here b=40). If several "a" values share that highest "b" value (here a=2 and a=30), I want to select the highest "a" value (here a=30).

Here is what I have done so far:

    df <- data.frame(ID = c('1','1','1','1','1','1'),
                     a = c('10','20','30','10','2','30'),
                     b = c('10','20','30','10','40','40'))
    library(plyr)
    opt <- ddply(df, .(ID), summarise, a = a[which.max(b)])
    opt

    ID a
     1 2

but I don't get the expected result:

    ID  a
     1 30

I'd greatly appreciate your suggestions. Note that contrary to this sample dataset, the actual dataset I work on is pretty large. Thank you very much!

  • which.max gives you the index. For the value just use max. Commented Sep 24, 2018 at 9:32
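
To illustrate the comment above, here is a minimal base R sketch on the question's two columns. It assumes the values are meant to be numeric (the question stores them as strings), and the vector names `a` and `b` are simply reused from the question. `which.max` returns the position of the first maximum only, so ties in b are never compared on a.

```r
# Values from the question, written as numbers (an assumption about intent)
a <- c(10, 20, 30, 10, 2, 30)
b <- c(10, 20, 30, 10, 40, 40)

# Position of the FIRST maximum of b; the second 40 is ignored
first_max_pos <- which.max(b)

# This is what the original attempt picks up
a_at_first_max <- a[first_max_pos]

# Largest a among all rows that share the largest b
a_wanted <- max(a[b == max(b)])

print(c(first_max_pos, a_at_first_max, a_wanted))
```

For several IDs, the same expression would have to be evaluated once per group.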

1 Answer


We can use dplyr: group by ID, arrange b and a in descending order, and then take the first row of each group.

    library(dplyr)
    df %>%
      group_by(ID) %>%
      arrange(desc(b), desc(a)) %>%
      slice(1)

    #  ID    a     b
    #  <fct> <fct> <fct>
    #1 1     30    40

As shown in the expected output, if we need only the ID and a columns, we can select them:

    df %>%
      group_by(ID) %>%
      arrange(desc(b), desc(a)) %>%
      slice(1) %>%
      select(ID, a)

We can also arrange the rows in ascending order and then select the last row of each group using n():

    library(dplyr)
    df %>%
      group_by(ID) %>%
      arrange(b, a) %>%
      slice(n()) %>%
      select(ID, a)
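
One caveat worth sketching: in the question, a and b are stored as strings (factors in the output above), so `arrange` orders them as text or by factor level rather than by numeric value. That happens to give the right row for this sample, but it may not for values such as '100' versus '20'. The variant below assumes the columns are meant to be numeric and converts them first; the name `res` is only illustrative.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(ID = c('1','1','1','1','1','1'),
                 a = c('10','20','30','10','2','30'),
                 b = c('10','20','30','10','40','40'))

res <- df %>%
  # as.character() first, so the conversion is safe for factors and strings alike
  mutate(a = as.numeric(as.character(a)),
         b = as.numeric(as.character(b))) %>%
  group_by(ID) %>%
  arrange(desc(b), desc(a)) %>%   # highest b first, ties broken by highest a
  slice(1) %>%                    # first row of each ID
  ungroup() %>%
  select(ID, a)

print(res)
```

If the columns are already numeric in the real dataset, the `mutate` step can simply be dropped.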

1 Comment

Thank you very much for your time and support, it's working perfectly
