R: Find max value for column among a subset of a data frame

Question

I have a data frame df with columns ID, Year, Value1, Value2, Value3 and 21788928 rows. I need to subset the data by Year and ID and find the max Value1 in that subset, keeping the rest of the information in that row. I need to do that for all combinations of Year and ID (Year goes from 1982 to 2013, ID from 1 to 28371).

I was trying to do that in a double for loop:

    year <- seq(1982, 2013)
    cnt <- 1
    for (i in 1:32) {
      for (j in 1:28371) {
        A <- df[df$Year == year[i] & df$ID == j, ]
        maxVal[cnt, ] <- A[A$Value1 == max(A$Value1), ]
        cnt <- cnt + 1
      }
    }

but it takes way too long. Is there a more efficient way to do that? Maybe using ddply or with.

Please post a sample of the data, and the desired result

– Rich Scriven, Commented Aug 1, 2015 at 15:50

SabDeM · Accepted Answer · 2015-08-01 16:24:27Z

A base R solution with aggregate:

prov <- aggregate(. ~ Year + ID, data = dat, FUN = max)
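Note that aggregate with FUN = max takes the maximum of each non-grouping column separately, so the values in one output row can come from different input rows. If the whole row holding the maximum Value1 is needed, as the question asks, one possible base R sketch is below. The toy values are made up for illustration; only the column names come from the question.

```r
# Toy data using the question's column names (values are invented)
df <- data.frame(
  ID     = c(1, 1, 1, 2, 2),
  Year   = c(1982, 1982, 1983, 1982, 1982),
  Value1 = c(0.2, 0.9, 0.5, 0.7, 0.1),
  Value2 = c(10, 20, 30, 40, 50),
  Value3 = c("a", "b", "c", "d", "e")
)

# Sort so that, within each (Year, ID) group, the largest Value1 comes first
ord <- df[order(df$Year, df$ID, -df$Value1), ]

# Keep the first row of each (Year, ID) group, i.e. the row holding the maximum
max_rows <- ord[!duplicated(ord[c("Year", "ID")]), ]
rownames(max_rows) <- NULL
max_rows
```

This sorts once and then makes a single pass with duplicated, instead of subsetting the full data frame for every Year and ID pair. If several rows tie for the maximum, only one of them is kept.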


SabDeM Over a year ago

@Liza you are welcome. Consider accepting the answer that fits your goals, and next time, if you need to add clarifications, please edit your post rather than posting an answer (unless, of course, you find the solution to your problem yourself).

Rorschach · Accepted Answer · 2015-08-01 16:08:58Z

You can use dplyr:

    library(dplyr)
    dat %>%
      group_by(ID, Year) %>%
      summarise(mval = max(Value1)) -> result

or plyr, keeping all the other columns (and repeating max Value1 as mval):

    library(plyr)
    ddply(dat, .(ID, Year), function(x) {
      # keep the row with the largest Value1 and copy that value into mval
      transform(x[which.max(x$Value1), ], mval = Value1)
    }, .drop = FALSE)
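The dplyr pipeline above returns only ID, Year and mval. A minimal sketch of keeping the whole row in dplyr as well, assuming dplyr 1.0.0 or later for slice_max(); the Value2 column and the toy values are made up for illustration:

```r
library(dplyr)

# Toy data; Value2 stands in for the "other columns" to be kept
dat <- data.frame(
  ID     = c(1, 1, 2, 2, 2),
  Year   = c(1995, 1995, 1995, 1996, 1996),
  Value1 = c(0.3, 0.8, 0.5, 0.4, 0.6),
  Value2 = c("a", "b", "c", "d", "e")
)

result <- dat %>%
  group_by(ID, Year) %>%
  # keep exactly one row per group: the one with the largest Value1
  slice_max(Value1, n = 1, with_ties = FALSE) %>%
  ungroup()

result
```

Setting with_ties = FALSE guarantees a single row per group; leave it at the default if all tied rows should be returned.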

Data

    dat <- data.frame(ID = sample(1:10, 100, replace = TRUE),
                      Year = sample(1995:2000, 100, replace = TRUE),
                      Value1 = runif(100))

Or with data.table:

    library(data.table)
    setDT(dat)[, list(mval = max(Value1)), .(ID, Year)]
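The data.table call above also returns only the group keys and mval. One common way to keep every column of the row holding the maximum is to collect row indices with .I and subset by them. This is a sketch on made-up toy data, with Value2 as a hypothetical extra column:

```r
library(data.table)

# Toy data; Value2 stands in for the "other columns" to be kept
dt <- data.table(
  ID     = c(1, 1, 2, 2, 2),
  Year   = c(1995, 1995, 1995, 1996, 1996),
  Value1 = c(0.3, 0.8, 0.5, 0.4, 0.6),
  Value2 = c("a", "b", "c", "d", "e")
)

# .I holds the row numbers in dt; pick the one with the largest Value1 per group
idx <- dt[, .I[which.max(Value1)], by = .(ID, Year)]$V1

# Subset the original table by those row numbers to keep all columns
result <- dt[idx]
result
```

On a table with millions of rows, this grouped index lookup is likely to be much faster than the double for loop in the question, although the actual gain depends on the data.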
