How to sum over diagonals of data frame

Question

Say that I have this data frame:

 1 2 3 4 100 8 12 5 14 99 1 6 4 3 98 2 5 4 11 97 5 3 7 2

In this above data frame, the values indicate counts of how many observations take on (100, 1), (99, 1), etc.

In my context, the diagonals have the same meanings:

 1 2 3 4 100 A B C D 99 B C D E 98 C D E F 97 D E F G

How would I sum across the diagonals (i.e., sum the counts of the like letters) in the first data frame?

This would produce:

group sum A 8 B 13 C 13 D 28 E 10 F 18 G 2

For example, D is 5+5+4+14

Is this a matrix or a data.frame? (A matrix is easier to perform this on) — David Robinson
– David Robinson, Commented Apr 29, 2015 at 23:52
data.frame, but converting it to a matrix and back to a data.frame as in @Ben Bolker's answer does the trick. — bill999
– bill999, Commented Apr 30, 2015 at 0:00

Ben Bolker · Accepted Answer · 2015-04-30 02:18:11Z

You can use row() and col() to identify row/column relationships.

m <- read.table(text=" 1 2 3 4 100 8 12 5 14 99 1 6 4 3 98 2 5 4 11 97 5 3 7 2") vals <- sapply(2:8, function(j) sum(m[row(m)+col(m)==j]))

or (as suggested in comments by ?@thelatemail)

vals <- sapply(split(as.matrix(m), row(m) + col(m)), sum) data.frame(group=LETTERS[seq_along(vals)],sum=vals)

or (@Frank)

data.frame(vals = tapply(as.matrix(m), (LETTERS[row(m) + col(m)-1]), sum))

as.matrix() is required to make split() work correctly ...

What is the logic for why one needs to convert it to a matrix (instead of leaving it in data.frame) in order to do this?
@BenBolker - row and col work on all "matrix-like" objects with 2 dimensions incl. matrices, data.frames, tables etc.
Another very similar one: data.frame(vals = tapply(as.matrix(m), (LETTERS[row(m) + col(m)-1]), sum))

thelatemail · Accepted Answer · 2015-04-30 00:40:19Z

Another aggregate variation, avoiding the formula interface, which actually complicates matters in this instance:

aggregate(list(Sum=unlist(dat)), list(Group=LETTERS[c(row(dat) + col(dat))-1]), FUN=sum) # Group Sum #1 A 8 #2 B 13 #3 C 13 #4 D 28 #5 E 10 #6 F 18 #7 G 2

cryo111 · Accepted Answer · 2021-09-06 18:09:36Z

Another solution using bgoldst's definition of df1 and df2

sapply(unique(c(as.matrix(df2))), function(x) sum(df1[df2 == x]))

Gives

#A B C D E F G #8 13 13 28 10 18 2

(Not quite the format that you wanted, but maybe it's ok...)

Forgot to mention that my solution assumes that you have set options(stringsAsFactors=FALSE).

bgoldst · Accepted Answer · 2015-04-29 23:57:12Z

Here's a solution using stack(), and aggregate(), although it requires the second data.frame contain character vectors, as opposed to factors (could be forced with lapply(df2,as.character)):

df1 <- data.frame(a=c(8,1,2,5), b=c(12,6,5,3), c=c(5,4,4,7), d=c(14,3,11,2) ); df2 <- data.frame(a=c('A','B','C','D'), b=c('B','C','D','E'), c=c('C','D','E','F'), d=c('D','E','F','G'), stringsAsFactors=F ); aggregate(sum~group,data.frame(sum=stack(df1)[,1],group=stack(df2)[,1]),sum); ## group sum ## 1 A 8 ## 2 B 13 ## 3 C 13 ## 4 D 28 ## 5 E 10 ## 6 F 18 ## 7 G 2

Collectives™ on Stack Overflow

How to sum over diagonals of data frame

4 Answers 4

3 Comments

Comments

1 Comment

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

3 Comments

Comments

1 Comment

Comments

Linked

Related