
I'm working with {tidyverse} in R and I would like to compute a per-group ratio, which I'm finding somewhat complicated.

> col_vict %>%
+   select(alcohol_involved, victim_degree_of_injury) %>%
+   mutate(alcohol_involved = as.factor(ifelse(is.na(alcohol_involved), "NO", "YES"))) %>%
+   table() %>%
+   as.data.table() %>%
+   group_by(victim_degree_of_injury)
# A tibble: 10 x 3
# Groups:   victim_degree_of_injury [5]
   alcohol_involved victim_degree_of_injury     N
   <chr>            <chr>                   <int>
 1 NO               complaint of pain       16516
 2 YES              complaint of pain        1331
 3 NO               killed                    168
 4 YES              killed                    122
 5 NO               no injury               22860
 6 YES              no injury                1905
 7 NO               other visible injury     4778
 8 YES              other visible injury     1102
 9 NO               severe injury             752
10 YES              severe injury             315

For each victim_degree_of_injury, I would like to compute the ratio of N where alcohol_involved == "YES" to N where alcohol_involved == "NO".

Here's the dput() of what I was working with:

structure(list(
  alcohol_involved = c("NO", "YES", "NO", "YES", "NO", "YES", "NO", "YES", "NO", "YES"),
  victim_degree_of_injury = c("complaint of pain", "complaint of pain",
    "killed", "killed", "no injury", "no injury",
    "other visible injury", "other visible injury",
    "severe injury", "severe injury"),
  N = c(16516L, 1331L, 168L, 122L, 22860L, 1905L, 4778L, 1102L, 752L, 315L)),
  class = "data.frame", row.names = c(NA, -10L))

2 Answers

library(dplyr)

df %>%
  group_by(victim_degree_of_injury) %>%
  summarize(ratio = N[alcohol_involved == "YES"] / N[alcohol_involved == "NO"])
# # A tibble: 5 x 2
#   victim_degree_of_injury  ratio
#   <chr>                    <dbl>
# 1 complaint of pain       0.0806
# 2 killed                  0.726
# 3 no injury               0.0833
# 4 other visible injury    0.231
# 5 severe injury           0.419
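As a possible alternative, here is a minimal sketch that reshapes the counts to one row per injury category before dividing. It assumes {tidyr} is installed (it is not used in the answer above) and that `df` is the data frame from the question's dput(), rebuilt here so the snippet runs on its own. The name `ratios` is only illustrative.

```r
library(dplyr)
library(tidyr)  # assumption: tidyr is available; not part of the original answer

# Rebuild the data frame from the question's dput() so the example is self-contained
df <- data.frame(
  alcohol_involved = rep(c("NO", "YES"), times = 5),
  victim_degree_of_injury = rep(
    c("complaint of pain", "killed", "no injury",
      "other visible injury", "severe injury"),
    each = 2
  ),
  N = c(16516L, 1331L, 168L, 122L, 22860L, 1905L, 4778L, 1102L, 752L, 315L)
)

# Spread the YES/NO counts into two columns, then divide them row by row
ratios <- df %>%
  pivot_wider(names_from = alcohol_involved, values_from = N) %>%
  mutate(ratio = YES / NO)

ratios
```

One reason to consider this form: if an injury category had no YES (or no NO) row, `pivot_wider()` would fill that cell with NA by default, so the ratio for that category would come out as NA and the category would stay visible in the result.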

2 Comments

Wow, that is almost exactly the same as the answer here, but I guess I couldn't figure out how to adapt it to my particular circumstance. Thank you!
Good for you for doing research and connecting those dots! But yeah, it is exactly the same except for the names of the columns.

In base R, if the structure is maintained such that every injury category always has both a YES and a NO row, then you could do:

aggregate(N ~ victim_degree_of_injury, df[order(df$alcohol_involved), ],
          function(x) x[2] / x[1])
  victim_degree_of_injury          N
1       complaint of pain 0.08058852
2                  killed 0.72619048
3               no injury 0.08333333
4    other visible injury 0.23064044
5           severe injury 0.41888298
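The `aggregate()` call above depends on row position: after sorting, NO is `x[1]` and YES is `x[2]`. A sketch of a base R variant that looks the counts up by name instead could use `xtabs()`. This is only an illustration, not part of the original answer; `counts` and `ratio` are made-up names, and `df` is rebuilt from the question's dput() so the snippet is self-contained.

```r
# Rebuild the data frame from the question's dput()
df <- data.frame(
  alcohol_involved = rep(c("NO", "YES"), times = 5),
  victim_degree_of_injury = rep(
    c("complaint of pain", "killed", "no injury",
      "other visible injury", "severe injury"),
    each = 2
  ),
  N = c(16516L, 1331L, 168L, 122L, 22860L, 1905L, 4778L, 1102L, 752L, 315L)
)

# Cross-tabulate: rows are injury categories, columns are NO / YES, cells sum N
counts <- xtabs(N ~ victim_degree_of_injury + alcohol_involved, data = df)

# Select the columns by name, so the result does not depend on row order
ratio <- counts[, "YES"] / counts[, "NO"]
ratio
```

Note that `xtabs()` fills a missing YES/NO combination with 0 rather than NA, so a category lacking one of the two rows would produce 0, Inf, or NaN here instead of an error.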
