I have this data.

group  name
1      A
1      A
1      A
1      B
1      C
2      A
2      B
3      A
3      B
3      C
3      D

I would like to filter the groups against a standard set of names. For example, I would like to filter the groups whose names all fall inside {A, B, C}.

Group 1 would be filtered because its unique names, {A, B, C}, are a subset of {A, B, C}.

Group 2 would be filtered because {A, B} is a subset of {A, B, C}.

However, Group 3 would not be filtered because {A, B, C, D} is not a subset of {A, B, C}.

How should I approach this issue? Additionally, I have more standards (e.g., {A, B, C}, {A, C}, ...).

structure(
  list(
    group = c(1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3),
    name = c("A", "A", "A", "B", "C", "A", "B", "A", "B", "C", "D")
  ),
  row.names = c(NA, -11L),
  class = c("tbl_df", "tbl", "data.frame")
)
  • Do you need df1 %>% group_by(group) %>% filter(all(unique(name) %in% c("A", "B", "C"))), or its opposite, df1 %>% group_by(group) %>% filter(!all(unique(name) %in% c("A", "B", "C")))? Commented Jan 6, 2020 at 23:29
  • What is your expected output when you have more standards ? Commented Jan 7, 2020 at 0:25
  • In the below answer, 'filter(!(all(stdvec2 %in% name) | all(stdvec1 %in% name)))' is the right approach! Commented Jan 7, 2020 at 1:33

1 Answer

We can specify the standard vector and do a group_by filter:

library(dplyr)
stdvec <- c("A", "B", "C")
df1 %>%
  group_by(group) %>%
  filter(all(unique(name) %in% stdvec))

and its reverse

df1 %>% group_by(group) %>% filter(!all(unique(name) %in% stdvec)) 
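
To make the behaviour of these two filters concrete, here is a small self-contained sketch that rebuilds the question's data and runs both. The names kept and dropped are only illustrative, and the ungroup() calls are an addition so the results are plain tibbles.

```r
library(dplyr)

# Data from the question, rebuilt with tibble() (re-exported by dplyr)
df1 <- tibble(
  group = c(1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3),
  name  = c("A", "A", "A", "B", "C", "A", "B", "A", "B", "C", "D")
)
stdvec <- c("A", "B", "C")

# Groups whose names all fall inside stdvec
kept <- df1 %>%
  group_by(group) %>%
  filter(all(unique(name) %in% stdvec)) %>%
  ungroup()

# Groups with at least one name outside stdvec
dropped <- df1 %>%
  group_by(group) %>%
  filter(!all(unique(name) %in% stdvec)) %>%
  ungroup()

unique(kept$group)     # 1 2
unique(dropped$group)  # 3
```

Group 3 is the only one containing "D", so it is the only group the subset test rejects.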

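If there are more vectors, it could be the following. Note that this checks the other direction: it keeps the groups whose names contain every element of each vector.

stdvec1 <- c("A", "B", "C")
stdvec2 <- c("A", "C")
df1 %>%
  group_by(group) %>%
  filter(all(stdvec2 %in% name) & all(stdvec1 %in% name))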

and its reverse

df1 %>% group_by(group) %>% filter(!(all(stdvec2 %in% name) & all(stdvec1 %in% name))) 

Or it could be a union of the multiple vectors, compared with the unique values of 'name' to check whether all are included (negated with ! here):

df1 %>% group_by(group) %>% filter(!all(unique(name) %in% union(stdvec1, stdvec2))) 

and

 df1 %>% group_by(group) %>% filter(all(unique(name) %in% union(stdvec1, stdvec2))) 
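
One caveat with the union approach, sketched below: a group can lie inside the union of the standards without lying inside any single standard. With stdvec1 and stdvec2 above the two readings coincide, because {A, C} is itself a subset of {A, B, C}. The standards and data in this sketch (std_list, df2) are hypothetical and chosen only so that the two readings disagree.

```r
library(dplyr)

# Hypothetical standards chosen so that the two readings disagree
std_list <- list(c("A", "B"), c("C", "D"))

# Hypothetical data: group 1 has {A, B}, group 2 has {A, C}
df2 <- tibble(
  group = c(1, 1, 2, 2),
  name  = c("A", "B", "A", "C")
)

# Reading 1: names must lie inside the union of all standards
in_union <- df2 %>%
  group_by(group) %>%
  filter(all(name %in% Reduce(union, std_list))) %>%
  ungroup()

# Reading 2: names must lie inside at least one single standard
in_one <- df2 %>%
  group_by(group) %>%
  filter(any(vapply(std_list, function(s) all(name %in% s), logical(1)))) %>%
  ungroup()

unique(in_union$group)  # 1 2
unique(in_one$group)    # 1
```

Which reading is appropriate depends on what the standards mean; the question does not say, so this is only a way to express either one.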

If there are many vectors, use reduce() from purrr to take their union:

library(purrr)
# Collect every object named stdvec1, stdvec2, ... and union them
nm1 <- mget(ls(pattern = "^stdvec\\d+$")) %>%
  reduce(union)
df1 %>%
  group_by(group) %>%
  filter(all(unique(name) %in% nm1))

3 Comments

Thanks. If I have more stdvec (standards), how should I apply that? For example, stdvec1 <- c("A", "B", "C") stdvec2 <- c("A", "C")
Thanks. What I do is 'filter(!(all(stdvec2 %in% name) | all(stdvec1 %in% name))) ' format, but there are 30 standards so I should expand the '|' 30 times. Is there any solution?
@Juhyeon No, that can be done easily: lst1 <- mget(ls(pattern = "^stdvec\\d+$")); df1 %>% group_by(group) %>% filter(map(lst1, ~ all(.x %in% name)) %>% reduce(`|`) %>% `!`)
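
As a variation on the comment above, the standards can be kept in a list from the start instead of being collected from the workspace with mget(). This is only a sketch: the list name standards and the result name res are illustrative, and map_lgl() with any() is swapped in for the map() plus reduce() chain. It drops every group that contains all elements of at least one standard.

```r
library(dplyr)
library(purrr)

# Data from the question
df1 <- tibble(
  group = c(1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3),
  name  = c("A", "A", "A", "B", "C", "A", "B", "A", "B", "C", "D")
)

# All standards in one list; it could just as well hold 30 entries
standards <- list(c("A", "B", "C"), c("A", "C"))

# Keep a group only if it fully contains none of the standards
res <- df1 %>%
  group_by(group) %>%
  filter(!any(map_lgl(standards, ~ all(.x %in% name)))) %>%
  ungroup()

unique(res$group)  # 2
```

Groups 1 and 3 both contain A, B and C, so only group 2 remains.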
