I have this data frame (but imagine it with many more columns/variables):

df = data.frame(x = c(0,0,0,1,0), y = c(1,1,1,0,0), z = c(1,1,0,0,1)) 

I want to subset this data frame to the rows where x == 1 and at least one of the other columns is 0 (y == 0 or z == 0, and so on for the remaining columns).

I am already familiar with the basic approach that works for small datasets, but I want one that works for datasets with many columns. Thanks.

2 Answers

You can use Reduce(). Adding the logical vectors with + acts as an element-wise OR: TRUE counts as 1 and FALSE as 0, so the sum for a row is > 0 if any of its comparisons is TRUE.

Correspondingly, * acts as an AND, since the product is > 0 only if all of the comparisons are TRUE.

df = data.frame(x = c(0,0,0,1,0), y = c(1,1,1,0,0), z = c(1,1,0,0,1))
nms <- names(df)
# take all variables except for `x`
nms_rel <- setdiff(nms, "x")
nms_rel
#> [1] "y" "z"
# filter all rows in which `x` is 1 AND any other variable is 0
df[df$x == 1 & Reduce(`+`, lapply(df[nms_rel], `==`, 0)) > 0, ]
#>   x y z
#> 4 1 0 0
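
To make the OR/AND distinction concrete, here is a minimal sketch that builds both masks from the same list of comparisons. The helper names is_zero, any_zero and all_zero are introduced only for illustration and are not part of the answer above.

```r
df <- data.frame(x = c(0,0,0,1,0), y = c(1,1,1,0,0), z = c(1,1,0,0,1))
nms_rel <- setdiff(names(df), "x")

# one logical vector per column: is the value equal to 0?
is_zero <- lapply(df[nms_rel], `==`, 0)

# `+` sums the comparisons per row: > 0 means at least one is TRUE (OR)
any_zero <- Reduce(`+`, is_zero) > 0

# `*` multiplies them per row: > 0 means every one is TRUE (AND)
all_zero <- Reduce(`*`, is_zero) > 0

res_any <- df[df$x == 1 & any_zero, ]  # x is 1 and some other column is 0
res_all <- df[df$x == 1 & all_zero, ]  # x is 1 and all other columns are 0
```

With this example data both filters happen to keep the same single row, because the only row with x equal to 1 has 0 in every other column.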

In base R, you can filter a data frame with subset(). Column names can be used directly inside the condition:

subset(df, x == 1 & (y == 0 | z == 0))
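
If there are too many columns to list by hand, one possible base R variant is to compare all the other columns against 0 at once and count the matches per row with rowSums(). This is only a sketch; the names other_cols and has_zero are made up for the example, and it assumes the columns contain no NA values (otherwise the comparison yields NA for those rows).

```r
df <- data.frame(x = c(0,0,0,1,0), y = c(1,1,1,0,0), z = c(1,1,0,0,1))

# every column except `x`
other_cols <- setdiff(names(df), "x")

# df[other_cols] == 0 is a logical matrix; rowSums() counts the TRUEs per row
has_zero <- rowSums(df[other_cols] == 0) > 0

# keep rows where `x` is 1 and at least one other column is 0
res <- df[df$x == 1 & has_zero, ]
```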

Another option is to use filter() from the dplyr package:

library(dplyr)
filter(df, x == 1, y == 0 | z == 0)
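
For many columns, the same filter might be written without naming each column by using if_any(). This sketch assumes dplyr 1.0.4 or later, where if_any() is available; the selection -x means "every column except x".

```r
library(dplyr)

df <- data.frame(x = c(0,0,0,1,0), y = c(1,1,1,0,0), z = c(1,1,0,0,1))

# keep rows where `x` is 1 and any of the remaining columns equals 0
res <- filter(df, x == 1, if_any(-x, ~ .x == 0))
```

Swapping if_any() for if_all() would instead require every remaining column to be 0.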
