
I'm trying to drop columns in which more than 90% of the values are NA. I followed the linked post below, but I only get values back rather than a data frame, and I'm not sure what I'm doing wrong. I expected an actual data frame; wrapping the result in as.data.frame() did not work either.

Linked Post: Delete columns/rows with more than x% missing

Example DF

    gene cell1 cell2 cell3
    A    0.4   0.1   NA
    B    NA    NA    0.1
    C    0.4   NA    0.5
    D    NA    NA    0.5
    E    0.5   NA    0.6
    F    0.6   NA    NA

Desired DF

    gene cell1 cell3
    A    0.4   NA
    B    NA    0.1
    C    0.4   0.5
    D    NA    0.5
    E    0.5   0.6
    F    0.6   NA

Code

    # Select genes that have NA values for 90% of a given cell line
    df_col <- df[, 2:ncol(df)]
    df_col <- df_col[, which(colMeans(!is.na(df_col)) > 0.9)]
    df <- cbind(df[, 1], df_col)

2 Answers

5

I would use dplyr here.

If you want to use select() with logical conditions, you are probably looking for the where() selection helper in dplyr. It can be used like this: select(where(condition))

I used an 80% threshold because a 90% threshold would keep every column in this example and would therefore not illustrate the solution as well.

    library(dplyr)

    df %>%
      select(where(~ mean(is.na(.)) < 0.8))

It can also be done with base R and colMeans:

    df[, c(TRUE, colMeans(is.na(df[-1])) < 0.8)]

or with purrr:

    library(purrr)

    df %>%
      keep(~ mean(is.na(.)) < 0.8)

Output:

      gene cell1 cell3
    1    a   0.4    NA
    2    b    NA   0.1
    3    c   0.4   0.5
    4    d    NA   0.5
    5    e   0.5   0.6
    6    f   0.6    NA

Data

    df <- data.frame(
      gene  = letters[1:6],
      cell1 = c(0.4, NA, 0.4, NA, 0.5, 0.6),
      cell2 = c(0.1, rep(NA, 5)),
      cell3 = c(NA, 0.1, 0.5, 0.5, 0.6, NA)
    )
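
As a quick check of the threshold choice, the per-column NA fractions can be computed directly. This is only a minimal sketch on the same example data; the name na_frac is illustrative and not part of the answer's code.

```r
df <- data.frame(
  gene  = letters[1:6],
  cell1 = c(0.4, NA, 0.4, NA, 0.5, 0.6),
  cell2 = c(0.1, rep(NA, 5)),
  cell3 = c(NA, 0.1, 0.5, 0.5, 0.6, NA)
)

# Fraction of NA values in each cell column (the gene column is excluded)
na_frac <- colMeans(is.na(df[-1]))
round(na_frac, 2)
# cell1 cell2 cell3
#  0.33  0.83  0.33

# Columns that each cutoff would keep
names(na_frac)[na_frac < 0.9]  # all three cell columns
names(na_frac)[na_frac < 0.8]  # cell1 and cell3 only
```

Since cell2 is 5/6 (about 83%) NA, it survives a 90% cutoff but is dropped at 80%.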

2 Comments

Thanks, the answers in general are perfect for what I was looking for :D
Keep_em all brother :)
1

Well, cell2 has 83% NA values (5/6), which is below 90%, but anyway you can do this:

    ignore <- 1
    perc <- 0.8 # 80%
    df <- cbind(df[ignore], df[-ignore][colMeans(is.na(df[-ignore])) < perc])
    df
    #   gene cell1 cell3
    # 1    A   0.4    NA
    # 2    B    NA   0.1
    # 3    C   0.4   0.5
    # 4    D    NA   0.5
    # 5    E   0.5   0.6
    # 6    F   0.6    NA
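
The question mentions getting plain values back instead of a data frame. One possible cause, offered as an assumption rather than a confirmed diagnosis, is that df[, j] simplifies to a vector when it selects a single column. The question's code also tests !is.na(), which measures the non-missing fraction, not the NA fraction. The sketch below wraps the same colMeans() idea in a helper; drop_na_cols, max_na and ignore are hypothetical names chosen for illustration.

```r
# Hypothetical helper: drop columns whose NA fraction exceeds `max_na`,
# leaving the columns listed in `ignore` untouched.
drop_na_cols <- function(df, max_na = 0.9, ignore = 1) {
  rest <- df[-ignore]
  keep <- colMeans(is.na(rest)) <= max_na
  # drop = FALSE keeps a data frame even when only one column survives
  cbind(df[ignore], rest[, keep, drop = FALSE])
}

df <- data.frame(
  gene  = LETTERS[1:6],
  cell1 = c(0.4, NA, 0.4, NA, 0.5, 0.6),
  cell2 = c(0.1, rep(NA, 5)),
  cell3 = c(NA, 0.1, 0.5, 0.5, 0.6, NA)
)

res <- drop_na_cols(df, max_na = 0.8)
names(res)  # "gene" "cell1" "cell3"
class(res)  # "data.frame"
```

Indexing with df[ignore] rather than df[, 1] keeps the gene column as a one-column data frame, so cbind() returns a data frame with the original column name.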
