read_excel and NA values

Question

I am trying to import my data using read_excel but I need to interpret any NA string value to missing values but I am stuck

Currently, my data has NAs all over the place and I need them to be blank so that when I run colsums(is.na(data)) it shouldn't show 0s

NA is chr in this table within numeric numbers as you can see in screenshot

data <- read_excel(workbook_path, na = c("")) colSums(is.na(data))

Does this answer your question? Using read_excel(na = ) how do you specify more than one NA character string? — Arthur Yip
– Arthur Yip, Commented Jan 26, 2023 at 0:36
@ArthurYip I tried that code but {r} data <- read_excel(workbook_path, na = c("", "NA")) print(data) colSums(is.na(data)) I keep seeing 0 which means I have NAs still — Issam K
– Issam K, Commented Jan 26, 2023 at 0:37
I just ran sum(is.na(data)) and I get 40 which means my NAs didn't disappear — Issam K
– Issam K, Commented Jan 26, 2023 at 0:39
ahhhhhhh you are right, I was interpreting the question wrong !!!!!!! seeing 40 is the right way. thank you — Issam K
– Issam K, Commented Jan 26, 2023 at 1:12

margusl · Accepted Answer · 2023-01-26 01:07:36Z

Using one of readxl example files for reproducible example, you can open its location by running browseURL(dirname(readxl_example("type-me.xlsx"))), though the sheet looks like this:

library(readxl) library(dplyr) xlsx <- readxl_example("type-me.xlsx") # open file location explorer: # browseURL(dirname(readxl_example("type-me.xlsx"))) # by default blank cells are treated as missing data, note the single <NA>: df <- read_excel(xlsx, sheet = "text_coercion") %>% head(n = 2) df #> # A tibble: 2 × 2 #> text explanation #> <chr> <chr> #> 1 <NA> "empty" #> 2 cabbage "\"cabbage\"" # add "empty" to na vector, note 2 <NA> values: df <- readxl::read_excel(xlsx, sheet = "text_coercion", na = c("", "empty")) %>% head(n = 2) df #> # A tibble: 2 × 2 #> text explanation #> <chr> <chr> #> 1 <NA> <NA> #> 2 cabbage "\"cabbage\"" # to replace all(!) NA values with "" df[is.na(df)] <- "" df #> # A tibble: 2 × 2 #> text explanation #> <chr> <chr> #> 1 "" "" #> 2 "cabbage" "\"cabbage\""

^{Created on 2023-01-26 with reprex v2.0.2}

Note from your screenshot: you have column names in the first row of your dataframe, this breaks data type detection (everything is chr) and you should deal with that first; at that point data[is.na(data)] <- "" will no longer work as you can not write strings to numerical columns. And it's perfectly fine.

Collectives™ on Stack Overflow

read_excel and NA values

1 Answer 1

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Linked

Related