I am having a hard time working with the dplyr library. I have been trying to implement a relatively simple piece of code, but for some reason, when I group by one variable and sum to get the total for that variable, I get only NA values. Here are my files:
https://www.dropbox.com/sh/zhxfj6cm6gru0t1/AAA-DgeTrngJ0md12W2bEzi0a
And this is my code:
    library(dplyr)

    #we set the working directory
    setwd("~/asado/R/emp")

    ##we list the files
    list.files()

    ##we load the csv files
    emp1 <- read.csv("AI_EMP_CT_A.csv", sep=',')
    ##emp1 contains employment information for US counties with naics classification
    ##empva is another part of the same dataset
    empva <- read.csv("AI_EMP_CT_VA_A.csv", sep=',')

    ##we merge our files, they have the same dimensions so rbind works
    emp <- data.frame(rbind(emp1, empva))

    ##we create a variable to summarize our data
    ##and make sure it is stored as character
    emp$naics <- as.character(substring(emp$Mnemonic,3,6))

    ##we try to summarize by the variable naics, summing for Dec.2013
    useemp<- emp%.% group_by(naics) %.% summarize(total=sum(Dec.2013, na.rm=T))

    ##the resulting dataframe shows NA
    head(useemp)

Any idea what's going on?
na.rm not rm.na.

Update dplyr to the latest version (where %>% replaced %.%, although it can still be used) and use dplyr::summarize(total=sum(Dec.2013, na.rm=T)) to make sure you're not in conflict with plyr. Does that change anything?
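The suggestion above can be tried on a small example before running it on the full files. This is only a minimal sketch: the data frame below is made up for illustration (the Mnemonic values and counts are hypothetical, not taken from the linked CSV files), and it assumes a dplyr version that provides %>%.

```r
library(dplyr)

# Hypothetical stand-in for the merged data; the real CSV files are not used here.
emp <- data.frame(
  Mnemonic = c("FE1111A", "FE1111B", "FE2222A", "FE2222B"),
  Dec.2013 = c(10, NA, 5, 7),
  stringsAsFactors = FALSE
)

# Same grouping key as in the question: characters 3 to 6 of Mnemonic.
emp$naics <- substring(emp$Mnemonic, 3, 6)

# Namespace-qualified summarize, so a plyr function of the same name
# cannot be picked up by accident; na.rm = TRUE drops the missing value.
useemp <- emp %>%
  group_by(naics) %>%
  dplyr::summarize(total = sum(Dec.2013, na.rm = TRUE))

print(useemp)
```

If this toy version gives one total per naics group but the real data still returns NA, the difference is more likely to be in the data or in the loaded packages than in the pipeline itself.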