I have a data frame in long format that covers multiple cities. Each city has data for each month and for each code (codes range from 100 to 1,000). My data frame looks like this:

Code City month Data
100 A 10 0
100 B 12 1
100 A 10 2
100 B 12 3
100 A 10 4
100 B 12 5
200 A 10 10
200 B 12 11
200 A 10 12
200 B 12 13
200 A 10 14
200 B 12 15

I'm trying to create a new variable that sums the Data variable for each month, using only the rows where Code is equal to 100. So for month 10 the result would be 6, and for month 12 it would be 9:

newvar
6
9
6
9
6
9
6
9
6
9
6
9

For this I'm using dplyr:

df <- df %>% group_by(month) %>% mutate(newvar =case_when(Code==100 ~ as.integer(rowSums(select_(., "Data"), na.rm = TRUE)))) 

However, I'm getting an error and haven't been able to create this new variable correctly. I know base R might be an easier route, but I want to use dplyr.

Any help is really appreciated!

2 Answers

You can sum the Data values only where Code == 100 within each month.

library(dplyr)

df %>%
  group_by(month) %>%
  mutate(newvar = sum(Data[Code == 100], na.rm = TRUE)) %>%
  ungroup()
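As a quick check, the same pipeline can be run against the sample data from the question. This is only a minimal sketch: the `df` construction is reconstructed from the posted table, and `result` is an illustrative name that does not appear in the original post.

```r
library(dplyr)

# Sample data reconstructed from the table in the question (an assumption
# about the real column types; Data is stored as integer here)
df <- data.frame(
  Code  = rep(c(100, 200), each = 6),
  City  = rep(c("A", "B"), times = 6),
  month = rep(c(10, 12), times = 6),
  Data  = c(0:5, 10:15)
)

result <- df %>%
  group_by(month) %>%
  # Within each month, keep only the Data values whose Code is 100, then sum
  mutate(newvar = sum(Data[Code == 100], na.rm = TRUE)) %>%
  ungroup()

result$newvar  # 6 for every month-10 row, 9 for every month-12 row
```

Note that rowSums() adds across columns within each row, so applied to the single Data column it just returns Data unchanged. sum() is the function that aggregates down the rows of each group, which is what the question asks for.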

We can also use case_when inside sum:

library(dplyr)

df %>%
  group_by(month) %>%
  mutate(newvar = sum(case_when(Code == 100 ~ Data), na.rm = TRUE))
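The reason this works is that case_when returns NA for any row that matches none of its conditions, so na.rm = TRUE is what drops the rows with other codes. A small sketch on toy vectors (the values and the names `masked` and `total` are made up for illustration):

```r
library(dplyr)

# Toy vectors, not taken from the question
Code <- c(100, 200, 100)
Data <- c(1L, 10L, 3L)

# Rows where Code != 100 match no condition and become NA
masked <- case_when(Code == 100 ~ Data)

# na.rm = TRUE discards those NAs before summing
total <- sum(masked, na.rm = TRUE)

masked       # 1 NA 3
total        # 4
sum(masked)  # NA, because na.rm is left at its default of FALSE
```

Leaving out na.rm = TRUE would therefore make newvar NA for any month that contains a code other than 100.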
