0

I have a dataframe, DATA, containing information on students and their test dates. I want to create a variable called WANT, where for all the STUDENTs you count the unique months (not the unique rows) as shown in the sample of the WANT variable below:

library(dplyr) set.seed(0) DATA <- data.frame("STUDENT" = sample(1:5, 100, r = T), "TESTDATE" = sample(seq(as.Date('2010/01/01'), as.Date('2010/12/31'), by="day"), 100, r=T)) DATA <- DATA %>% arrange(STUDENT, TESTDATE) DATA$WANT <- c(1,1,1,2,2,3,3,4,4,5,5,6,7,7,8,8,9,1,1,1,2,3,3,4,5,5,6,7,8,8,9,10,10, rep(NA, 67)) 

My attempt only does rows and it's not what I wish for

DATA %>% group_by(STUDENT) %>% mutate(WANT = 1:n()) 
0

2 Answers 2

1

We may extract the month part and use match or as.integer(factor(WANT2 levels = unique(WANT2)))

library(dplyr) out <- DATA %>% group_by(STUDENT) %>% mutate(WANT2 = as.integer(format(TESTDATE, '%m')), WANT2 = match(WANT2, unique(WANT2))) %>% ungroup 

-output

> head(out, 22) %>% as.data.frame STUDENT TESTDATE WANT WANT2 1 1 2010-01-20 1 1 2 1 2010-01-31 1 1 3 1 2010-02-10 2 2 4 1 2010-02-10 2 2 5 1 2010-03-27 3 3 6 1 2010-04-20 4 4 7 1 2010-04-21 4 4 8 1 2010-05-02 5 5 9 1 2010-05-06 5 5 10 1 2010-05-13 5 5 11 1 2010-05-20 5 5 12 1 2010-06-17 6 6 13 1 2010-08-22 7 7 14 1 2010-08-25 7 7 15 1 2010-08-27 7 7 16 1 2010-08-30 7 7 17 1 2010-09-06 8 8 18 1 2010-09-30 8 8 19 1 2010-10-27 9 9 20 1 2010-10-31 9 9 21 1 2010-12-10 10 10 22 1 2010-12-21 10 10 

If we want the year-month to count as separate, then do

out <- DATA %>% group_by(STUDENT) %>% mutate(WANT2 = format(TESTDATE, '%Y-%m'), WANT2 = match(WANT2, unique(WANT2))) %>% ungroup 
Sign up to request clarification or add additional context in comments.

10 Comments

i forget to add the seed!
but i don't want to extract just month because sometimes they start at month 7 and i want it to be 1
yes i update it now
what is difference between WANT, and WANT2
i notice a problem if a STUDENT is in for more than 12 months then it goes only up to 12. for example imagine if they are there every month of 2010 and then 01/2011 it doesn't count it as 13
|
1

You can use data.table::rleid:

library(data.table) setDT(DATA)[, WANT:=rleid(month(TESTDATE)), STUDENT] 

Output:

 STUDENT TESTDATE WANT <int> <Date> <num> 1: 1 2010-01-20 1 2: 1 2010-01-31 1 3: 1 2010-02-10 2 4: 1 2010-02-10 2 5: 1 2010-03-27 3 6: 1 2010-04-20 4 7: 1 2010-04-21 4 8: 1 2010-05-02 5 9: 1 2010-05-06 5 10: 1 2010-05-13 5 11: 1 2010-05-20 5 12: 1 2010-06-17 6 13: 1 2010-08-22 7 14: 1 2010-08-25 7 15: 1 2010-08-27 7 16: 1 2010-08-30 7 17: 1 2010-09-06 8 18: 1 2010-09-30 8 19: 1 2010-10-27 9 20: 1 2010-10-31 9 21: 1 2010-12-10 10 22: 1 2010-12-21 10 23: 2 2010-01-19 1 24: 2 2010-02-05 2 25: 2 2010-02-09 2 26: 2 2010-02-18 2 27: 2 2010-02-19 2 28: 2 2010-03-16 3 29: 2 2010-03-18 3 30: 2 2010-04-02 4 31: 2 2010-04-12 4 32: 2 2010-04-26 4 33: 2 2010-05-04 5 34: 2 2010-05-10 5 35: 2 2010-07-31 6 36: 2 2010-09-04 7 37: 2 2010-09-22 7 38: 2 2010-10-13 8 39: 2 2010-10-24 8 40: 2 2010-11-02 9 41: 2 2010-11-24 9 42: 2 2010-12-21 10 43: 2 2010-12-22 10 44: 2 2010-12-28 10 45: 3 2010-01-16 1 46: 3 2010-03-08 2 47: 3 2010-04-15 3 48: 3 2010-04-23 3 49: 3 2010-04-27 3 50: 3 2010-05-09 4 51: 3 2010-05-18 4 52: 3 2010-06-01 5 53: 3 2010-10-27 6 54: 3 2010-11-01 7 55: 3 2010-11-05 7 56: 3 2010-11-12 7 57: 3 2010-11-24 7 58: 3 2010-12-22 8 59: 3 2010-12-27 8 60: 3 2010-12-30 8 61: 4 2010-01-14 1 62: 4 2010-01-27 1 63: 4 2010-03-05 2 64: 4 2010-03-06 2 65: 4 2010-03-06 2 66: 4 2010-03-14 2 67: 4 2010-03-19 2 68: 4 2010-04-08 3 69: 4 2010-06-09 4 70: 4 2010-06-23 4 71: 4 2010-07-13 5 72: 4 2010-07-26 5 73: 4 2010-08-06 6 74: 4 2010-09-28 7 75: 4 2010-09-30 7 76: 4 2010-11-23 8 77: 4 2010-12-15 9 78: 4 2010-12-24 9 79: 4 2010-12-25 9 80: 5 2010-01-15 1 81: 5 2010-01-16 1 82: 5 2010-02-20 2 83: 5 2010-02-22 2 84: 5 2010-03-13 3 85: 5 2010-04-26 4 86: 5 2010-05-01 5 87: 5 2010-05-07 5 88: 5 2010-05-10 5 89: 5 2010-05-15 5 90: 5 2010-07-31 6 91: 5 2010-08-07 7 92: 5 2010-08-17 7 93: 5 2010-09-03 8 94: 5 2010-09-11 8 95: 5 2010-10-21 9 96: 5 2010-11-01 10 97: 5 2010-11-02 10 98: 5 2010-11-03 10 99: 5 2010-11-05 10 100: 5 2010-12-02 11 STUDENT TESTDATE WANT 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.