I have a data.frame as below:

    PRODUCT <- c(rep("A", 4), rep("B", 2))
    ww1 <- c(201438, 201440, 201444, 201446, 201411, 201412)
    ww2 <- ww1 - 6
    DIFF <- rep(6, 6)
    DEMAND <- rep(100, 6)
    df <- data.frame(PRODUCT, ww1, ww2, DIFF, DEMAND)
    df <- df[with(df, order(PRODUCT, ww1)), ]
    df

      PRODUCT    ww1    ww2 DIFF DEMAND
    1       A 201438 201432    6    100
    2       A 201440 201434    6    100
    3       A 201444 201438    6    100
    4       A 201446 201440    6    100
    5       B 201411 201405    6    100
    6       B 201412 201406    6    100

I want to add rows to it based upon the conditions below.
For any row in the data, if the PRODUCT on the previous row is the same as the PRODUCT on the current row, but the ww1 on the previous row is not equal to ww1 - 1 on the current row (that is, consecutive ww1 values should differ by exactly 1, and here they do not), then add a new row.
For the newly added row:

- PRODUCT will be the same as PRODUCT on the previous row
- ww1 will be ww1 on the previous row + 1
- ww2 will be ww2 on the previous row + 1
- WW_DIFF will be 6
- DEMAND will be 0

The final output that I need is something like below:
    PRODUCT    ww1    ww2 WW_DIFF DEMAND
          A 201438 201432       6    100
          A 201439 201433       6      0
          A 201440 201434       6    100
          A 201441 201435       6      0
          A 201442 201436       6    100
          A 201443 201437       6      0
          A 201444 201438       6    100
          A 201445 201439       6      0
          A 201446 201440       6    100
          B 201411 201405       6    100
          B 201412 201406       6    100

As of now I am thinking of writing a macro in Excel, but it will be very slow, so I would prefer an R solution.
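To make the rule concrete, here is a rough base R sketch of the logic described above. It is only an illustration under stated assumptions: `fill_gaps` is a hypothetical helper name, the data frame is assumed to contain exactly the columns PRODUCT, ww1, ww2, DIFF and DEMAND, and ww1 is treated as a plain integer counter (so a year rollover such as 201452 to 201501 is not handled). Note that applying the rule literally gives DEMAND 0 for every filled-in week.

```r
# Hypothetical helper: insert the missing ww1 rows within each PRODUCT.
# Assumes columns PRODUCT, ww1, ww2, DIFF, DEMAND and integer-like ww1 values.
fill_gaps <- function(df) {
  df <- df[order(df$PRODUCT, df$ww1), ]
  prev <- seq_len(nrow(df) - 1)

  # Number of missing weeks between each row and the next row of the same product
  same <- df$PRODUCT[prev] == df$PRODUCT[prev + 1]
  gap <- ifelse(same, df$ww1[prev + 1] - df$ww1[prev] - 1, 0)
  gap <- pmax(gap, 0)

  idx <- rep(prev, gap)   # the earlier row each new row is derived from
  off <- sequence(gap)    # offsets 1, 2, ... inside each gap

  new_rows <- data.frame(
    PRODUCT = df$PRODUCT[idx],
    ww1     = df$ww1[idx] + off,
    ww2     = df$ww2[idx] + off,
    DIFF    = rep(6, length(idx)),
    DEMAND  = rep(0, length(idx))
  )

  out <- rbind(df, new_rows)   # columns are matched by name
  out <- out[order(out$PRODUCT, out$ww1), ]
  rownames(out) <- NULL
  out
}

# Sample data from the question
df <- data.frame(
  PRODUCT = c(rep("A", 4), rep("B", 2)),
  ww1     = c(201438, 201440, 201444, 201446, 201411, 201412)
)
df$ww2 <- df$ww1 - 6
df$DIFF <- 6
df$DEMAND <- 100

res <- fill_gaps(df)
```

Because every column is referenced by name, the column order of the input should not matter in this sketch.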
update1===============================
How could I add a column seq? That column should be 1 for the earliest ww1 of every product and then increment by 1 for each later row of that product.
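A per-product counter of this kind could probably be built with `ave()` and `rank()` in base R. The sketch below is only an illustration: the small data frame `toy` is made up, and it assumes ww1 has no duplicates within a product.

```r
# Made-up example data, deliberately not sorted by ww1
toy <- data.frame(
  PRODUCT = c("A", "A", "A", "B", "B"),
  ww1     = c(201440, 201438, 201439, 201412, 201411)
)

# Rank ww1 within each PRODUCT: 1 for the earliest week, then 2, 3, ...
toy$seq <- ave(toy$ww1, toy$PRODUCT,
               FUN = function(x) rank(x, ties.method = "first"))
```

For my data, the output I need is: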
    PRODUCT    ww1    ww2 WW_DIFF DEMAND seq
          A 201438 201432       6    100   1
          A 201439 201433       6      0   2
          A 201440 201434       6    100   3
          A 201441 201435       6      0   4
          A 201442 201436       6    100   5
          A 201443 201437       6      0   6
          A 201444 201438       6    100   7
          A 201445 201439       6      0   8
          A 201446 201440       6    100   9
          B 201411 201405       6    100   1
          B 201412 201406       6    100   2

update2=======================================================
I am posting this question again. I un-accepted the previously accepted answer by alistaire because it does not work on my original data; it only works on a small sample of the data :(
In the solution below by user alistaire, the line `df3 <- right_join(df, data.frame(ww1=ww1big))` seems to be causing the issue.
In a final solution, I would also prefer if columns are specified by their names. That way I won't be forced to arrange columns in a particular order.
`df` ordered first by `PRODUCT` and then by `ww1`?