I have this dataset (example):

    library(data.table)

    dt <- data.table(ID = c(1,1,1,2,2,3,4,5,5,5),
                     diagnosis = c("cancer", "cancer", "cancer", "cancer", "cancer",
                                   "cancer", "cancer", "cancer", "cancer", "cancer"),
                     Date = c(2008,2001,2013,2008,2013,2013,2013,2001,2002,2013))

I only want patients with a first diagnosis in 2013, so rows from any other year should be removed from the dataset.
However, a patient should not be included in the new dataset if the patient has a diagnosis in 2008. If the patient has had a diagnosis before 2008, then we will keep them, with their 2013 diagnosis.
So the final dataset will look like this:

    ID diagnosis Date
     3    cancer 2013
     4    cancer 2013
     5    cancer 2013

How can I do this using data.table?
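
For reference, here is a minimal sketch of one way the rule described above could be written in data.table. It is not necessarily the most idiomatic approach. The names `excluded_ids` and `result` are made up for illustration. The sketch assumes that only a diagnosis in 2008 disqualifies a patient, as in the example data; if any year from 2008 through 2012 should disqualify, the first filter would need to be widened.

```r
library(data.table)

# Example data from the question
dt <- data.table(
  ID = c(1, 1, 1, 2, 2, 3, 4, 5, 5, 5),
  diagnosis = rep("cancer", 10),
  Date = c(2008, 2001, 2013, 2008, 2013, 2013, 2013, 2001, 2002, 2013)
)

# Assumption: a patient is excluded only if they have a diagnosis in 2008
excluded_ids <- dt[Date == 2008, unique(ID)]

# Keep only the 2013 rows of patients who are not excluded
result <- dt[Date == 2013 & !(ID %in% excluded_ids)]

print(result)
```

With the example data, patients 1 and 2 are dropped because of their 2008 diagnosis, and patient 5 keeps only the 2013 row even though earlier diagnoses (2001, 2002) exist.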