Say I have a data frame ARAP with columns called CoCd and VendorNo. I want to subset into another data frame called EMIU_EMIJ all rows that match any of these combinations:

CoCd == "EMIJ" & VendorNo == "100010", or CoCd == "EMIU" & VendorNo == "2000001", or CoCd == "EMIU" & VendorNo == "2000006".

How do I combine & and | to select the rows where any one of these combinations is met? That is, each CoCd value must be paired with its own VendorNo value.

I tried

EMIU_EMIJ <- subset(ARAP, CoCd == "EMIJ" & VendorNo == "100010" | CoCd == "EMIU" & VendorNo == "2000001" | CoCd == "EMIU" & VendorNo == "2000006")

I also tried brackets

EMIU_EMIJ <- subset(ARAP, (CoCd == "EMIJ" & VendorNo == "100010") | (CoCd == "EMIU" & VendorNo == "2000001") | (CoCd == "EMIU" & VendorNo == "2000006"))

But this produced an error: "Error: unexpected symbol in: "EMIU_EMIJ"

How do I subset the rows that match any one of the three combinations above?

  • Your question is inconsistent: you say that you have a data frame EMIU_EMIJ, but you subset from the object ARAP. The condition in the last line is correct, so the error must come from somewhere else. Could you post the full error and some sample data (e.g. using dput)? Commented Feb 14, 2014 at 6:54
  • For a more accurate answer, update your question as Zbynek mentioned. You're probably looking for the extraction operator [ from the base package (see ?Extract): see the Warning in the subset documentation. Commented Feb 14, 2014 at 9:16
  • Zbynek, you are correct, I am subsetting the data frame called ARAP to create a data frame called EMIU_EMIJ. Thank you, D Commented Feb 14, 2014 at 19:53
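
To illustrate the comment that the parenthesised condition is itself valid, here is a minimal sketch on made-up data. The contents of ARAP below (including the Amount column) are assumptions for illustration only, since the real data was not posted. It also shows the equivalent selection with `[`, which the Warning in the subset documentation recommends for programmatic use.

```r
# Hypothetical stand-in for ARAP; the real data frame is not shown in the question.
ARAP <- data.frame(
  CoCd     = c("EMIJ", "EMIJ", "EMIU", "EMIU", "EMIU", "EMIX"),
  VendorNo = c("100010", "2000001", "2000001", "2000006", "100010", "2000006"),
  Amount   = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

# Each CoCd/VendorNo pair is wrapped in parentheses, then the pairs are OR-ed.
EMIU_EMIJ <- subset(
  ARAP,
  (CoCd == "EMIJ" & VendorNo == "100010") |
    (CoCd == "EMIU" & VendorNo == "2000001") |
    (CoCd == "EMIU" & VendorNo == "2000006")
)

# Same selection with the extraction operator `[` and a logical index.
keep <- with(
  ARAP,
  (CoCd == "EMIJ" & VendorNo == "100010") |
    (CoCd == "EMIU" & VendorNo == "2000001") |
    (CoCd == "EMIU" & VendorNo == "2000006")
)
EMIU_EMIJ_bracket <- ARAP[keep, ]

print(EMIU_EMIJ)
```

Since & binds more tightly than | in R, the parentheses are not strictly required here, but they make the pairing explicit. An "unexpected symbol" message is a parse error, so it more likely comes from something else on the line that was typed or pasted than from the condition.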

1 Answer

A simple merge with the all.y option will do.

For example, if mydf is your data:

set.seed(111)
mydf <- data.frame(id = rep(LETTERS, each = 4)[1:100],
                   replicate(3, sample(1001, 100)),
                   Class = sample(c("Yes", "No"), 100, TRUE))
mydf$CoCd <- paste0("EMI", mydf$id)
mydf$VendorNo <- paste0(mydf$X1, mydf$X2)
mydf <- unique(mydf[, c("CoCd", "VendorNo", "Class", "X3")])

and it looks like this (first 25 rows shown):

   CoCd VendorNo Class  X3
1  EMIA   594577   Yes 727
2  EMIA   727137   Yes 921
3  EMIA   371939   Yes 123
4  EMIA   514176    No 950
5  EMIB   377818   Yes 668
6  EMIB    41713    No  85
7  EMIB    11637    No 579
8  EMIB   530266    No 212
9  EMIC   430566   Yes 241
10 EMIC    93958    No 533
11 EMIC   551197   Yes 176
12 EMIC   585686    No 565
13 EMID    67827   Yes 154
14 EMID    47894    No 469
15 EMID   155952    No 718
16 EMID   441649    No 835
17 EMIE   169541   Yes 945
18 EMIE   952871   Yes 452
19 EMIE   306441    No 358
20 EMIE   604730    No 920
21 EMIF   423407    No 868
22 EMIF   280668   Yes 658
23 EMIF   335907   Yes 830
24 EMIF   379620   Yes 841
25 EMIG   946644    No 471

and you want these combinations:

combination_to_select <- data.frame(CoCd = c("EMIA", "EMID", "EMIF"),
                                    VendorNo = c("594577", "47894", "423407"),
                                    stringsAsFactors = FALSE)
combination_to_select

  CoCd VendorNo
1 EMIA   594577
2 EMID    47894
3 EMIF   423407

then the following code gives you the subset:

# Named "selected" rather than "subset" to avoid confusion with base::subset()
selected <- merge(mydf, combination_to_select, by = c("CoCd", "VendorNo"), all.y = TRUE)
selected

  CoCd VendorNo Class  X3
1 EMIA   594577   Yes 727
2 EMID    47894    No 469
3 EMIF   423407    No 868
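
One detail of the merge approach is worth a sketch: with all.y = TRUE, a requested combination that has no match in the data still appears in the result, with NA in the other columns, whereas the default inner join simply omits it. The data frames below (ARAP, wanted, and the Amount column) are hypothetical and only serve to show the difference.

```r
# Hypothetical data; column names follow the question, values are made up.
ARAP <- data.frame(
  CoCd     = c("EMIJ", "EMIJ", "EMIU", "EMIU", "EMIU", "EMIX"),
  VendorNo = c("100010", "2000001", "2000001", "2000006", "100010", "2000006"),
  Amount   = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

# The pairs to keep, plus one pair (EMIZ / 999) that does not occur in ARAP.
wanted <- data.frame(
  CoCd     = c("EMIJ", "EMIU", "EMIU", "EMIZ"),
  VendorNo = c("100010", "2000001", "2000006", "999"),
  stringsAsFactors = FALSE
)

# Default merge is an inner join: only rows whose pair appears in both.
inner <- merge(ARAP, wanted, by = c("CoCd", "VendorNo"))

# all.y = TRUE also keeps unmatched rows of `wanted`, filled with NA.
with_unmatched <- merge(ARAP, wanted, by = c("CoCd", "VendorNo"), all.y = TRUE)

print(inner)
print(with_unmatched)
```

If the goal is strictly to subset the existing rows, the inner join is probably the safer default; all.y = TRUE is useful when you also want to see which requested combinations were not found.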
1 Comment

Thanks Rfan, "plus 1" ! D
