Even at the risk of this question being labeled a duplicate, I am going to ask, since none of the related questions I have checked solve my problem...

I have a labs vector and I want to find the elements that are exact matches to 3 groups stored in a groups variable.

set.seed(1)
labs <- sample(c(rep('BC-89HX',3), rep('BC-89HX with 2% Puricare + 5% Merquat',3), rep('Own SH',4)), 10)
labs
groups <- c('BC-89HX','BC-89HX with 2% Puricare + 5% Merquat','Own SH')

I want to identify only the "BC-89HX" group elements (not the "BC-89HX with 2% Puricare + 5% Merquat" ones).

grep(groups[1], labs, val=TRUE, fixed=TRUE) #finds more elements than the ones I need
grep(paste(groups[1],"$",sep=""), labs, val=TRUE, fixed=TRUE) #does not work
grep(paste("\\b",groups[1],"\\b",sep=""), labs, val=TRUE, fixed=TRUE) #does not work

Any help?

  • The "does not work" is not clear. In the first case of 'grep', I get 6 matches out of the 10 elements in 'labs'. What is your expected output? Commented Mar 12, 2018 at 7:22
  • Only the "BC-89HX" exact group elements, not the "BC-89HX with 2% Puricare + 5% Merquat" ones Commented Mar 12, 2018 at 8:25
  • Are you looking for grep(paste0("^", groups[1], "$"), labs, val=TRUE), which returns "BC-89HX" "BC-89HX" "BC-89HX"? In that case you can use == as well: labs[labs == groups[1]] Commented Mar 12, 2018 at 8:29
  • Yes exactly! paste0! I was trying grep(paste(groups[1], "$"), labels(dend.obj), val=TRUE, fixed=TRUE) cause groups[1] contains - Commented Mar 12, 2018 at 8:31

1 Answer

The solution is to make sure that "BC-89HX" is the only text in the string. By pasting ^ and $ around the pattern, we anchor the match to the start and end of the string.

grep(paste0("^", groups[1], "$"), labs, value=TRUE)
#[1] "BC-89HX" "BC-89HX" "BC-89HX"

In this case, we cannot use fixed = TRUE, because ^ and $ are regex metacharacters that mark the start and end of the string. With fixed = TRUE they are treated as literal characters, which the elements of 'labs' do not contain, so nothing matches.
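One caveat with the anchored regex: groups[2] contains +, which is itself a metacharacter, so the same call finds nothing for that group. The following is a minimal sketch, not part of the original answer, that quotes the literal part with \Q...\E. It assumes perl = TRUE is acceptable and that the label itself does not contain \E. The labels are written out unshuffled here, since only the counts matter.

```r
# Same labels as in the question, without sample(); the order is irrelevant here.
labs <- c(rep("BC-89HX", 3),
          rep("BC-89HX with 2% Puricare + 5% Merquat", 3),
          rep("Own SH", 4))
groups <- c("BC-89HX", "BC-89HX with 2% Puricare + 5% Merquat", "Own SH")

# Plain anchored regex: the " +" in groups[2] is read as "one or more spaces",
# so the literal "+" in the labels never matches.
plain <- grep(paste0("^", groups[2], "$"), labs, value = TRUE)
length(plain)   # 0

# \Q...\E (PCRE, hence perl = TRUE) treats everything in between as literal text,
# while ^ and $ outside the quoted part still act as anchors.
quoted <- grep(paste0("^\\Q", groups[2], "\\E$"), labs, value = TRUE, perl = TRUE)
length(quoted)  # 3
```

This keeps the anchors as regex syntax while the group name is matched character for character.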

Another option is to use == or %in%, since we are comparing whole fixed strings rather than searching for a substring within a string.

labs[labs == groups[1]]
#[1] "BC-89HX" "BC-89HX" "BC-89HX"
labs[labs == groups[2]]
#[1] "BC-89HX with 2% Puricare + 5% Merquat" "BC-89HX with 2% Puricare + 5% Merquat" "BC-89HX with 2% Puricare + 5% Merquat"
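The %in% variant is mentioned above but not shown. As a rough sketch (the variable names both and pos are only illustrative, and the labels are written out unshuffled instead of using sample()), it can select several groups at once, and which() returns positions in the same way grep does without value = TRUE:

```r
# Same labels as in the question, without sample().
labs <- c(rep("BC-89HX", 3),
          rep("BC-89HX with 2% Puricare + 5% Merquat", 3),
          rep("Own SH", 4))
groups <- c("BC-89HX", "BC-89HX with 2% Puricare + 5% Merquat", "Own SH")

# %in% compares whole strings, so regex metacharacters such as "+" are harmless
# and more than one group can be selected in a single call.
both <- labs[labs %in% groups[1:2]]
length(both)   # 6

# which() gives the positions of the exact matches instead of the values.
pos <- which(labs == groups[1])
pos            # 1 2 3 for this unshuffled vector
```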

Update

If we really want to use grep with fixed = TRUE, one way is to paste the same marker characters onto both the pattern and the strings, so that they are matched literally on both sides, i.e.

labs[grep(paste0("^", groups[2], "$"), paste0("^", labs, "$"), fixed = TRUE) ]
#[1] "BC-89HX with 2% Puricare + 5% Merquat" "BC-89HX with 2% Puricare + 5% Merquat" "BC-89HX with 2% Puricare + 5% Merquat"
labs[grep(paste0("^", groups[1], "$"), paste0("^", labs, "$"), fixed = TRUE) ]
#[1] "BC-89HX" "BC-89HX" "BC-89HX"

3 Comments

What if I want to grep groups[2]? Is there a single solution that works for both cases?
@DaniCee If you use ==, it should work for both cases: labs[labs == groups[2]]
Please check my new question, I don't know why this solution doesn't work there... stackoverflow.com/questions/49269338/…
