I have the following data frame:

dat <- data.frame(firstName = c("John", "Company A pty Ltd", ""), Surname = c("Smith", "", "Company B"), Company = c("Company D", "A Ltd", "Company B"))

I want to check whether the Company column contains any word that appears in either the firstName or the Surname column.

I have used the following code:

dat$clinicOnly <- mapply(grepl, pattern=dat$firstName, dat$Company) 

But this checks whether the whole firstName string is present in Company. Hence it works for the first row, misses the second row, and gets the last row correct only because the blank firstName entry is an empty pattern, which matches anything.
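To make the behaviour described above concrete, here is a minimal sketch. It rebuilds the same data, assuming the first column is called firstName (as the mapply call refers to it) and that the columns are plain character vectors; `whole` is just an illustrative variable name.

```r
# Same data as in the question; first column assumed to be named firstName
dat <- data.frame(
  firstName = c("John", "Company A pty Ltd", ""),
  Surname   = c("Smith", "", "Company B"),
  Company   = c("Company D", "A Ltd", "Company B"),
  stringsAsFactors = FALSE
)

# Each firstName is used as one whole pattern against the Company in the same row
whole <- mapply(grepl, pattern = dat$firstName, x = dat$Company, USE.NAMES = FALSE)
whole
# FALSE FALSE TRUE

# An empty pattern matches any string, which is why row 3 comes out TRUE
grepl("", "Company B")
# TRUE
```

Row 2 is FALSE because "Company A pty Ltd" as a whole does not occur inside "A Ltd", even though the two share the words "A" and "Ltd".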

How can I write a function that produces FALSE, TRUE, TRUE?

  • Don't have time for a long answer but try: apply(dat, 1, function(x){grepl(paste(x[1:2], collapse="|"), x[3])}) Commented Apr 20, 2016 at 3:37
  • Works very well, thanks! Commented Apr 20, 2016 at 4:20

1 Answer

How about this, using intersect to do the hard work:

v1 <- strsplit(do.call(paste, dat[1:2]), "\\s+")
v2 <- strsplit(as.character(dat$Company), "\\s+")
mapply(function(x,y) length(intersect(x,y)) > 0, v1, v2)
#[1] FALSE TRUE TRUE
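Since the question asks for a function, the same idea could be wrapped up roughly as follows. This is only a sketch: `shares_word` and its arguments are hypothetical names, not part of the answer above. It counts a row as a match as soon as one whole word is shared, and it adds a `trimws()` call (an assumption on my part) so that blank entries do not produce empty "words". Matching is case-sensitive.

```r
# Hypothetical helper: TRUE where the `target` column shares at least one
# whole word with any of the `sources` columns in the same row.
shares_word <- function(df, sources, target) {
  # Paste the source columns together row by row, then split on whitespace
  src <- strsplit(trimws(do.call(paste, unname(as.list(df[sources])))), "\\s+")
  tgt <- strsplit(trimws(as.character(df[[target]])), "\\s+")
  # A row matches when the two word sets have a non-empty intersection
  mapply(function(x, y) length(intersect(x, y)) > 0, src, tgt, USE.NAMES = FALSE)
}

# Usage on the data from the question (first column assumed to be firstName)
dat <- data.frame(
  firstName = c("John", "Company A pty Ltd", ""),
  Surname   = c("Smith", "", "Company B"),
  Company   = c("Company D", "A Ltd", "Company B")
)
dat$clinicOnly <- shares_word(dat, c("firstName", "Surname"), "Company")
dat$clinicOnly
# FALSE TRUE TRUE
```

Taking column names rather than positions keeps the call readable if the column order changes.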
1 Comment

Also answers the question, just with more lines of code.
