
I have some strings:

    "rose with ribbon"
    "roses in concrete"
    "roses on bed"

I have to write a program that finds the strings in which a preferred word exists.

E.g., find the string that contains the word "on", so I should get only "roses on bed".

I used this code:

    foreach (KeyWord key in cKeyWords)
    {
        foreach (string word in userWords)
        {
            if (key.keyWord.IndexOf(word) != -1)
            {
                ckeyList.Add(key);
            }
        }
    }

but I get all strings because IndexOf finds "on" in all of them.

Is there any other way to find a separate word in a string without splitting it? Maybe it is possible to use LINQ or Regex, but I'm not good at using them, so it would be nice to have some examples.

  • Why don't you want to split the string? Commented Sep 15, 2012 at 17:43
  • You could search for " on " with spaces to eliminate the hits you don't want. Commented Sep 15, 2012 at 17:43
  • @gjvdkamp That won't work if the word is at the start or end of the string. Commented Sep 15, 2012 at 17:45
  • @gjvdkamp, that wouldn't catch the cases where the strings either start or end with "on", so two more cases to handle. Commented Sep 15, 2012 at 17:46
  • That can be remedied by adding a space on both ends of the string before searching, but it is a bit of a hack. Commented Sep 15, 2012 at 17:46
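
For reference, the padding idea from the comments above could look roughly like the sketch below. `PaddedSearch` and `ContainsWord` are hypothetical names used only for illustration, and the sketch assumes words are separated by single spaces with no punctuation attached.

```csharp
using System;

public static class PaddedSearch
{
    // Hypothetical helper: pads both the text and the word with spaces so that
    // a word at the very start or end of the text is still surrounded by spaces.
    public static bool ContainsWord(string text, string word)
    {
        string padded = " " + text + " ";
        return padded.IndexOf(" " + word + " ", StringComparison.Ordinal) != -1;
    }
}
```

With this, `PaddedSearch.ContainsWord("roses on bed", "on")` is true, while "rose with ribbon" is rejected because "on" never appears there with a space on both sides.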

5 Answers

6

Using regex with \bon\b should do it.

\b is the regex anchor for word boundary, so that regex will match a word boundary immediately followed by on immediately followed by another word boundary.

The following C# example...

    string[] sArray = new string[] { "rose with ribbon", "roses on bed", "roses in concrete" };
    Regex re = new Regex("\\bon\\b");
    foreach (string s in sArray)
    {
        Console.Out.WriteLine("{0} match? {1}", s, re.IsMatch(s));
        Match m = re.Match(s);
        foreach (Group g in m.Groups)
        {
            if (g.Success)
            {
                Console.Out.WriteLine("Match found at position {0}", g.Index);
            }
        }
    }

... will generate the following output:

    rose with ribbon match? False
    roses on bed match? True
    Match found at position 6
    roses in concrete match? False

4 Comments

Could you explain what that regex does?
\b is the regex anchor for word boundary -- so that regex looks for a word boundary followed by on followed by another word boundary. See regular-expressions.info/wordboundaries.html.
I think you should include that in your answer.
can you explain how to use that, no idea
1

Yes, you can find a whole word in a string by using Regex. Try this:

    string regexPattern;
    foreach (KeyWord key in cKeyWords)
    {
        foreach (string word in userWords)
        {
            regexPattern = string.Format(@"\b{0}\b", System.Text.RegularExpressions.Regex.Escape(word));
            if (System.Text.RegularExpressions.Regex.IsMatch(key.keyWord, regexPattern))
            {
                ckeyList.Add(key);
            }
        }
    }

Use the ToLower() method on both strings if you don't want the match to be case-sensitive.

    foreach (KeyWord key in cKeyWords)
    {
        foreach (string word in userWords)
        {
            regexPattern = string.Format(@"\b{0}\b", System.Text.RegularExpressions.Regex.Escape(word.ToLower()));
            if (System.Text.RegularExpressions.Regex.IsMatch(key.keyWord.ToLower(), regexPattern))
            {
                ckeyList.Add(key);
            }
        }
    }
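
As a possible alternative to lower-casing both strings, the same check can be sketched with RegexOptions.IgnoreCase. `WordMatch` and `ContainsWordIgnoreCase` are hypothetical names, not part of the original code, and the sketch assumes the search word starts and ends with a word character (letter, digit, or underscore), since \b only matches between a word character and a non-word character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordMatch
{
    // Hypothetical helper: true if 'word' occurs in 'text' as a whole word,
    // compared case-insensitively.
    public static bool ContainsWordIgnoreCase(string text, string word)
    {
        // Escape the word so characters such as '.' or '+' are matched literally.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase);
    }
}
```

Inside the loops above, the condition would then become something like `WordMatch.ContainsWordIgnoreCase(key.keyWord, word)`.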


0

Use regular expressions, read this article: http://www.dotnetperls.com/regex-match

And here is another good article to study regex: http://www.codeproject.com/Articles/9099/The-30-Minute-Regex-Tutorial

1 Comment

This doesn't actually answer the question. How would you solve this specific problem using regex?
0

The problem is that you're searching for "on" which is found in all strings (ribb*on*, c*on*crete)

You should be searching for " on ".

A better solution would be to parse the strings into arrays of words and iterate over those.
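
A minimal sketch of that word-by-word approach, assuming words are separated by spaces and carry no punctuation, might look like the following. `WordSplit` and `ContainsWord` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Linq;

public static class WordSplit
{
    // Hypothetical helper: splits the text on spaces and checks whether
    // any resulting word equals the search word exactly.
    public static bool ContainsWord(string text, string word)
    {
        return text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Any(w => string.Equals(w, word, StringComparison.Ordinal));
    }
}
```

This handles words at the start or end of the string without special cases, at the cost of allocating an array of words for every string that is checked.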

2 Comments

Including spaces before and after won't work if the word you're looking for appears at the beginning or end of the string. Parsing the strings into words is unnecessary, not to mention non-performant if you have 1000's of long sentences.
As I mentioned in a comment, this won't work if the word is at the start or end of the string. And the question specifically asks about solutions that don't involve splitting the string (for whatever reason).
0

In a nutshell, this is what you could do, using the StartsWith and EndsWith methods of the C# String class:

    foreach (KeyWord key in cKeyWords)
    {
        foreach (string word in userWords)
        {
            // Match the word in the middle, at the start, or at the end of the string.
            if (key.keyWord.IndexOf(" " + word + " ") != -1
                || key.keyWord.StartsWith(word + " ")
                || key.keyWord.EndsWith(" " + word))
            {
                ckeyList.Add(key);
            }
        }
    }

2 Comments

StartsWith() and EndsWith() don't return an integer.
was a bit lazy so just left the note about correcting them in the answer :) anyway corrected it now.
