4

I need to find many substring in a string. I downloand an internet page and put it into a string. Then I've to see if the page contains some string (substring).

Now I'm using regex whit the boost library, because I use it to use the regex pattern ([0-9] ect..).

The question is: If I need only to find a substring in a string, which is the fastest way?

1
  • Stating that the linked "duplicate" has anything of practical value for this bog standard everyday case is borderline hypocrisy... That other page is practically an academic/research discussion. It has NOTHING whatsoever to do with this question, which is essentially about just what C++ functions to call for some HTML scraping... Commented Jan 16, 2022 at 0:22

1 Answer 1

2

There are algorithms for substring searching. Here you can find comparison with example code: http://old.blog.phusion.nl/2010/12/06/efficient-substring-searching/

Boyer-Moore-Horspool wins the benchmark. https://en.wikipedia.org/wiki/Boyer–Moore–Horspool_algorithm

Sign up to request clarification or add additional context in comments.

2 Comments

Link only answers are not helpful, as links can rot. Please consider posting some relevant explanation in addition to the link.
@BatCoder ok. So in general there are few algorithms that do the task from topic: Boyer-Moore, Boyer-Moore-Horspool, Turbo Boyer-Moore, Knuth-Morris-Pratt. All of them use different techniques in order to find substring. Boyer-Moore is using bad character table with good suffix table, Boyer-Moore-Horspool is using only bad character table, Turbo Boyer-Moore just takes less steps in comparison to origin Boyer-Moore, Knuth-Morris-Pratt is based on partial mach table. Comparison was won by Boyer-Moore-Horspool. And like you said there is similar stackoverflow question with brief explanation.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.