Skip to main content

You are not logged in. Your edit will be placed in a queue until it is peer reviewed.

We welcome edits that make the post easier to understand and more valuable for readers. Because community members review edits, please try to make the post substantially better than how you found it, for example, by fixing grammar or adding additional resources and hyperlinks.

Required fields*

12
  • "For needles of length 1, use strchr."- You asked for the fastest substring search algorithm(s). Finding a substring of length 1 is a just a special case, one that can also be optimized. If you swap out your current special case code for substrings of length 1 (strchr) with something like the above, things will (possibly, depending on how strchr is implemented) go faster. The above algorithm is nearly 3x faster than a typical naive strchr implementation. Commented Jun 4, 2011 at 0:11
  • 2
    OP said string was properly null terminated, so your discussion about char bytes[1] = {0x55}; is irrelevant. Very relevant is your comment about this being true for any word read algorithm which does not know the length beforehand. Commented Jun 5, 2011 at 2:17
  • 1
    The problem does not apply to the version I cited because you only use it on aligned pointers - at least that's what correct implementations do. Commented Jun 5, 2011 at 3:12
  • 2
    @R, it has nothing to do with "aligned pointers". Hypothetically, if you had an architecture which supported VM protection with byte level granularity, and each malloc allocation was "sufficiently padded" on either side and the VM system enforced byte granular protection for that allocation.... whether or not the pointer is aligned (assuming trivial 32-bit int natural alignment) is moot- it is still possible for that aligned read to read past the size of the allocation. ANY read past the size of the allocation is undefined behavior. Commented Jun 5, 2011 at 19:28
  • 5
    @johne: +1 to comment. Conceptually you're right, but the reality is that byte-granularity protections are so expensive both to store and to enforce that they don't and never will exist. If you know the underlying storage is page-granularity mappings obtained from the equivalent of mmap, then alignment is sufficient. Commented Jun 13, 2011 at 2:16