Pnomolos/faster #2
Open
@joelparkerhenderson After running way too many benchmarks and trying far too many candidate algorithms, I believe I have a winner; results here.
I tested the various algorithms (using benchmark-ipsa) against several sources (Wikipedia articles and Project Gutenberg books). I also ran some tests against single characters and against the entire set of characters; those are not included in the benchmarks. I realized we should probably add tests for various encodings as well, and I'll try to add those soon.
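For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of the setup, not the actual benchmark code from this PR. The mapping table, the two candidate methods, and the inputs are all hypothetical stand-ins, and it times with `Process.clock_gettime` instead of benchmark-ipsa so that it runs without any gems (and so reports no allocation counts).

```ruby
# Hypothetical mapping; the real table is much larger.
ACCENT_MAP = { "é" => "e", "è" => "e", "ü" => "u", "ñ" => "n" }.freeze
ACCENT_RE = Regexp.union(ACCENT_MAP.keys)

# Candidate A (hypothetical): one gsub pass with a hash lookup.
def unaccent_gsub(str)
  str.gsub(ACCENT_RE, ACCENT_MAP)
end

# Candidate B (hypothetical): walk the characters and append to a buffer.
def unaccent_each_char(str)
  out = String.new(encoding: str.encoding)
  str.each_char { |c| out << ACCENT_MAP.fetch(c, c) }
  out
end

# Time `iterations` calls of the block, in seconds, on a monotonic clock.
def time_calls(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Made-up inputs standing in for single characters and longer sources.
INPUTS = {
  "single char" => "é",
  "short" => "señor Müller",
  "long" => "crème brûlée über señor " * 2_000,
}.freeze

# Returns one [input label, method name, seconds] row per combination.
def run_benchmarks(iterations: 20)
  INPUTS.flat_map do |label, text|
    %i[unaccent_gsub unaccent_each_char].map do |name|
      [label, name, time_calls(iterations) { send(name, text) }]
    end
  end
end

run_benchmarks.each do |label, name, secs|
  puts format("%-12s %-20s %.6fs", label, name, secs)
end
```

Running every candidate over every input size is what makes it possible to spot an algorithm that wins on long texts but loses on single characters.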
The libunac implementation also blows everything else away (as I expected), but it needs some cleanup and a few more checks.
Also, per my comment (#1 (comment)), should I change the output of those two characters?
Algorithm three was a very interesting one, as you'll notice from the results: very low memory usage (I believe it would be a single allocation in C), but the speed results were all over the place. I'm curious whether there's some cutoff in string length where we could alternate between the two, but I haven't had a chance to investigate that far.
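To make the cutoff idea concrete, here is a rough sketch of what length-based dispatch could look like. Everything in it is an assumption: the two methods are stand-ins rather than the algorithms from this PR, the tiny mapping is made up, `LENGTH_CUTOFF = 64` is a placeholder that would have to come from measurements, and which strategy handles the short side is chosen arbitrarily.

```ruby
# Hypothetical mapping and cutoff; neither comes from the benchmark results.
DEMO_MAP = { "é" => "e", "ü" => "u", "ñ" => "n" }.freeze
DEMO_RE = Regexp.union(DEMO_MAP.keys)
LENGTH_CUTOFF = 64 # placeholder; would need to be measured

# Stand-in for the faster strategy.
def unaccent_fast(str)
  str.gsub(DEMO_RE, DEMO_MAP)
end

# Stand-in for the low-memory strategy: a single tr pass.
def unaccent_low_memory(str)
  str.tr(DEMO_MAP.keys.join, DEMO_MAP.values.join)
end

# Pick a strategy from the string length alone.
def strategy_for(str, cutoff: LENGTH_CUTOFF)
  str.length < cutoff ? :unaccent_low_memory : :unaccent_fast
end

# Public entry point: dispatch to whichever strategy the cutoff selects.
def unaccent(str, cutoff: LENGTH_CUTOFF)
  send(strategy_for(str, cutoff: cutoff), str)
end
```

The `cutoff:` keyword is there so a benchmark can sweep different thresholds without editing the constant.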