Commit 0ae5593

Resolve #17 optimize use of '^' / '\A'
The problem was that a regular expression like "^x" would take linear time to is_match a string "aaa...." when all it needed to do was check the first character then bail out.

This commit makes it so if the regex begins with '^' (without multiline) or '\A', then it stops iterating over characters as soon as there are no possible matches.

Benchmarks:

dynamic:
    anchored_literal_long_non_match: 7492 ns/iter => 435 ns/iter
    anchored_literal_short_non_match: 917 ns/iter => 431 ns/iter

native:
    anchored_literal_long_non_match: 5020 ns/iter => 50 ns/iter
    anchored_literal_short_non_match: 375 ns/iter => 50 ns/iter
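The early exit described above can be sketched in isolation. This is a minimal illustration, not the crate's matcher: `is_match_literal` is a hypothetical helper that handles only a literal pattern with an optional leading anchor, and it counts how many start positions it examines so the effect of the anchor check is visible. The cursor name `ic` is borrowed from the diff.

```rust
/// Hypothetical, simplified stand-in for the matching loop. It supports only
/// a literal pattern, optionally anchored at the start of the input.
/// Returns (matched, number of start positions examined).
fn is_match_literal(literal: &str, anchored: bool, input: &str) -> (bool, usize) {
    let needle = literal.as_bytes();
    let haystack = input.as_bytes();
    let mut examined = 0;
    // `ic` is the input cursor, named after the field used in the diff.
    for ic in 0..=haystack.len() {
        // An anchored pattern can only match at position 0, so once the
        // cursor has moved past the start there is nothing left to try.
        if anchored && ic != 0 {
            break;
        }
        examined += 1;
        if haystack[ic..].starts_with(needle) {
            return (true, examined);
        }
    }
    (false, examined)
}
```

Under these assumptions, an anchored non-match examines a single start position regardless of input length, while the unanchored search examines every position before giving up.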
1 parent 588319a commit 0ae5593

2 files changed: 12 additions, 1 deletion

regex_macros/src/lib.rs

Lines changed: 5 additions & 0 deletions
@@ -191,6 +191,11 @@ fn exec<'t>(which: ::regex::native::MatchKind, input: &'t str,
     if matched {
         break
     }
+
+    if $prefix_anchor && self.ic != 0 {
+        break
+    }
+
     $check_prefix
 }
 if clist.size == 0 || (!$prefix_anchor && !matched) {

src/vm.rs

Lines changed: 7 additions & 1 deletion
@@ -144,12 +144,18 @@ impl<'r, 't> Nfa<'r, 't> {
     break
 }

+// If the expression starts with a '^' we can terminate as soon
+// as the last thread dies.
+if self.ic != 0 && prefix_anchor {
+    break;
+}
+
 // If there are no threads to try, then we'll have to start
 // over at the beginning of the regex.
 // BUT, if there's a literal prefix for the program, try to
 // jump ahead quickly. If it can't be found, then we can bail
 // out early.
-if self.prog.prefix.len() > 0 && clist.size == 0 {
+if self.prog.prefix.len() > 0 {
     let needle = self.prog.prefix.as_bytes();
     let haystack = &self.input.as_bytes()[self.ic..];
     match find_prefix(needle, haystack) {
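The literal-prefix jump that the comments in this hunk describe can be sketched as follows. The body of `find_prefix` is not shown in the diff, so the version here is a hypothetical stand-in, and its `Option<usize>` return type is an assumption; `skip_to_prefix` is likewise an illustrative name rather than part of the crate.

```rust
/// Hypothetical stand-in for `find_prefix`: returns the byte offset of the
/// first occurrence of `needle` in `haystack`, or None if it never occurs.
fn find_prefix(needle: &[u8], haystack: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        // `windows(0)` would panic; an empty prefix trivially matches here.
        return Some(0);
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Illustrative helper: given the current cursor `ic`, return the next
/// position worth starting a match at, or None when the prefix does not
/// occur in the rest of the input and the search can bail out early.
/// Panics if `ic` is greater than `input.len()`.
fn skip_to_prefix(prefix: &str, input: &str, ic: usize) -> Option<usize> {
    let haystack = &input.as_bytes()[ic..];
    find_prefix(prefix.as_bytes(), haystack).map(|offset| ic + offset)
}
```

A `None` result corresponds to the "bail out early" case in the comment, and a `Some(pos)` result is the position the cursor could jump to instead of advancing one character at a time.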
