Add new resources
James Davis

Detecting evil regexes

  1. Try Nicolaas Weideman's RegexStaticAnalysis project.
  2. Try my ensemble-style vuln-regex-detector, which provides a CLI for Weideman's tool and others.

Rules of thumb

Evil regexes are always due to ambiguity in the corresponding NFA, which you can visualize with tools like regexper.
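
To make "ambiguity" concrete, here is a minimal sketch that counts how many distinct accepting paths an NFA has for a given input. The NFAs are hand-built by me for illustration (they are not produced by any of the tools mentioned here), and `count_accepting_paths` is a hypothetical helper name. An unambiguous NFA has exactly one accepting path per accepted string; an ambiguous one has several, and a backtracking engine may end up exploring many of them on a near-miss input.

```python
from collections import Counter


def count_accepting_paths(transitions, start, accepting, text):
    """Count the distinct accepting paths of an epsilon-free NFA on `text`.

    `transitions` maps (state, char) to a list of target states. A target
    listed twice stands for two parallel edges between the same states.
    """
    paths = Counter({start: 1})  # state -> number of paths reaching it
    for ch in text:
        reached = Counter()
        for state, n in paths.items():
            for target in transitions.get((state, ch), ()):
                reached[target] += n
        paths = reached
    return sum(n for state, n in paths.items() if state in accepting)


# Hand-built NFAs; state 0 is the start state in each.
A_PLUS = {(0, "a"): [1], (1, "a"): [1]}                 # a+      (accepting: {1})
A_PLUS_A_PLUS = {(0, "a"): [1], (1, "a"): [1, 2],
                 (2, "a"): [2]}                         # a+a+    (accepting: {2})
A_OR_A_PLUS = {(0, "a"): [1, 1], (1, "a"): [1, 1]}      # (a|a)+  (accepting: {1})

for n in (5, 10, 20):
    text = "a" * n
    print(
        n,
        count_accepting_paths(A_PLUS, 0, {1}, text),
        count_accepting_paths(A_PLUS_A_PLUS, 0, {2}, text),
        count_accepting_paths(A_OR_A_PLUS, 0, {1}, text),
    )
```

For n = 10 the three counts are 1, 9, and 1024: one path for the unambiguous a+, a count that grows linearly with the input for a+a+, and a count that doubles with every extra character for (a|a)+.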

Here are some forms of ambiguity. Don't use these in your regexes.

  1. Nesting quantifiers like (a+)+ (aka "star height > 1"). This can cause exponential blow-up. See substack's safe-regex tool.
  2. Quantified Overlapping Disjunctions like (a|a)+. This can cause exponential blow-up.
  3. Quantified Overlapping Adjacencies like \d+\d+. This can cause polynomial blow-up.
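
If you want to see the blow-up for yourself, the sketch below times two of the patterns above on near-miss inputs using Python's built-in `re` module, which is a backtracking engine. The helper names `time_fullmatch` and `near_miss` are mine, and the input sizes are assumptions chosen to finish quickly. I left out (a|a)+ because some engines may simplify such a trivial alternation and hide the effect.

```python
import re
import time


def time_fullmatch(pattern, text):
    """Return (match_or_None, elapsed_seconds) for re.fullmatch(pattern, text)."""
    start = time.perf_counter()
    result = re.fullmatch(pattern, text)
    return result, time.perf_counter() - start


def near_miss(char, n):
    """Build n copies of `char` plus a trailing '!' that forces the match to fail."""
    return char * n + "!"


if __name__ == "__main__":
    # Nested quantifiers: keep n small, each extra character costs much more.
    for n in (14, 16, 18, 20):
        _, secs = time_fullmatch(r"(a+)+", near_miss("a", n))
        print(f"(a+)+   n={n:5d}  {secs:.4f}s")

    # Overlapping adjacency: much larger inputs are still tolerable.
    for n in (500, 1000, 2000):
        _, secs = time_fullmatch(r"\d+\d+", near_miss("1", n))
        print(f"\\d+\\d+  n={n:5d}  {secs:.4f}s")
```

On a backtracking engine I would expect the (a+)+ timings to grow roughly geometrically with n and the \d+\d+ timings to grow far more slowly, but the exact numbers depend on your machine and Python version, so treat this as a quick experiment rather than a benchmark.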

Additional resources

I wrote this paper on super-linear regexes. It includes loads of references to other regex-related research.

Add rules of thumb
James Davis

Detecting evil regexes

Try Nicolaas Weideman's RegexStaticAnalysis project.

Rules of thumb

Evil regexes are always due to ambiguity in the corresponding NFA, which you can visualize with tools like regexper.

Here are some forms of ambiguity. Don't use these in your regexes.

  1. Nesting quantifiers like (a+)+ (aka "star height > 1"). This can cause exponential blow-up. See substack's safe-regex tool.
  2. Quantified Overlapping Disjunctions like (a|a)+. This can cause exponential blow-up.
  3. Quantified Overlapping Adjacencies like \d+\d+. This can cause polynomial blow-up.
James Davis

I am aware of two open-source tools trying to answer this question.

rxxr2 is an academic project. safe-regex is an open-source project.

Neither of these tools is accurate, and each identifies truly vulnerable regular expressions the other does not. I give some hints about this in my recent paper.