erbsland-dev/erbsland-cpp-re

Erbsland Regular Expression Library

The Erbsland Regular Expression Library is a secure and reliable regular expression engine for modern C++.
It is designed to be lightweight, dependency-free, and predictable, while offering solid UTF-8 and Unicode support out of the box.

You can embed the library directly into your project to provide regular expression matching without pulling in large external dependencies. The pattern syntax is inspired by a pragmatic mix of PCRE and Python regular expressions, focusing on clarity, safety, and maintainability.

Internally, the engine implements a carefully optimized variant of Thompson’s NFA algorithm, adapted for modern C++ and robust execution under strict resource limits.
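
To illustrate the general idea behind a Thompson-style NFA simulation, here is a minimal sketch. It is not the library's implementation: the instruction set, the `Instruction` struct, and the functions `addThread`, `fullMatch`, and `makeAPlusB` are hypothetical names used only for this example. All candidate states advance in lockstep, one input character at a time, so the work is bounded by input length times program size and never backtracks.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical instruction set for illustration; not the library's internal format.
enum class Op { Char, Split, Jump, Match };

struct Instruction {
    Op op;
    char ch = 0;        // character to consume (Char only)
    std::size_t x = 0;  // first target (Split, Jump)
    std::size_t y = 0;  // second target (Split only)
};

using Program = std::vector<Instruction>;

// Add a thread at `pc`, following Jump and Split without consuming input.
// `onList` prevents adding the same program counter twice per step.
void addThread(const Program &program, std::vector<std::size_t> &list,
               std::vector<bool> &onList, std::size_t pc) {
    if (onList[pc]) {
        return;
    }
    onList[pc] = true;
    const auto &ins = program[pc];
    switch (ins.op) {
    case Op::Jump:
        addThread(program, list, onList, ins.x);
        break;
    case Op::Split:
        addThread(program, list, onList, ins.x);
        addThread(program, list, onList, ins.y);
        break;
    default:
        list.push_back(pc); // Char and Match wait for the next step
        break;
    }
}

// Returns true if the whole text matches the program.
bool fullMatch(const Program &program, std::string_view text) {
    std::vector<std::size_t> current;
    std::vector<std::size_t> next;
    std::vector<bool> onCurrent(program.size(), false);
    addThread(program, current, onCurrent, 0);
    for (const char c : text) {
        std::vector<bool> onNext(program.size(), false);
        next.clear();
        for (const auto pc : current) {
            const auto &ins = program[pc];
            if (ins.op == Op::Char && ins.ch == c) {
                addThread(program, next, onNext, pc + 1);
            }
        }
        current.swap(next);
        if (current.empty()) {
            return false; // no state survived this character
        }
    }
    for (const auto pc : current) {
        if (program[pc].op == Op::Match) {
            return true;
        }
    }
    return false;
}

// Hand-compiled program for the pattern `a+b`.
Program makeAPlusB() {
    return {
        {Op::Char, 'a', 0, 0},
        {Op::Split, 0, 0, 2}, // loop back to 0 or continue at 2
        {Op::Char, 'b', 0, 0},
        {Op::Match, 0, 0, 0},
    };
}
```

With this sketch, `fullMatch(makeAPlusB(), "aab")` returns `true`, while `"b"` and `"abb"` are rejected.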

Feature Overview

  • Zero dependencies – easy to embed and audit.
  • Strong focus on security and reliability.
  • Full UTF-8 support with strict input validation.
  • Rich regular expression syntax:
    • Greedy, lazy, and possessive quantifiers
    • Atomic groups
  • Optional syntax compatibility with PCRE, Python, and other popular regex engines.
  • Built-in time and memory limits with safe defaults.
  • Solid Unicode support without relying on ICU:
    • Full Unicode character classes
    • Case-insensitive matching using simple case folding
  • Configurable string type support:
    • std::string or std::u8string (selected at build time)
  • Human-readable error messages when parsing regular expressions.
  • Rich API for:
    • Finding first or all matches
    • Replacing text using placeholders
  • Efficient coroutine-based matching.
  • String-view-based processing with no unnecessary allocations.
  • Abstract input interface for custom or streaming input sources.
  • Diagnostic Tools:
    • Disassembler to display the generated code for any pattern.
    • Assembler to write custom regex engines.

Non-Features and Design Goals

This library intentionally avoids certain features to remain predictable, secure, and memory-efficient:

  • To keep the memory footprint low and matching deterministic:
    • No Unicode normalization
    • No multi-character case folding
    • No Unicode character names
  • The primary goal is security, not maximum throughput:
    • Matching is efficient, but not intended for workloads where regex performance is the main bottleneck.
    • Regular matching is fast, but slower than RE2 or PCRE: the compiled program is barely optimized, the input is strictly validated, and the NFA algorithm adds its own overhead.
  • Some advanced constructs are deliberately not supported:
    • No backreferences
    • No lookahead assertions
    • No conditional patterns
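
The "no multi-character case folding" restriction above can be made concrete with a small sketch. Simple case folding maps each code point to exactly one code point, so a caseless comparison never changes the length of the text. The functions `simpleFold` and `equalsSimpleFolded` below are hypothetical, and the mapping covers only ASCII letters and U+1E9E for illustration; a real table would be derived from Unicode's CaseFolding.txt, and the library's own implementation may differ.

```cpp
#include <cstddef>
#include <string>

// Simple case folding: one code point in, one code point out.
// Illustrative subset only (ASCII letters and U+1E9E), not a complete table.
char32_t simpleFold(char32_t c) {
    if (c >= U'A' && c <= U'Z') {
        return static_cast<char32_t>(c + 0x20);
    }
    if (c == U'\u1E9E') {
        return U'\u00DF'; // capital sharp s folds to small sharp s
    }
    return c;
}

// Caseless comparison using simple folding; lengths must be equal
// because no code point ever expands to more than one.
bool equalsSimpleFolded(const std::u32string &a, const std::u32string &b) {
    if (a.size() != b.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (simpleFold(a[i]) != simpleFold(b[i])) {
            return false;
        }
    }
    return true;
}
```

Under full case folding, U+00DF (ß) would fold to the two-character sequence "ss", so "straße" would equal "STRASSE". With simple folding only, that pair does not match, while U+1E9E and U+00DF do.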

Project Status

  • ✔ Stable and suitable for production use (for UTF-8 based strings).
  • ✔ Public API is stable.
  • ✔ Tested on:
    • Linux (GCC)
    • macOS (Clang)
    • Windows (MSVC)
  • ✗ UTF-16 and UTF-32 support:
    • Implemented but not fully tested. Use it at your own risk.
    • Not documented.

Quick Start

#include <iostream>
#include <string>
#include <el/re/regex.hpp>

using namespace el::re;

int main() {
    try {
        auto re = RegEx::compile(R"(\d+)");
        auto text = std::string{"abc 12345 xyz"};
        if (auto match = re->findFirst(text); match != nullptr) {
            std::cout << "Found a number: " << match->content(0) << "\n";
        }
    } catch (const Error &error) {
        std::cerr << error.what() << "\n";
        return 1;
    }
    return 0;
}

Performance Comparison

Direct performance comparisons between regular expression engines are often misleading.
The following benchmarks are provided only as a rough indication of performance characteristics.

All benchmarks were run on a 2021 MacBook Pro with an Apple M1 Max CPU and ample memory.

Important notes:

  • The Erbsland Regular Expression Library:
    • Always reads and validates UTF-8 input
    • Performs Unicode-aware comparisons in all modes
    • Does not optimize the program compiled from the pattern for performance
    • Has almost no speed optimizations in the engine
  • Neither PCRE nor std::regex enforces strict UTF-8 validation in this benchmark.
  • “ASCII mode” in Erbsland RE only affects character class handling; input is still processed as Unicode.
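
To give a sense of what strict UTF-8 validation involves, and why it costs time on every input byte, here is a minimal sketch of a validator. The function `isValidUtf8` is a hypothetical name for this example and is not part of the library's API; the library's actual checks may be structured differently. The sketch rejects invalid lead bytes, truncated sequences, overlong encodings, surrogates, and code points above U+10FFFF.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Returns true if `bytes` is well-formed UTF-8.
bool isValidUtf8(std::string_view bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const auto b0 = static_cast<std::uint8_t>(bytes[i]);
        if (b0 < 0x80) {
            i += 1; // single-byte (ASCII) character
            continue;
        }
        std::size_t length = 0;
        std::uint32_t codePoint = 0;
        std::uint32_t minimum = 0; // smallest code point allowed for this length
        if ((b0 & 0xE0) == 0xC0) {
            length = 2; codePoint = b0 & 0x1F; minimum = 0x80;
        } else if ((b0 & 0xF0) == 0xE0) {
            length = 3; codePoint = b0 & 0x0F; minimum = 0x800;
        } else if ((b0 & 0xF8) == 0xF0) {
            length = 4; codePoint = b0 & 0x07; minimum = 0x10000;
        } else {
            return false; // stray continuation byte or invalid lead byte
        }
        if (bytes.size() - i < length) {
            return false; // sequence is cut off at the end of the input
        }
        for (std::size_t k = 1; k < length; ++k) {
            const auto b = static_cast<std::uint8_t>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) {
                return false; // expected a continuation byte
            }
            codePoint = (codePoint << 6) | (b & 0x3F);
        }
        if (codePoint < minimum) {
            return false; // overlong encoding
        }
        if (codePoint >= 0xD800 && codePoint <= 0xDFFF) {
            return false; // UTF-16 surrogate
        }
        if (codePoint > 0x10FFFF) {
            return false; // beyond the Unicode range
        }
        i += length;
    }
    return true;
}
```

An engine that skips these checks can process bytes faster, which is one reason the timings below are not directly comparable.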

Benchmark Results

These are the results for version 1.0.0 of the library.

Benchmarking file: war_and_peace.txt (3.20 MB)
┌─────────────┬─────────────┬─────────┬───────────┬─────┬───────────────────────┬─────────┐
│ Pattern     │ Library     │ Mode    │ Time (ms) │   % │ Bar                   │ Matches │
├─────────────┼─────────────┼─────────┼───────────┼─────┼───────────────────────┼─────────┤
│ Words       │ erbsland-re │ Unicode │   145.751 │ 100 │ ██████████            │  576584 │
│             │ pcre2       │ Unicode │    66.280 │  45 │ ████▌                 │  576584 │
│             │ erbsland-re │ Ascii   │   144.482 │  99 │ ██████████            │  586871 │
│             │ std::regex  │ Ascii   │   232.292 │ 159 │ ████████████████      │  586871 │
│             │ pcre2       │ Ascii   │    45.934 │  32 │ ███                   │  586871 │
├─────────────┼─────────────┼─────────┼───────────┼─────┼───────────────────────┼─────────┤
│ Capitalized │ erbsland-re │ Unicode │    86.961 │ 100 │ ██████████            │   50038 │
│             │ pcre2       │ Unicode │     8.242 │   9 │ █                     │   50038 │
│             │ erbsland-re │ Ascii   │    86.307 │  99 │ ██████████            │   60182 │
│             │ std::regex  │ Ascii   │   215.900 │ 248 │ ████████████████████► │   60182 │
│             │ pcre2       │ Ascii   │     7.872 │   9 │ █                     │   60182 │
└─────────────┴─────────────┴─────────┴───────────┴─────┴───────────────────────┴─────────┘

Benchmarking file: shakespeare.html (6.98 MB)
┌───────────────────┬─────────────┬─────────┬───────────┬─────┬───────────────────────┬─────────┐
│ Pattern           │ Library     │ Mode    │ Time (ms) │   % │ Bar                   │ Matches │
├───────────────────┼─────────────┼─────────┼───────────┼─────┼───────────────────────┼─────────┤
│ Words             │ erbsland-re │ Unicode │   310.203 │ 100 │ ██████████            │ 1301628 │
│                   │ pcre2       │ Unicode │   166.467 │  54 │ █████▌                │ 1301628 │
│                   │ erbsland-re │ Ascii   │   309.185 │ 100 │ ██████████            │ 1301773 │
│                   │ std::regex  │ Ascii   │   521.527 │ 168 │ █████████████████     │ 1301773 │
│                   │ pcre2       │ Ascii   │   100.895 │  33 │ ███▌                  │ 1301773 │
├───────────────────┼─────────────┼─────────┼───────────┼─────┼───────────────────────┼─────────┤
│ Capitalized       │ erbsland-re │ Unicode │   205.873 │ 100 │ ██████████            │  184903 │
│                   │ pcre2       │ Unicode │    29.793 │  14 │ █▌                    │  184903 │
│                   │ erbsland-re │ Ascii   │   217.043 │ 105 │ ██████████▌           │  185027 │
│                   │ std::regex  │ Ascii   │   493.012 │ 239 │ ████████████████████► │  185027 │
│                   │ pcre2       │ Ascii   │    27.376 │  13 │ █▌                    │  185027 │
├───────────────────┼─────────────┼─────────┼───────────┼─────┼───────────────────────┼─────────┤
│ URI               │ erbsland-re │ Unicode │   146.271 │ 100 │ ██████████            │      10 │
│                   │ pcre2       │ Unicode │     8.598 │   6 │ ▌                     │      10 │
│                   │ erbsland-re │ Ascii   │   146.246 │ 100 │ ██████████            │      10 │
│                   │ std::regex  │ Ascii   │   409.767 │ 280 │ ████████████████████► │      10 │
│                   │ pcre2       │ Ascii   │     8.365 │   6 │ ▌                     │      10 │
├───────────────────┼─────────────┼─────────┼───────────┼─────┼───────────────────────┼─────────┤
│ ExtractTocLinks   │ erbsland-re │ Unicode │   432.889 │ 100 │ ██████████            │      44 │
│                   │ pcre2       │ Unicode │     6.351 │   1 │                       │      44 │
│                   │ erbsland-re │ Ascii   │   432.571 │ 100 │ ██████████            │      44 │
│                   │ std::regex  │ Ascii   │   557.580 │ 129 │ █████████████         │      44 │
│                   │ pcre2       │ Ascii   │     6.175 │   1 │                       │      44 │
├───────────────────┼─────────────┼─────────┼───────────┼─────┼───────────────────────┼─────────┤
│ ExtractLicenseDiv │ erbsland-re │ Unicode │   432.680 │ 100 │ ██████████            │       1 │
│                   │ pcre2       │ Unicode │     6.412 │   1 │                       │       1 │
│                   │ erbsland-re │ Ascii   │   432.290 │ 100 │ ██████████            │       1 │
│                   │ std::regex  │ Ascii   │   543.849 │ 126 │ ████████████▌         │       1 │
│                   │ pcre2       │ Ascii   │     6.205 │   1 │                       │       1 │
├───────────────────┼─────────────┼─────────┼───────────┼─────┼───────────────────────┼─────────┤
│ HTML Tags         │ erbsland-re │ Unicode │   195.751 │ 100 │ ██████████            │  159584 │
│                   │ pcre2       │ Unicode │    18.203 │   9 │ █                     │  159584 │
│                   │ erbsland-re │ Ascii   │   190.732 │  97 │ █████████▌            │  159584 │
│                   │ std::regex  │ Ascii   │   416.028 │ 213 │ ████████████████████► │  159584 │
│                   │ pcre2       │ Ascii   │    18.031 │   9 │ █                     │  159584 │
└───────────────────┴─────────────┴─────────┴───────────┴─────┴───────────────────────┴─────────┘

Benchmarking file: shakespeare.txt (5.38 MB)
┌─────────────┬─────────────┬─────────┬───────────┬─────┬────────────────────┬─────────┐
│ Pattern     │ Library     │ Mode    │ Time (ms) │   % │ Bar                │ Matches │
├─────────────┼─────────────┼─────────┼───────────┼─────┼────────────────────┼─────────┤
│ Words       │ erbsland-re │ Unicode │   247.719 │ 100 │ ██████████         │  996052 │
│             │ pcre2       │ Unicode │   119.407 │  48 │ █████              │  996052 │
│             │ erbsland-re │ Ascii   │   242.508 │  98 │ ██████████         │  996199 │
│             │ std::regex  │ Ascii   │   403.254 │ 163 │ ████████████████▌  │  996199 │
│             │ pcre2       │ Ascii   │    79.064 │  32 │ ███                │  996199 │
├─────────────┼─────────────┼─────────┼───────────┼─────┼────────────────────┼─────────┤
│ Capitalized │ erbsland-re │ Unicode │   212.670 │ 100 │ ██████████         │  180019 │
│             │ pcre2       │ Unicode │    27.846 │  13 │ █▌                 │  180019 │
│             │ erbsland-re │ Ascii   │   162.808 │  77 │ ███████▌           │  180140 │
│             │ std::regex  │ Ascii   │   380.359 │ 179 │ ██████████████████ │  180140 │
│             │ pcre2       │ Ascii   │    26.088 │  12 │ █                  │  180140 │
└─────────────┴─────────────┴─────────┴───────────┴─────┴────────────────────┴─────────┘

Pattern Legend

  • Words
    \w+
  • Capitalized
    \b[A-Z][a-z]*\b
  • URI
    https?://[a-zA-Z0-9\.]+
  • ExtractTocLinks
    <a href="#(chap([0-9]{2}))" class="pginternal">([^<]+)</a>
  • ExtractLicenseDiv
    <div id="(([^-\"]+)-([^-"]+)-([^"]+))">([^<]+)</div>
  • HTML Tags
    <[a-z1-6]+[^>]*>
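
As a rough illustration of how the "Matches" column can be produced for the std::regex rows, the following sketch counts non-overlapping matches of a pattern in a text. The helper `countMatches` is a hypothetical name, and this is an assumption about the general approach, not the actual benchmark harness.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Counts non-overlapping matches of `pattern` in `text`,
// using std::regex with its default ECMAScript grammar.
std::size_t countMatches(const std::string &text, const std::string &pattern) {
    const std::regex re(pattern);
    const std::sregex_iterator begin(text.begin(), text.end(), re);
    const std::sregex_iterator end; // default-constructed end-of-sequence iterator
    return static_cast<std::size_t>(std::distance(begin, end));
}
```

For example, `countMatches("Hello World and more", R"(\b[A-Z][a-z]*\b)")` finds the two capitalized words.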

Requirements

  • A C++20-compliant compiler:
    • Clang
    • GCC
    • MSVC
  • CMake 3.23 or newer

License

Copyright © 2026 Tobias Erbsland
https://erbsland.dev/

Licensed under the Apache License, Version 2.0.
You may obtain a copy at:
http://www.apache.org/licenses/LICENSE-2.0

Distributed on an “AS IS” basis, without warranties or conditions of any kind.
See the LICENSE file for full details.
