regex cheat sheet
Marcel Schmalzl edited this page Mar 11, 2025 · 6 revisions
- Escape special characters with a preceding `\`
- Greediness
  - Regexes are greedy by default: they match as many characters as possible while still satisfying the pattern
  - Appending `?` to a quantifier makes it non-greedy (it matches as few characters as possible)
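
The difference between greedy and non-greedy matching is easiest to see side by side. The following is a minimal sketch using Python's `re` module; the sample string is made up for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"  # hypothetical sample input

# Greedy: `.+` grabs as much as possible, so the match runs to the LAST `>`
greedy = re.search(r"<.+>", text).group(0)

# Non-greedy: `.+?` stops at the FIRST `>` that satisfies the pattern
lazy = re.search(r"<.+?>", text).group(0)

print(greedy)  # <b>bold</b> and <i>italic</i>
print(lazy)    # <b>
```
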
| Pattern | Description |
|---|---|
| `.` | Any character (except a newline by default) |
| `^` | Beginning of line |
| `$` | End of line |
| `[a-c8]` | One of the characters a, b, c or 8 |
| `[^chars]` | Any character except c, h, a, r, s |
| `(...)` | Capture group |
| `(a(b)c)` | Nested capture groups: `\1` = `abc`; `\2` = `b` |
| `(a)?` | Optional capture group; `Abc(a)?` matches `Abc` and `Abca` |

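
As a rough illustration of the group rows above, this sketch shows how nested and optional capture groups are numbered in Python's `re` module. The input strings are invented for the example.

```python
import re

# Nested groups are numbered by the position of their opening parenthesis
nested = re.search(r"(a(b)c)", "xabcx")
outer, inner = nested.group(1), nested.group(2)
print(outer, inner)  # abc b

# An optional group that did not take part in the match yields None
optional = re.compile(r"Abc(a)?")
without_a = optional.fullmatch("Abc").group(1)
with_a = optional.fullmatch("Abca").group(1)
print(without_a, with_a)  # None a

# A character set matches exactly one of the listed characters
found = re.findall(r"[a-c8]", "abcd 789")
print(found)  # ['a', 'b', 'c', '8']
```
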
| Pattern | Description |
|---|---|
| `*` | 0 or more |
| `+` | 1 or more |
| `{N}` | Exactly N occurrences |
| `{N,M}` | N to M occurrences; if omitted: N = 0, M = infinity |
| `{N,M}?` | N to M occurrences, as few as possible |
| `?` | 0 or 1 |

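
The quantifier rows can be checked quickly with `re.findall`. This is a small sketch with an invented input string; note that the braces are written without a space after the comma.

```python
import re

digits = "1 22 333 4444"  # hypothetical sample input

# {2,3}: two to three digits, as many as possible
greedy_range = re.findall(r"\d{2,3}", digits)
print(greedy_range)  # ['22', '333', '444']

# {2,3}?: two to three digits, as few as possible
lazy_range = re.findall(r"\d{2,3}?", digits)
print(lazy_range)  # ['22', '33', '44', '44']

# {2,}: upper bound omitted, so two or more digits
open_ended = re.findall(r"\d{2,}", digits)
print(open_ended)  # ['22', '333', '4444']
```
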
| Pattern | Description |
|---|---|
| `\w` | `[a-zA-Z0-9_]` (alphanumeric or underscore) |
| `\W` | `[^a-zA-Z0-9_]` (not alphanumeric or underscore) |
| `\d` | `[0-9]` (digit) |
| `\D` | `[^0-9]` (non-digit) |
| `\b` | Empty string at a word boundary (between `\w` and `\W`) |
| `\B` | Empty string not at a word boundary |
| `\s` | `[ \t\n\r\f\v]` (whitespace) |
| `\S` | `[^ \t\n\r\f\v]` (non-whitespace) |
| `\A` | Beginning of string |
| `\Z` | End of string |
| `\g<id>` | Previously defined group |
| `R\|S` | Regex R or S |

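
Word boundaries and string anchors are the entries in this table that most often cause confusion. The sketch below, with made-up sample strings, shows how `\b`, `\A` and `\s` behave in Python's `re` module.

```python
import re

text = "cat concat category"  # hypothetical sample input

# \b on both sides: only the standalone word matches
whole_word = re.findall(r"\bcat\b", text)
print(whole_word)  # ['cat']

# \b only at the start: every word beginning with "cat"
word_start = re.findall(r"\bcat\w*", text)
print(word_start)  # ['cat', 'category']

multi = "one\ntwo"
# With re.M, ^ matches at the start of every line ...
line_starts = re.findall(r"^\w+", multi, re.M)
# ... while \A still matches only at the very start of the string
string_start = re.findall(r"\A\w+", multi, re.M)
print(line_starts, string_start)  # ['one', 'two'] ['one']

# \s+ splits on any run of whitespace
tokens = re.split(r"\s+", "a \t b\nc")
print(tokens)  # ['a', 'b', 'c']
```
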
| Pattern | Description |
|---|---|
| `(?:...)` | Non-capturing group (groups the pattern but does not store the match) |
| `(?<name>A)` | Define named group; A = regex, `name` = name used to reference the group |
| `(?P<name>A)` | Same as before (the form Python's `re` uses); the first form is not supported by every engine |
| `(?P=name)` | Match what the named group `name` matched earlier |
| `(?#...)` | Comment (use for documentation) |
| `(?=...)` | Lookahead; matches without consuming |
| `(?!...)` | Negative lookahead |
| `(?<=...)` | Lookbehind; matches without consuming |
| `(?<!...)` | Negative lookbehind |
| `(?(A)B\|C)` | Match `B` if group A matched, else `C` |

You can combine multiple lookarounds (lookahead, lookbehind and their negative forms) as a logical AND, since they do not consume any characters. To match lines that contain both `product` and `development` in any order: `^(?=.*product)(?=.*development).*$`.
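
The AND pattern from the paragraph above can be tried out as follows. This is only a sketch; the sample sentences are invented.

```python
import re

# Both lookaheads start at the beginning of the line and must each succeed
both = re.compile(r"^(?=.*product)(?=.*development).*$")

samples = [
    "product development roadmap",  # both words, in order
    "development of the product",   # both words, reversed order
    "product only",                 # one word missing
]

results = [bool(both.search(s)) for s in samples]
print(results)  # [True, True, False]
```
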
| Pattern | Description |
|---|---|
| `\1`, `\2`, ... `\n` | Backreference; get the match of the n-th capturing group |

You can also backreference capture groups in the search pattern and use them in the replacement. The backreference syntax differs between editors:

- PyCharm: `$n` instead of `\n`
- Notepad++: `\n`
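
In Python, backreferences work both inside the pattern and in the replacement string of `re.sub`. The sketch below uses made-up inputs; `\g<1>` is the Python replacement-string spelling of `\1`.

```python
import re

# Backreference in the replacement: swap "last, first" into "first last"
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Doe, Jane")
print(swapped)  # Jane Doe

# Backreference in the pattern: collapse a doubled word
deduped = re.sub(r"\b(\w+) \1\b", r"\1", "this is is a test")
print(deduped)  # this is a test

# \g<1> is equivalent to \1 and avoids ambiguity when a digit follows
suffixed = re.sub(r"(\d+)", r"\g<1>0", "order 42")
print(suffixed)  # order 420
```
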
- `re.compile()`
- `re.search()`
- `match.groups()` or `match.group(<group_name>)`
```python
import re

# "Normal" syntax
pattModuleSummary = re.compile(r"[0-9a-f]{8}")  # Matches 8-character hex numbers

lines = ["00401000+00000a2c", "no match here"]  # Example input (hypothetical)

# Find and print matches
for line in lines:
    match = pattModuleSummary.search(line)
    # Check if we have at least one match
    if match:
        # The pattern has no capture groups, so print the whole match
        print(match.group(0))
```

Comment + multiline syntax (ignores whitespace and (Python) comments):
```python
import re

pattModuleSummary = re.compile(r"""
    ([0-9a-f]{8})            # Origin
    (?:\+{1})([0-9a-f]{8})   # Size
    """, re.X)  # <-- re.X is important!!

lines = ["00401000+00000a2c"]  # Example input (hypothetical)

# Find and print matches
for line in lines:
    match = pattModuleSummary.search(line)
    # Check if we have at least one match
    if match:
        # Print matched groups
        print(match.groups())
```

`re.X` (`re.VERBOSE`) is necessary if you want to use the multiline `re.compile` syntax.
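```python
import re

# Raw string, so that `\+` reaches the regex engine unchanged
pattern1 = re.compile(r'^(?P<addr>[0-9a-f]{8,16})\+(?P<size>[0-9a-f]{8,})$')

line = "00401000+00000a2c"  # Example input (hypothetical)
match = pattern1.search(line)
if match:
    print(match.group('addr'))  # References only the group `addr`
```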
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License *.
Code (snippets) are licensed under a MIT License *.
* Unless stated otherwise