Regular Expression Reference: Unicode Boundaries

Flavors compared: JGsoft, Python, JavaScript, VBScript, XRegExp, .NET, Java, ICU, RE2, Perl, PCRE, PCRE2, PHP, Delphi, R, Ruby, std::regex, Boost, Tcl, POSIX, GNU, Oracle, XML, XPath. Each entry lists Feature, Syntax, Description, Example, and Support. A version number under Support is the value shown in that flavor's column.

Feature: UAX 29 word boundary
Syntax: \b and \B
Description: \b matches at a word boundary (÷) and \B matches at a word non-boundary (×) according to Unicode Standard Annex #29.
Example: (?w)\b.\b matches ! because the WB1 and WB2 rules in UAX 29 treat the start and end of the string as a word boundary.
Support: ICU (option); n/a for Tcl, POSIX, Oracle, XML, and XPath; no for all other flavors.

Feature: UAX 29 word boundary
Syntax: \b{wb}
Description: Matches at a word boundary (÷) according to Unicode Standard Annex #29.
Example: \b{wb}.\b{wb} matches ! because the WB1 and WB2 rules in UAX 29 treat the start and end of the string as a word boundary.
Support: Perl (5.22); no for all other flavors.

Feature: UAX 29 word non-boundary
Syntax: \B{wb}
Description: Matches at a word non-boundary (×) according to Unicode Standard Annex #29.
Example: \B{wb}'\B{wb} matches ' in John's because the WB6 and WB7 rules in UAX 29 treat the positions before and after an apostrophe between two letters as a word non-boundary.
Support: Perl (5.22); no for all other flavors.

Feature: UAX 29 grapheme boundary
Syntax: \b{g}
Description: Matches at a grapheme boundary (÷) according to Unicode Standard Annex #29.
Example: \b{g} matches at the start and the end of the 4 strings à (U+0061 U+0300), คู (U+0E0F U+0E39), अः (U+0905 U+0903), and ガ (U+FF76 U+FF9F), but not in the middle of these strings.
Support: Java (9); Perl (5.22); no for all other flavors.

Feature: UAX 29 grapheme non-boundary
Syntax: \B{g}
Description: Matches at a grapheme non-boundary (×) according to Unicode Standard Annex #29.
Example: \B{g} matches between the 2 code points in each of the 4 strings à (U+0061 U+0300), คู (U+0E0F U+0E39), अः (U+0905 U+0903), and ガ (U+FF76 U+FF9F), but not at the start or end of these strings.
Support: Perl (5.22); no for all other flavors.

Feature: UAX 29 grapheme boundary
Syntax: \b{gcb}
Description: Matches at a grapheme boundary (÷) according to Unicode Standard Annex #29.
Example: \b{gcb} matches at the start and the end of the 4 strings à (U+0061 U+0300), คู (U+0E0F U+0E39), अः (U+0905 U+0903), and ガ (U+FF76 U+FF9F), but not in the middle of these strings.
Support: Perl (5.22); no for all other flavors.

Feature: UAX 29 grapheme non-boundary
Syntax: \B{gcb}
Description: Matches at a grapheme non-boundary (×) according to Unicode Standard Annex #29.
Example: \B{gcb} matches between the 2 code points in each of the 4 strings à (U+0061 U+0300), คู (U+0E0F U+0E39), अः (U+0905 U+0903), and ガ (U+FF76 U+FF9F), but not at the start or end of these strings.
Support: Perl (5.22); no for all other flavors.

Feature: UAX 29 sentence boundary
Syntax: \b{sb}
Description: Matches at a sentence boundary (÷) according to Unicode Standard Annex #29.
Example: \b{sb}.+?\b{sb} matches One! and then Two! in One! Two!
Support: Perl (5.22); no for all other flavors.

Feature: UAX 29 sentence non-boundary
Syntax: \B{sb}
Description: Matches at a sentence non-boundary (×) according to Unicode Standard Annex #29.
Support: Perl (5.22); no for all other flavors.

Feature: UAX 14 line boundary
Syntax: \b{lb}
Description: Matches at a line boundary (÷) according to Unicode Standard Annex #14.
Example: \b{lb}.+?\b{lb} matches only Two! in One! Two!
Support: Perl (5.24); no for all other flavors.

Feature: UAX 14 line non-boundary
Syntax: \B{lb}
Description: Matches at a line non-boundary (×) according to Unicode Standard Annex #14.
Support: Perl (5.24); no for all other flavors.
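
The grapheme boundary entry lists Java 9 as supporting \b{g}. The following is a minimal sketch of how that token might be tried out with java.util.regex, assuming Java 9 or later. The class name GraphemeBoundaryDemo and the helper methods boundaries and graphemes are hypothetical names chosen for this illustration. The test string is a followed by U+0300, as in the table's first example string, with a plain b appended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of any library.
class GraphemeBoundaryDemo {
    // \b{g} is a zero-width match at a grapheme cluster boundary (Java 9+).
    private static final Pattern GRAPHEME_BOUNDARY = Pattern.compile("\\b{g}");

    // Returns the char indexes at which \b{g} matches in the text.
    static List<Integer> boundaries(String text) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = GRAPHEME_BOUNDARY.matcher(text);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    // Splits the text into grapheme clusters by cutting at each boundary.
    static List<String> graphemes(String text) {
        List<String> clusters = new ArrayList<>();
        for (String piece : GRAPHEME_BOUNDARY.split(text)) {
            if (!piece.isEmpty()) {
                clusters.add(piece);
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        // "a" + combining grave accent (U+0300), followed by "b"
        String text = "a\u0300b";
        System.out.println(boundaries(text));       // prints [0, 2, 3]
        System.out.println(graphemes(text).size()); // prints 2
    }
}
```

No boundary is reported at index 1, between a and the combining accent, which is the position where \B{g} would match in a flavor that supports it.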