I'm having trouble getting NSRegularExpression to match patterns on strings that contain wider (?) Unicode characters, such as emoji. It looks like the problem is the range parameter -- Swift counts individual Unicode characters, while Objective-C treats strings as sequences of UTF-16 code units.
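To make the mismatch concrete, here is a small sketch of the two counts. It assumes Swift 5 with Foundation rather than the Swift 1.x syntax used in the rest of this question, and the name `sample` is only illustrative.

```swift
import Foundation

let sample = "dog🐶🐮cow"

// Swift's String counts user-perceived characters (grapheme clusters).
let characterCount = sample.count

// NSString and NSRegularExpression count UTF-16 code units;
// each emoji in this string takes two units (a surrogate pair).
let utf16Count = sample.utf16.count
let nsLength = (sample as NSString).length

print(characterCount, utf16Count, nsLength) // 8 10 10
```

Any NSRange handed to NSRegularExpression has to be measured in the second kind of unit.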

Here is my test string and two regular expressions:

    let str = "dog🐶🐮cow"
    let dogRegex = NSRegularExpression(pattern: "d.g", options: nil, error: nil)!
    let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!

I can match the first regex with no problems:

    let dogMatch = dogRegex.firstMatchInString(str, options: nil, range: NSRange(location: 0, length: countElements(str)))
    println(dogMatch?.range) // (0, 3)

But the second fails with the same parameters, because the range I send it (0...7, eight characters) isn't long enough to cover the whole string as far as NSRegularExpression is concerned; it sees ten UTF-16 code units:

    let cowMatch = cowRegex.firstMatchInString(str, options: nil, range: NSRange(location: 0, length: countElements(str)))
    println(cowMatch?.range) // nil

If I use a different range I can make the match succeed:

    let cowMatch2 = cowRegex.firstMatchInString(str, options: nil, range: NSRange(location: 0, length: str.utf16Count))
    println(cowMatch2?.range) // (7, 3)

but then I don't know how to extract the matched text out of the string, since that range falls outside the range of the Swift string.

+one for the dogcow reference. Commented Feb 14, 2016 at 18:49

1 Answer

Turns out you can fight fire with fire. Using the Swift-native string's utf16Count property and the substringWithRange: method of NSString -- not String -- gets the right result. Here's the full working code:

    let str = "dog🐶🐮cow"
    let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!
    if let cowMatch = cowRegex.firstMatchInString(str, options: nil, range: NSRange(location: 0, length: str.utf16Count)) {
        println((str as NSString).substringWithRange(cowMatch.range)) // prints "cow"
    }

(I figured this out in the process of writing the question; score one for rubber duck debugging.)
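For readers on later toolchains, here is a hedged sketch of the same idea, assuming Swift 5 with Foundation (the code above is Swift 1.x and will not compile there). Instead of going through NSString, it converts between NSRange and Range<String.Index> using the `NSRange(_:in:)` and `Range(_:in:)` initializers; the names `text`, `regex`, and `matched` are illustrative only.

```swift
import Foundation

let text = "dog🐶🐮cow"

// try! is used only because the pattern is a known-valid literal.
let regex = try! NSRegularExpression(pattern: "c.w", options: [])

// Build an NSRange covering the whole string, measured in UTF-16 code units.
let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)

var matched: String? = nil
if let match = regex.firstMatch(in: text, options: [], range: fullRange),
   let swiftRange = Range(match.range, in: text) {
    // Convert the UTF-16 based NSRange back to String indices before slicing.
    matched = String(text[swiftRange])
}
print(matched ?? "no match") // prints "cow"
```

The point is the same as in the answer: the range passed to the regex and the range used for extraction must both be expressed in UTF-16 units, or be converted explicitly.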

1 Comment

If you first convert with let nsstr = str as NSString, you can simply use length: nsstr.length, the Swift spelling of ObjC's [nsstr length].
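A minimal sketch of that suggestion, assuming Swift 5 with Foundation rather than the Swift 1.x used in the answer; `source`, `whole`, and `found` are illustrative names.

```swift
import Foundation

let source = "dog🐶🐮cow"
let nsSource = source as NSString
let pattern = try! NSRegularExpression(pattern: "c.w", options: [])

// NSString.length is already in UTF-16 code units, so it fits NSRange directly.
let whole = NSRange(location: 0, length: nsSource.length)

// Slice through NSString as well, so the match range needs no conversion.
let found = pattern.firstMatch(in: source, options: [], range: whole)
    .map { nsSource.substring(with: $0.range) }
print(found ?? "no match") // prints "cow"
```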
