Closed
Description
Code to replicate error:
    import pandas as pd
    s = pd.Series(["a13a23", "b13", "c13"], index=["A", "B", "C"])
    s.str.extractall("[ab](\d\d)")

Note that the regex [ab](\d) from the documentation page works, whereas [ab](\d\d) above doesn't. It seems that any captured group longer than one character causes this error.
After playing with this a bit more, though, I found that the following regexes all work correctly without error:
    ([ab])(\d\d)
    ()[ab](\d+)
    (a13)(\d\d)

I've reproduced the issue in both versions 0.18.0 and 0.18.1. I'll admit I haven't checked against the master branch, though.
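For reference, here is a minimal sketch of the matches one would expect from the failing pattern and from one of the working patterns, computed with the standard `re` module rather than pandas. The helper name `expected_matches` and the dict `data` are hypothetical stand-ins for the Series above; this only illustrates the expected captures and says nothing about how `extractall` is implemented.

```python
import re

# Same labels and values as the Series in the report (hypothetical stand-in).
data = {"A": "a13a23", "B": "b13", "C": "c13"}


def expected_matches(pattern, values):
    """Return (label, match_number, captured_groups) for every match."""
    rows = []
    for label, text in values.items():
        for match_number, m in enumerate(re.finditer(pattern, text)):
            rows.append((label, match_number, m.groups()))
    return rows


# Pattern that raises in the report: one two-character capture group.
failing = expected_matches(r"[ab](\d\d)", data)
# Pattern that works in the report: two capture groups.
working = expected_matches(r"([ab])(\d\d)", data)

print(failing)
print(working)
```

Row "C" produces no rows for either pattern because "c13" does not start with a or b.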
Note: I posted this to the mailing list but haven't had any responses, so I assume this is a bug.
I'm unsure what the underlying cause is here (maybe it doesn't like the first regex character not being within a capture group?).
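One standard-library behaviour that might be worth ruling out while investigating is sketched below. It is an assumption that this is relevant at all: the report does not confirm that pandas relies on `re.findall` here. What the sketch does show with certainty is that `re.findall` returns plain strings for a pattern with exactly one capture group and tuples for a pattern with several, which lines up with the single-group patterns being the ones that fail.

```python
import re

text = "a13a23"

# One capture group: findall returns a list of plain strings.
one_group = re.findall(r"[ab](\d\d)", text)
# Two capture groups: findall returns a list of tuples.
two_groups = re.findall(r"([ab])(\d\d)", text)

print(one_group)
print(two_groups)

# Iterating a plain string yields its characters, so a two-character
# capture would look like two values if mistaken for a tuple of groups.
# (Whether pandas does this is an assumption, not something the report shows.)
chars = list(one_group[0])
print(chars)
```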