-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
str.replace('.','') should replace every character? (fix) #24809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
c984582 67b0870 340ae89 5a4e131 cf8dc79 924ecc8 97bf73a da16172 93a0715 File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | @@ -1008,6 +1008,21 @@ def test_replace(self): | |
| values = klass(data) | ||
| pytest.raises(TypeError, values.str.replace, 'a', repl) | ||
| | ||
| # GH 24804 | ||
| def test_replace_single_pattern(self): | ||
| values = Series(['abc', '123']) | ||
| | ||
| result = values.str.replace('.', 'foo', regex=True) | ||
| expected = Series(['foofoofoo', 'foofoofoo']) | ||
| tm.assert_series_equal(result, expected) | ||
| | ||
| def test_replace_without_specifying_regex_parameter(self): | ||
| values = Series(['a.c']) | ||
| | ||
| result = values.str.replace('.', 'b') | ||
| expected = Series(['abc']) | ||
| tm.assert_series_equal(result, expected) | ||
| | ||
| def test_replace_callable(self): | ||
| # GH 15055 | ||
| values = Series(['fooBAD__barBAD', NA]) | ||
| | @@ -2924,7 +2939,7 @@ def test_pipe_failures(self): | |
| | ||
| tm.assert_series_equal(result, exp) | ||
| | ||
| result = s.str.replace('|', ' ') | ||
| result = s.str.replace('|', ' ', regex=False) | ||
| Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm Ok. I think this is correct but arguably an API breaking change so make sure we make note of that in the whatsnew Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how is this an api change? regex=True is the default Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Was just thinking of this particular instance and any where a user was passing in a single character that may have special meaning with a regex. This previously would directly replace a pipe but now requires Being extra conservative but not tied to the request then if you feel its over communicating. Author There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Exactly, but if you think this documentation is not necessary, let me know and I can change. Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm. This is a potentially disruptive change... Can we:
Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK I agree - that's probably the best go-forward path, save the first point which I don't understand. @alarangeiras can you raise a FutureWarning here instead? Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Changing the default If the user passes If the user passes We'll use regex=None to see if the user is explicit or not. Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not just warn when the length of the pattern is 1 and Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Then I think there would be no way to have In [5]: pd.Series(['a.c']).str.replace('.', 'b') # Warning: Interpreting '.' as a literal, not a regex... The default will change in the future. Out[5]: 0 abc dtype: object# no warning In [5]: pd.Series(['a.c']).str.replace('.', 'b', regex=True) Out[5]: 0 bbb dtype: objectunless I"m missing something. Author There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Following this line of reasoning, from what I understand, every bug found should issue a warning of future adjustment? | ||
| exp = Series(['A B C']) | ||
| | ||
| tm.assert_series_equal(result, exp) | ||
| | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not needed, this is a simple bug fix