Skip to content

Conversation

@dsaxton
Copy link
Contributor

@dsaxton dsaxton commented Jun 20, 2020

Bug noticed when looking at a different issue #34871 (comment)

@gfyoung gfyoung added Bug DataFrame DataFrame data structure Series Series data structure labels Jun 20, 2020
if inplace:
return [self]
return [self.copy()]
return self.astype(object).replace(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only want to do this if these are different types, not generally. please show asv's for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the if not self._can_hold_element(to_replace) check is making sure we have different types.

Not seeing any performance regressions from the change (at least in the replace benchmarks):

(pandas-dev) > asv continuous -f 1.1 upstream/master replace-diff-dtype -b ^replace · Creating environments · Discovering benchmarks ·· Uninstalling from conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt. ·· Installing 1e014777 <replace-diff-dtype> into conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.. · Running 12 total benchmarks (2 commits * 1 environments * 6 benchmarks) [ 0.00%] · For pandas commit 9114b4bc <master> (round 1/2): [ 0.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt... [ 0.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt [ 4.17%] ··· Running (replace.Convert.time_replace--).... [ 20.83%] ··· Running (replace.ReplaceList.time_replace_list--).. [ 25.00%] · For pandas commit 1e014777 <replace-diff-dtype> (round 1/2): [ 25.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt... [ 25.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt [ 29.17%] ··· Running (replace.Convert.time_replace--).... [ 45.83%] ··· Running (replace.ReplaceList.time_replace_list--).. [ 50.00%] · For pandas commit 1e014777 <replace-diff-dtype> (round 2/2): [ 50.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt [ 54.17%] ··· replace.Convert.time_replace ok [ 54.17%] ··· ============= ============ ============ -- replace_data ------------- ------------------------- constructor Timestamp Timedelta ============= ============ ============ DataFrame 98.6±3ms 95.2±0.5ms Series 74.0±0.3ms 74.6±1ms ============= ============ ============ [ 58.33%] ··· replace.FillNa.time_fillna ok [ 58.33%] ··· ========= ============= inplace --------- ------------- True 1.04±0.01ms False 3.01±0.3ms ========= ============= [ 62.50%] ··· replace.FillNa.time_replace ok [ 62.50%] ··· ========= ============= inplace --------- ------------- True 1.10±0.02ms False 4.02±0.07ms ========= ============= [ 66.67%] ··· replace.ReplaceDict.time_replace_series ok [ 66.67%] ··· ========= ============ inplace --------- ------------ True 4.27±0.02s False 4.24±0.04s ========= ============ [ 70.83%] ··· replace.ReplaceList.time_replace_list ok [ 70.83%] ··· ========= ============ inplace --------- ------------ True 51.6±0.6μs False 405±5ms ========= ============ [ 75.00%] ··· replace.ReplaceList.time_replace_list_one_match ok [ 75.00%] ··· ========= ========== inplace --------- ---------- True 89.2±3ms False 467±4ms ========= ========== [ 75.00%] · For pandas commit 9114b4bc <master> (round 2/2): [ 75.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt... [ 75.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt [ 79.17%] ··· replace.Convert.time_replace ok [ 79.17%] ··· ============= =========== =========== -- replace_data ------------- ----------------------- constructor Timestamp Timedelta ============= =========== =========== DataFrame 102±4ms 103±3ms Series 80.0±4ms 83.1±4ms ============= =========== =========== [ 83.33%] ··· replace.FillNa.time_fillna ok [ 83.33%] ··· ========= ============= inplace --------- ------------- True 1.10±0.04ms False 3.04±0.1ms ========= ============= [ 87.50%] ··· replace.FillNa.time_replace ok [ 87.50%] ··· ========= ============= inplace --------- ------------- True 1.20±0.04ms False 4.26±0.2ms ========= ============= [ 91.67%] ··· replace.ReplaceDict.time_replace_series ok [ 91.67%] ··· ========= ============ inplace --------- ------------ True 4.26±0.01s False 4.24±0.01s ========= ============ [ 95.83%] ··· replace.ReplaceList.time_replace_list ok [ 95.83%] ··· ========= ========== inplace --------- ---------- True 49.6±1μs False 412±6ms ========= ========== [100.00%] ··· replace.ReplaceList.time_replace_list_one_match ok [100.00%] ··· ========= ========== inplace --------- ---------- True 89.3±2ms False 464±30ms ========= ========== BENCHMARKS NOT SIGNIFICANTLY CHANGED. 
@jreback
Copy link
Contributor

jreback commented Jul 17, 2020

can u merge master and will look

@simonjayhawkins
Copy link
Member

@dsaxton can you move release note

@simonjayhawkins
Copy link
Member

marking as stale 😉


**Reshaping**

- Bug where :meth:`DataFrame.replace` and :meth:`Series.replace` would fail when replacement value was of a different dtype but equal to values in the original object (:issue:`34871`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move to 1.2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks similar to regression reported in #35376 with open PR (with different fix) #36444

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it's the same issue. If another fix avoids astyping to object I would think it's preferred.

@jreback jreback added this to the 1.2 milestone Sep 19, 2020
@jreback jreback removed the Stale label Sep 19, 2020
@dsaxton
Copy link
Contributor Author

dsaxton commented Sep 19, 2020

Closing since the other PR looks like a better fix

@dsaxton dsaxton closed this Sep 19, 2020
@dsaxton dsaxton deleted the replace-diff-dtype branch September 19, 2020 14:04
@QuentinN42 QuentinN42 mentioned this pull request Sep 21, 2020
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug DataFrame DataFrame data structure Series Series data structure

4 participants