Closed
Replace object block
dsaxton committed Jun 20, 2020
commit f45017f401d237c95dc39c466a7cfef0decce900
10 changes: 7 additions & 3 deletions pandas/core/internals/blocks.py
@@ -688,9 +688,13 @@ def replace(
         # retry
         if not self._can_hold_element(to_replace):
             if not isinstance(to_replace, list):
-                if inplace:
-                    return [self]
-                return [self.copy()]
+                return self.astype(object).replace(
Contributor

We only want to do this when the types actually differ, not in general. Please show asv results for this.

Contributor Author

I think the `if not self._can_hold_element(to_replace)` check already ensures that the types differ.

Not seeing any performance regressions from the change (at least in the replace benchmarks):

(pandas-dev) > asv continuous -f 1.1 upstream/master replace-diff-dtype -b ^replace
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.
·· Installing 1e014777 <replace-diff-dtype> into conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..
· Running 12 total benchmarks (2 commits * 1 environments * 6 benchmarks)
[  0.00%] · For pandas commit 9114b4bc <master> (round 1/2):
[  0.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[  0.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[  4.17%] ··· Running (replace.Convert.time_replace--)....
[ 20.83%] ··· Running (replace.ReplaceList.time_replace_list--)..
[ 25.00%] · For pandas commit 1e014777 <replace-diff-dtype> (round 1/2):
[ 25.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 25.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 29.17%] ··· Running (replace.Convert.time_replace--)....
[ 45.83%] ··· Running (replace.ReplaceList.time_replace_list--)..
[ 50.00%] · For pandas commit 1e014777 <replace-diff-dtype> (round 2/2):
[ 50.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 54.17%] ··· replace.Convert.time_replace                                  ok
[ 54.17%] ··· ============= ============ ============
              --                   replace_data
              ------------- -------------------------
               constructor   Timestamp    Timedelta
              ============= ============ ============
                DataFrame     98.6±3ms    95.2±0.5ms
                  Series     74.0±0.3ms    74.6±1ms
              ============= ============ ============
[ 58.33%] ··· replace.FillNa.time_fillna                                    ok
[ 58.33%] ··· ========= =============
               inplace
              --------- -------------
                 True    1.04±0.01ms
                False     3.01±0.3ms
              ========= =============
[ 62.50%] ··· replace.FillNa.time_replace                                   ok
[ 62.50%] ··· ========= =============
               inplace
              --------- -------------
                 True    1.10±0.02ms
                False    4.02±0.07ms
              ========= =============
[ 66.67%] ··· replace.ReplaceDict.time_replace_series                       ok
[ 66.67%] ··· ========= ============
               inplace
              --------- ------------
                 True    4.27±0.02s
                False    4.24±0.04s
              ========= ============
[ 70.83%] ··· replace.ReplaceList.time_replace_list                         ok
[ 70.83%] ··· ========= ============
               inplace
              --------- ------------
                 True    51.6±0.6μs
                False     405±5ms
              ========= ============
[ 75.00%] ··· replace.ReplaceList.time_replace_list_one_match               ok
[ 75.00%] ··· ========= ==========
               inplace
              --------- ----------
                 True    89.2±3ms
                False    467±4ms
              ========= ==========
[ 75.00%] · For pandas commit 9114b4bc <master> (round 2/2):
[ 75.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 75.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 79.17%] ··· replace.Convert.time_replace                                  ok
[ 79.17%] ··· ============= =========== ===========
              --                  replace_data
              ------------- -----------------------
               constructor   Timestamp   Timedelta
              ============= =========== ===========
                DataFrame     102±4ms     103±3ms
                  Series      80.0±4ms    83.1±4ms
              ============= =========== ===========
[ 83.33%] ··· replace.FillNa.time_fillna                                    ok
[ 83.33%] ··· ========= =============
               inplace
              --------- -------------
                 True    1.10±0.04ms
                False     3.04±0.1ms
              ========= =============
[ 87.50%] ··· replace.FillNa.time_replace                                   ok
[ 87.50%] ··· ========= =============
               inplace
              --------- -------------
                 True    1.20±0.04ms
                False     4.26±0.2ms
              ========= =============
[ 91.67%] ··· replace.ReplaceDict.time_replace_series                       ok
[ 91.67%] ··· ========= ============
               inplace
              --------- ------------
                 True    4.26±0.01s
                False    4.24±0.01s
              ========= ============
[ 95.83%] ··· replace.ReplaceList.time_replace_list                         ok
[ 95.83%] ··· ========= ==========
               inplace
              --------- ----------
                 True    49.6±1μs
                False    412±6ms
              ========= ==========
[100.00%] ··· replace.ReplaceList.time_replace_list_one_match               ok
[100.00%] ··· ========= ==========
               inplace
              --------- ----------
                 True    89.3±2ms
                False    464±30ms
              ========= ==========

BENCHMARKS NOT SIGNIFICANTLY CHANGED.
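As a rough cross-check of the log above, the following sketch recomputes the branch/master ratio for each round 2/2 result and lists any that exceed the `-f 1.1` factor. The dictionary layout and the helper names `ratio` and `slowdowns` are illustrative and not part of asv. The sketch compares only the reported means (converted to milliseconds) and ignores the ± spread.

```python
# Mean timings in milliseconds, copied from the round 2/2 tables above.
# Key: (benchmark, parameters); value: (master 9114b4bc, branch 1e014777).
timings_ms = {
    ("Convert.time_replace", "DataFrame/Timestamp"): (102.0, 98.6),
    ("Convert.time_replace", "DataFrame/Timedelta"): (103.0, 95.2),
    ("Convert.time_replace", "Series/Timestamp"): (80.0, 74.0),
    ("Convert.time_replace", "Series/Timedelta"): (83.1, 74.6),
    ("FillNa.time_fillna", "inplace=True"): (1.10, 1.04),
    ("FillNa.time_fillna", "inplace=False"): (3.04, 3.01),
    ("FillNa.time_replace", "inplace=True"): (1.20, 1.10),
    ("FillNa.time_replace", "inplace=False"): (4.26, 4.02),
    ("ReplaceDict.time_replace_series", "inplace=True"): (4260.0, 4270.0),
    ("ReplaceDict.time_replace_series", "inplace=False"): (4240.0, 4240.0),
    ("ReplaceList.time_replace_list", "inplace=True"): (0.0496, 0.0516),
    ("ReplaceList.time_replace_list", "inplace=False"): (412.0, 405.0),
    ("ReplaceList.time_replace_list_one_match", "inplace=True"): (89.3, 89.2),
    ("ReplaceList.time_replace_list_one_match", "inplace=False"): (464.0, 467.0),
}

FACTOR = 1.1  # the value passed to `asv continuous -f`


def ratio(master, branch):
    """Return branch time divided by master time (> 1 means slower)."""
    return branch / master


def slowdowns(timings, factor):
    """List the benchmarks whose branch/master ratio exceeds `factor`."""
    return [
        key
        for key, (master, branch) in timings.items()
        if ratio(master, branch) > factor
    ]


if __name__ == "__main__":
    for key, (master, branch) in timings_ms.items():
        print(f"{key[0]} [{key[1]}]: ratio {ratio(master, branch):.3f}")
    print("slowdowns above factor:", slowdowns(timings_ms, FACTOR))
```

On these numbers, no benchmark crosses the 1.1 threshold in the slower direction.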
+                    to_replace=to_replace,
+                    value=value,
+                    inplace=inplace,
+                    regex=regex,
+                    convert=convert,
+                )

             to_replace = [x for x in to_replace if self._can_hold_element(x)]
             if not len(to_replace):
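To make the trade-off under review concrete, here is a minimal standalone sketch of the two branches in the hunk above. `ToyBlock`, its string `dtype` tag, and the `cast_on_mismatch` flag are hypothetical stand-ins invented for illustration; this is not the pandas `Block` class and makes no claim about how pandas itself compares values.

```python
class ToyBlock:
    """Toy stand-in for a typed block of values (not the pandas class)."""

    def __init__(self, values, dtype):
        self.values = list(values)
        self.dtype = dtype  # either "int" or "object" in this sketch

    def _can_hold_element(self, element):
        # An object block holds anything; an int block holds only ints.
        if self.dtype == "object":
            return True
        return isinstance(element, int) and not isinstance(element, bool)

    def copy(self):
        return ToyBlock(self.values, self.dtype)

    def astype_object(self):
        return ToyBlock(self.values, "object")

    def replace(self, to_replace, value, cast_on_mismatch):
        if not self._can_hold_element(to_replace):
            if not cast_on_mismatch:
                # Mirrors the removed lines: return the block unchanged.
                return self.copy()
            # Mirrors the added lines: cast to object and retry.
            return self.astype_object().replace(
                to_replace, value, cast_on_mismatch
            )
        new_values = [value if v == to_replace else v for v in self.values]
        # Fall back to object if the replacement value does not fit.
        new_dtype = self.dtype if self._can_hold_element(value) else "object"
        return ToyBlock(new_values, new_dtype)


block = ToyBlock([1, 2, 3], "int")

# Scalar of a different type: the early return leaves the block as it was.
unchanged = block.replace(1.0, 9, cast_on_mismatch=False)

# Casting to object first lets Python's equality (1 == 1.0) find a match.
retried = block.replace(1.0, 9, cast_on_mismatch=True)
```

In this toy model the cast-and-retry path both changes the result and turns the block into an object block, which is why the reviewer asks for it to be limited to genuinely mismatched types and backed by benchmarks.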