I have moved a Pandas script I wrote from one computer to another. When running it on the new computer I am getting this error but am unsure what is causing it.

    dfm = master_df
    dfa = pd.read_csv(path)
    dfa["Size"] = pd.cut(dfa["NOMSIZE_IN_MM_U"],bins=[0,300,600,float('inf')])
    dfa["Depth"] = pd.cut(dfa["DEPTH_U"],bins=[0,2,4,6,float('inf')])
    dfm['Size'] = pd.cut(dfm['NOMSIZE_IN_MM'], bins = [0,300,600,float('inf')])
    dfm['Depth'] = pd.cut(dfm['AVE_DEPTH'], bins = [0,2,4,6,float('inf')])
    master_df = dfm.join(dfa.set_index(['Size', 'Depth'])['REPAIR_DURATION'],on=['Size', 'Depth'])

Returns:

    Traceback (most recent call last):
      File "s:/!AMD Share/Julian D - Student/LARM Gravity/Python Scripts/LARM3_GS.py", line 442, in <module>
        master_df = dfm.join(dfa.set_index(['Size', 'Depth'])['REPAIR_DURATION'],on=['Size', 'Depth'])
      File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4767, in join
        rsuffix=rsuffix, sort=sort)
      File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4782, in _join_compat
        suffixes=(lsuffix, rsuffix), sort=sort)
      File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 54, in merge
        return op.get_result()
      File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 569, in get_result
        join_index, left_indexer, right_indexer = self._get_join_info()
      File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 726, in _get_join_info
        sort=self.sort)
      File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 1353, in _left_join_on_index
        _get_multiindex_indexer(join_keys, right_ax, sort=sort)
      File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 1304, in _get_multiindex_indexer
        rlab, llab, shape = map(list, zip(* map(fkeys, index.levels, join_keys)))
      File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 1390, in _factorize_keys
        lk.is_dtype_equal(rk)):
    AttributeError: 'CategoricalIndex' object has no attribute 'is_dtype_equal'

Where dfa:

        NOMSIZE_IN_MM_U  DEPTH_U  REPAIR_DURATION
    0               300        2                1
    1               300        4                1
    2               300        6                2
    3               300        8                3
    4               600        2                2
    5               600        4                2
    6               600        6                2
    7               600        8                5
    8               900        2                4
    9               900        4                4
    10              900        6                5
    11              900        8               10

Master Data:

       ID  AVE_DEPTH  NOMSIZE_IN_MM
    1   0      3.985            915
    2   1      2.655            915
    3   2      4.200            915

1 Answer

  • The code below was tested in pandas 1.2.1. If the new computer has an older version installed, update it with pip or conda, depending on your environment.
  • Also review Pandas Merging 101 for a full breakdown of merging and joining dataframes.
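Since the script worked on one computer and fails on the other, a reasonable first step is to compare the installed pandas versions. The snippet below is a minimal sketch: `major_minor` is a hypothetical helper written for this example (it is not part of pandas), and the `(1, 2)` threshold is only the version this answer was tested with, not a confirmed minimum.

```python
import pandas as pd

# Print the installed pandas version; run this on both machines and compare.
print(pd.__version__)


def major_minor(version: str) -> tuple:
    """Hypothetical helper: turn '1.2.1' into (1, 2) for a simple comparison."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


installed = major_minor(pd.__version__)

# (1, 2) is the version the answer was tested with, used here as an assumed threshold.
if installed < (1, 2):
    print("pandas is older than the tested 1.2.x release; consider updating")
```

If the two machines report different versions, updating the older one is likely the simplest fix.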

Sample DataFrames and Setup

    import pandas as pd

    # test dataframes
    dfm = pd.DataFrame({'ID': [0, 1, 2],
                        'AVE_DEPTH': [3.985, 2.655, 4.200],
                        'NOMSIZE_IN_MM': [915, 915, 915]})
    dfa = pd.DataFrame({'NOMSIZE_IN_MM_U': [300, 300, 300, 300, 600, 600, 600, 600, 900, 900, 900, 900],
                        'DEPTH_U': [2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6, 8],
                        'REPAIR_DURATION': [1, 1, 2, 3, 2, 2, 2, 5, 4, 4, 5, 10]})

    # add bins
    dfa["Size"] = pd.cut(dfa["NOMSIZE_IN_MM_U"],bins=[0,300,600,float('inf')])
    dfa["Depth"] = pd.cut(dfa["DEPTH_U"],bins=[0,2,4,6,float('inf')])
    dfm['Size'] = pd.cut(dfm['NOMSIZE_IN_MM'], bins = [0,300,600,float('inf')])
    dfm['Depth'] = pd.cut(dfm['AVE_DEPTH'], bins = [0,2,4,6,float('inf')])

    # join or merge the dataframes

.join

  • Combine on indices
    # set index - it's better to be explicit
    dfm.set_index(['Size', 'Depth'], inplace=True)
    dfa.set_index(['Size', 'Depth'], inplace=True)

    # join dataframes
    df = dfm.join(dfa.REPAIR_DURATION)

    # display(df)
                             ID  AVE_DEPTH  NOMSIZE_IN_MM  REPAIR_DURATION
    Size         Depth
    (600.0, inf] (2.0, 4.0]   0      3.985            915                4
                 (2.0, 4.0]   1      2.655            915                4
                 (4.0, 6.0]   2      4.200            915                5
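If updating pandas is not possible, one workaround to consider is joining on integer bin codes instead of interval categories, so that no `CategoricalIndex` is involved. This is a sketch under that assumption and has not been verified against the older pandas release from the question; `size_bins` and `depth_bins` are names introduced here for illustration.

```python
import pandas as pd

# Same sample data as in the setup above
dfm = pd.DataFrame({'ID': [0, 1, 2],
                    'AVE_DEPTH': [3.985, 2.655, 4.200],
                    'NOMSIZE_IN_MM': [915, 915, 915]})
dfa = pd.DataFrame({'NOMSIZE_IN_MM_U': [300, 300, 300, 300, 600, 600, 600, 600, 900, 900, 900, 900],
                    'DEPTH_U': [2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6, 8],
                    'REPAIR_DURATION': [1, 1, 2, 3, 2, 2, 2, 5, 4, 4, 5, 10]})

# Illustrative names for the bin edges used in the question
size_bins = [0, 300, 600, float('inf')]
depth_bins = [0, 2, 4, 6, float('inf')]

# labels=False makes pd.cut return the integer position of each bin
# rather than an Interval, so the key columns are plain integers.
dfa['Size'] = pd.cut(dfa['NOMSIZE_IN_MM_U'], bins=size_bins, labels=False)
dfa['Depth'] = pd.cut(dfa['DEPTH_U'], bins=depth_bins, labels=False)
dfm['Size'] = pd.cut(dfm['NOMSIZE_IN_MM'], bins=size_bins, labels=False)
dfm['Depth'] = pd.cut(dfm['AVE_DEPTH'], bins=depth_bins, labels=False)

# Same join call as in the question, now keyed on integer codes
df = dfm.join(dfa.set_index(['Size', 'Depth'])['REPAIR_DURATION'],
              on=['Size', 'Depth'])
print(df)
```

The trade-off is readability: the `Size` and `Depth` columns then hold bin positions (0, 1, 2, ...) rather than the interval labels.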

.merge

  • Combine on a combination of index and columns
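    # merge dataframes on the Size and Depth columns
    # (run this on the freshly set-up dataframes, before any set_index call,
    # because Size and Depth must still be columns)
    df = dfm.merge(dfa[['Size', 'Depth', 'REPAIR_DURATION']], on=['Size', 'Depth'])

    # display(df)
       ID  AVE_DEPTH  NOMSIZE_IN_MM          Size       Depth  REPAIR_DURATION
    0   0      3.985            915  (600.0, inf]  (2.0, 4.0]                4
    1   1      2.655            915  (600.0, inf]  (2.0, 4.0]                4
    2   2      4.200            915  (600.0, inf]  (4.0, 6.0]                5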
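One difference worth noting: `DataFrame.join` performs a left join by default, while `DataFrame.merge` defaults to an inner join, which drops rows of `dfm` that have no matching bin in `dfa`. The sketch below passes `how='left'` to keep every master row, and `indicator=True` to add a `_merge` column showing whether each row found a match. It reuses the sample data from the setup; with that data every row matches.

```python
import pandas as pd

# Same sample data and bins as in the setup above
dfm = pd.DataFrame({'ID': [0, 1, 2],
                    'AVE_DEPTH': [3.985, 2.655, 4.200],
                    'NOMSIZE_IN_MM': [915, 915, 915]})
dfa = pd.DataFrame({'NOMSIZE_IN_MM_U': [300, 300, 300, 300, 600, 600, 600, 600, 900, 900, 900, 900],
                    'DEPTH_U': [2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6, 8],
                    'REPAIR_DURATION': [1, 1, 2, 3, 2, 2, 2, 5, 4, 4, 5, 10]})

dfa['Size'] = pd.cut(dfa['NOMSIZE_IN_MM_U'], bins=[0, 300, 600, float('inf')])
dfa['Depth'] = pd.cut(dfa['DEPTH_U'], bins=[0, 2, 4, 6, float('inf')])
dfm['Size'] = pd.cut(dfm['NOMSIZE_IN_MM'], bins=[0, 300, 600, float('inf')])
dfm['Depth'] = pd.cut(dfm['AVE_DEPTH'], bins=[0, 2, 4, 6, float('inf')])

# how='left' keeps every row of dfm, matching the default behaviour of .join;
# indicator=True adds a '_merge' column ('both' or 'left_only' for a left merge).
df = dfm.merge(dfa[['Size', 'Depth', 'REPAIR_DURATION']],
               on=['Size', 'Depth'], how='left', indicator=True)
print(df)
```

Rows marked `left_only` in `_merge` would be master records whose size and depth bins have no repair duration defined.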