I've extracted data via an API, and the source system has since added a new field. I'm trying to concatenate the two data sets below. There are more fields than this, but these get the point across. How do I merge the following data sets?

The names of existing fields will always stay the same, but columns might be added or removed in the future.

Audit_ID  Start Time        End Time
1         02/09/2019 05:00  02/09/2019 10:45

New data:

Audit_ID  Start Time        End Time          Shift
2         03/09/2019 03:00  03/09/2019 10:45  Afters

This is what I want it to look like:

Audit_ID  Start Time        End Time          Shift
1         02/09/2019 05:00  02/09/2019 10:45
2         03/09/2019 03:00  03/09/2019 10:45  Afters

When I run this code:

joined_rows = pd.concat([data1, data2], axis=0)

it gives this error:

joined_rows = pd.concat([data1, data2])

AssertionError                            Traceback (most recent call last)
<ipython-input-46-469b3f9d61b5> in <module>()
----> 1 joined_rows = pd.concat(data1 , data2], axis=0)
      2 joined_rows

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    256     )
    257
--> 258     return op.get_result()
    259
    260

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\reshape\concat.py in get_result(self)
    471
    472             new_data = concatenate_block_managers(
--> 473                 mgrs_indexers, self.new_axes, concat_axis=self.axis, copy=self.copy
    474             )
    475             if not self.copy:

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py in concatenate_block_managers(mgrs_indexers, axes, concat_axis, copy)
   2057         blocks.append(b)
   2058
-> 2059     return BlockManager(blocks, axes)

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py in __init__(self, blocks, axes, do_integrity_check)
    141
    142         if do_integrity_check:
--> 143             self._verify_integrity()
    144
    145         self._consolidate_check()

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py in _verify_integrity(self)
    348                 "Number of manager items must equal union of "
    349                 "block items\n# manager items: {0}, # "
--> 350                 "tot_items: {1}".format(len(self.items), tot_items)
    351             )
    352

AssertionError: Number of manager items must equal union of block items
# manager items: 44, # tot_items: 48

Any help appreciated

  • Is it possible that some column names are duplicated? Commented Oct 18, 2019 at 10:49
  • Add an empty column named Shift to data1, then apply the concatenation, because both DataFrames should have the same number of columns. Commented Oct 18, 2019 at 10:49
  • @VidyaSekar - No, you are wrong. Commented Oct 18, 2019 at 11:04
  • Yes, I tested it. There is an easier option. Commented Oct 18, 2019 at 11:04
  • What is your version of pandas? Commented Oct 18, 2019 at 11:10
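
The duplicated-column question raised in the comments can be checked directly. The following is only a sketch under stated assumptions: the frames `data1` and `data2` are made-up stand-ins for the real data (which is not shown in the question), and the helper names `duplicated_columns` and `drop_duplicate_columns` are hypothetical. It lists any column names that occur more than once, keeps the first occurrence of each, and then concatenates.

```python
import pandas as pd

# Hypothetical stand-ins for the real frames; data1 repeats the 'Audit_ID' name.
data1 = pd.DataFrame(
    [[1, 1, "02/09/2019 05:00"]],
    columns=["Audit_ID", "Audit_ID", "Start Time"],
)
data2 = pd.DataFrame(
    [[2, "03/09/2019 03:00", "Afters"]],
    columns=["Audit_ID", "Start Time", "Shift"],
)


def duplicated_columns(df):
    """Return the column names that occur more than once in df."""
    return df.columns[df.columns.duplicated()].unique().tolist()


def drop_duplicate_columns(df):
    """Keep only the first occurrence of each column name."""
    return df.loc[:, ~df.columns.duplicated()]


dupes1 = duplicated_columns(data1)
dupes2 = duplicated_columns(data2)
print("Duplicated in data1:", dupes1)
print("Duplicated in data2:", dupes2)

# Concatenate only after every frame has unique column names.
clean1 = drop_duplicate_columns(data1)
clean2 = drop_duplicate_columns(data2)
joined_rows = pd.concat([clean1, clean2], axis=0, ignore_index=True, sort=False)
print(joined_rows)
```

Keeping the first occurrence is just one possible choice. If the duplicated columns hold different values, renaming them may be safer than dropping them.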

1 Answer

Sample code:

import pandas as pd

# Build two small DataFrames that mirror the data in the question
dict1 = {'Audit_ID': ['1'], 'Start_time': ['02/09/2019 05:00'], 'End_Time': ['02/09/2019 10:45']}
dict2 = {'Audit_ID': ['2'], 'Start_time': ['03/09/2019 05:00'], 'End_Time': ['03/09/2019 10:45'], 'shift': ['Afters']}
df1 = pd.DataFrame.from_dict(dict1)
df2 = pd.DataFrame.from_dict(dict2)

# Stack the rows; a column missing from one frame is filled with NaN
res = pd.concat([df1, df2], axis=0)
print(res)

Result:

  Audit_ID          End_Time        Start_time   shift
0        1  02/09/2019 10:45  02/09/2019 05:00     NaN
0        2  03/09/2019 10:45  03/09/2019 05:00  Afters
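
In the result above, both rows carry index 0 and the columns appear in alphabetical order. As a minimal sketch reusing the same sample data, passing `ignore_index=True` and `sort=False` (both are existing `pd.concat` parameters) should give a fresh 0..n-1 index and keep the columns in the order they first appear. Whether this matters for the real data set is an assumption.

```python
import pandas as pd

# Same sample data as in the answer
df1 = pd.DataFrame({
    'Audit_ID': ['1'],
    'Start_time': ['02/09/2019 05:00'],
    'End_Time': ['02/09/2019 10:45'],
})
df2 = pd.DataFrame({
    'Audit_ID': ['2'],
    'Start_time': ['03/09/2019 05:00'],
    'End_Time': ['03/09/2019 10:45'],
    'shift': ['Afters'],
})

# ignore_index=True renumbers the rows; sort=False leaves the column order unsorted
res = pd.concat([df1, df2], axis=0, ignore_index=True, sort=False)
print(res)
```

Columns that exist in only one frame, such as `shift` here, are still filled with NaN for the rows that lack them, so no empty column has to be added by hand first.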

4 Comments

This is the same as res = pd.concat([df1, df2]), so your solution does not help.
I came to this conclusion based on the result. Kindly explain what went wrong here.
Check the duplicate question; there is an explanation there.
The code you gave worked, so obviously something other than what I thought is going on. On my data set I've used the same code and it gives the same error. I've included more detail in the error log to see if this helps.
