
I have two dataframes. dataframe_a:

 data       | location_zone | test_hour | analysis_date
------------+---------------+-----------+------------------------
 10         | america       | 12        | 2000-1-1
 11         | america       | 13        | 2000-1-2
 21         | china         | 14        | 2000-1-3

and dataframe_b:

 data       | location_zone | test_hour | analysis_date
------------+---------------+-----------+------------------------
 1          | china         | 14        | 2000-1-3
 2          | america       | 13        | 2000-1-2
 3          | america       | 12        | 2000-1-1

I need to combine these dataframes on matching location_zone, test_hour, and analysis_date, and add their data columns together. The final result should be:

 data       | location_zone | test_hour | analysis_date
------------+---------------+-----------+------------------------
 13         | america       | 12        | 2000-1-1
 13         | america       | 13        | 2000-1-2
 22         | china         | 14        | 2000-1-3
  • please check my answer:) Commented Oct 18, 2019 at 9:51

2 Answers


You could do concat + groupby:

    import pandas as pd

    # Stack both frames, then sum 'data' within each key combination
    df = (
        pd.concat([dataframe_a, dataframe_b])
        .groupby(['location_zone', 'test_hour', 'analysis_date'], as_index=False)['data']
        .sum()
    )
    print(df)

Output

      location_zone  test_hour analysis_date  data
    0       america         12      2000-1-1    13
    1       america         13      2000-1-2    13
    2         china         14      2000-1-3    22
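For anyone who wants to run the concat + groupby approach end to end, here is a minimal self-contained sketch. It rebuilds the two frames from the question's sample data; the `keys` and `summed` names and the final column reordering are my own additions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question
dataframe_a = pd.DataFrame({
    "data": [10, 11, 21],
    "location_zone": ["america", "america", "china"],
    "test_hour": [12, 13, 14],
    "analysis_date": ["2000-1-1", "2000-1-2", "2000-1-3"],
})
dataframe_b = pd.DataFrame({
    "data": [1, 2, 3],
    "location_zone": ["china", "america", "america"],
    "test_hour": [14, 13, 12],
    "analysis_date": ["2000-1-3", "2000-1-2", "2000-1-1"],
})

# Columns that identify a matching row (illustrative name)
keys = ["location_zone", "test_hour", "analysis_date"]

# Stack the frames vertically, then sum 'data' per key combination
summed = (
    pd.concat([dataframe_a, dataframe_b])
    .groupby(keys, as_index=False)["data"]
    .sum()
)

# Optional: restore the question's column order (data first)
summed = summed[list(dataframe_a.columns)]
print(summed)
```

Note that with this approach a key combination that appears in only one of the frames is still kept in the result, with its own data value unchanged. Whether that is desirable depends on the use case.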


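Since both dataframes have the same column names, you can rename the data column in one of them and let merge join on the remaining shared columns. Then call eval to add the two data columns, and slice on the original columns (here df_a and df_b stand for dataframe_a and dataframe_b):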

    df_final = (df_a.merge(df_b.rename(columns={'data': 'data_b'}))
                    .eval('data = data + data_b')[df_a.columns])

    Out[20]:
       data location_zone  test_hour analysis_date
    0    13       america         12      2000-1-1
    1    13       america         13      2000-1-2
    2    22         china         14      2000-1-3
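A runnable variant of the merge idea, as a sketch only: instead of renaming and using eval, it passes the join keys explicitly and lets `suffixes` disambiguate the two data columns. The sample data comes from the question; the `keys` and `merged` names and the `_a`/`_b` suffixes are assumptions chosen for illustration.

```python
import pandas as pd

# Sample data copied from the question
df_a = pd.DataFrame({
    "data": [10, 11, 21],
    "location_zone": ["america", "america", "china"],
    "test_hour": [12, 13, 14],
    "analysis_date": ["2000-1-1", "2000-1-2", "2000-1-3"],
})
df_b = pd.DataFrame({
    "data": [1, 2, 3],
    "location_zone": ["china", "america", "america"],
    "test_hour": [14, 13, 12],
    "analysis_date": ["2000-1-3", "2000-1-2", "2000-1-1"],
})

# Columns that identify a matching row (illustrative name)
keys = ["location_zone", "test_hour", "analysis_date"]

# Inner join on the keys; the clashing 'data' columns become data_a / data_b
merged = df_a.merge(df_b, on=keys, suffixes=("_a", "_b"))

# Add the two data columns, then keep only the original columns
merged["data"] = merged["data_a"] + merged["data_b"]
df_final = merged[list(df_a.columns)]
print(df_final)
```

Because merge defaults to an inner join, rows whose keys appear in only one frame are dropped here. Passing `how="outer"` would keep them, but the sum would then be NaN for those rows unless the missing values are filled first.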
