1

Say I have one dataframe that looks like this

d1 = {'Location+Type':['Census Tract 3, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000300', 'Census Tract 4, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000400','Census Tract 5, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000500'], 'Year':['2009','2009','2009'], 'state':['Alabama','Alabama','Alabama'], 'Census_tract':[3,3,3], 'County_name':['Jefferson County','Jefferson County','Jefferson County'], 'A':[1,2,3], 'B':[4,5,6], 'C':[7,8,9], } df1 = pd.DataFrame(data=d1) 

and I have another dataframe that looks like this

d2 = {'Location+Type':['Census Tract 3, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000300', 'Census Tract 4, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000400','Census Tract 5, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000500'], 'Year':['2010','2010','2010'], 'state':['Alabama','Alabama','Alabama'], 'Census_tract':[3,3,3], 'County_name':['Jefferson County','Jefferson County','Jefferson County'], 'A':[11,22,33], 'B':[44,55,66], 'C':[77,88,99], 'D': [10,20,30] } df2 = pd.DataFrame(data=d2) 

I want to merge these dataframes together such that I would have this dataframe

d3 = {'Location+Type':['Census Tract 3, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000300', 'Census Tract 4, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000400','Census Tract 5, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000500', 'Census Tract 3, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000300', 'Census Tract 4, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000400','Census Tract 5, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000500'], 'Year':['2009','2009','2009','2010','2010','2010'], 'state':['Alabama','Alabama','Alabama','Alabama','Alabama','Alabama'], 'Census_tract':[3,3,3,3,3,3], 'County_name':['Jefferson County','Jefferson County','Jefferson County', 'Jefferson County','Jefferson County','Jefferson County'] 'A':[1,2,3,11,22,33], 'B':[4,5,6,44,55,66], 'C':[7,8,9,77,88,99], 'D':[None,None,None,10,20,30] } df_final = pd.DataFrame(data=d3) 

I know there are different join operations but this one is sort of a join and stack. Thus I am not sure how to achieve the latter result. Any suggestions are greatly appreciated.

1 Answer 1

1

You can try pd.concat:

df = pd.concat([df1, df2]).drop(columns=["County_name"]).reset_index(drop=True) print(df) 

Prints:

 Location+Type Year state Census_tract A B C D 0 Census Tract 3, Jefferson County, Alabama: Sum... 2009 Alabama 3 1 4 7 NaN 1 Census Tract 4, Jefferson County, Alabama: Sum... 2009 Alabama 3 2 5 8 NaN 2 Census Tract 5, Jefferson County, Alabama: Sum... 2009 Alabama 3 3 6 9 NaN 3 Census Tract 3, Jefferson County, Alabama: Sum... 2009 Alabama 3 11 44 77 10.0 4 Census Tract 4, Jefferson County, Alabama: Sum... 2009 Alabama 3 22 55 88 20.0 5 Census Tract 5, Jefferson County, Alabama: Sum... 2009 Alabama 3 33 66 99 30.0 
Sign up to request clarification or add additional context in comments.

2 Comments

Apologies the county name should still be there.
@justanewb Then omit the .drop(columns=["County_name"]) part and the County_name will be there.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.