1

This is the dataframe I am working with,

data = pd.DataFrame({'id':[1,2,3], 'start date': ['2001-11-13', '2002-6-26', '2004-2-22'], 'end date': ['2002-6-25', '2004-2-21', '2006-8-21'], 'values':[1000,2000,3000]}) 

I am trying to stack only the date columns, start date and end date, while leaving the id and values columns unstacked. The final dataframe would look like this,

data = pd.DataFrame({'id':[1,1,2,2,3,3], 'date': ['2001-11-13', '2002-6-26','2004-2-22','2002-6-25','2004-2-21', '2006-8-21'], 'values':[1000,1000,2000,2000,3000,3000]}) 

2 Answers 2

2

Just do melt

out = data.melt(['id','values']).sort_values('id') Out[436]: id values variable value 0 1 1000 start date 2001-11-13 3 1 1000 end date 2002-6-25 1 2 2000 start date 2002-6-26 4 2 2000 end date 2004-2-21 2 3 3000 start date 2004-2-22 5 3 3000 end date 2006-8-21 

Or set_index + stack

out = data.set_index(['id','values']).stack().reset_index() Out[437]: id values level_2 0 0 1 1000 start date 2001-11-13 1 1 1000 end date 2002-6-25 2 2 2000 start date 2002-6-26 3 2 2000 end date 2004-2-21 4 3 3000 start date 2004-2-22 5 3 3000 end date 2006-8-21 
Sign up to request clarification or add additional context in comments.

Comments

0

Here's a way to do what you've asked that gives the output with the columns labeled as in your question:

cols = ['start date', 'end date'] data = pd.concat([data.drop(columns=set(cols)-{col}).rename(columns={col:'date'}) for col in cols]).sort_values('id').reset_index(drop=True) 

Output:

 id date values 0 1 2001-11-13 1000 1 1 2002-6-25 1000 2 2 2002-6-26 2000 3 2 2004-2-21 2000 4 3 2004-2-22 3000 5 3 2006-8-21 3000 

Comments