I'm just starting to learn Python. I hope this is not a stupid question, but I could really use some help with this:
I have the following DataFrame "df_sales_2016_us" with 2935 entries. This is the head.
          Gender        Country  Size (US)  year  Month
    7610  Female  United States        8.0  2016      1
    7613  Female  United States        9.0  2016      1
    7617    Male  United States        9.5  2016      1
    7618  Female  United States       10.5  2016      1
    7619    Male  United States        8.5  2016      1

"Month" contains values from 1 to 12, one for each month. "Size (US)" contains 16 different shoe sizes from 6.0 to 15.
I would now like to create a new DataFrame looking like this:
- The columns are the Month from 1 to 12 while the rows are the different shoe sizes from 6.0 to 15.
- The single values should be the number of shoes for each size sold in each month.
How can I achieve this? This is a table filled with the value "0", just to clarify my goal.
          1  2  3  4  5  6  7  8  9  10  11  12
    6.0   0  0  0  0  0  0  0  0  0   0   0   0
    6.5   0  0  0  0  0  0  0  0  0   0   0   0
    7.0   0  0  0  0  0  0  0  0  0   0   0   0
    7.5   0  0  0  0  0  0  0  0  0   0   0   0
    8.0   0  0  0  0  0  0  0  0  0   0   0   0
    8.5   0  0  0  0  0  0  0  0  0   0   0   0
    9.0   0  0  0  0  0  0  0  0  0   0   0   0
    9.5   0  0  0  0  0  0  0  0  0   0   0   0
    10.0  0  0  0  0  0  0  0  0  0   0   0   0
    10.5  0  0  0  0  0  0  0  0  0   0   0   0
    11.0  0  0  0  0  0  0  0  0  0   0   0   0
    11.5  0  0  0  0  0  0  0  0  0   0   0   0
    12.0  0  0  0  0  0  0  0  0  0   0   0   0
    13.0  0  0  0  0  0  0  0  0  0   0   0   0
    14.0  0  0  0  0  0  0  0  0  0   0   0   0
    15.0  0  0  0  0  0  0  0  0  0   0   0   0

I tried to create the following DataFrame df_test, but I am out of options to proceed and think this is a totally wrong approach (shown here only for "Month" "1" as an example). Is there any way to get only the specific value needed for each column/row?
    df_test = pd.DataFrame({'1': [df_sales_2016_us["Month"]==1],
                            '2': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '3': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '4': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '5': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '6': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '7': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '8': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '9': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '10': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '11': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
                            '12': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
                            }, index=[df_sales_2016_us["Size (US)"].unique()])
    df_test.sort_index()

The result of the approach above is, of course, the following table: it does not give the individual counts I need but only indicates whether "Month" is "1". I have no idea how to go on from here or how to switch to another way of solving this. If someone has an idea, I would be very grateful. Thank you so much!
                                           1  2  3  4  5  6  7  8  9  10  11  12
    6.0   7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    6.5   7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    7.0   7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    7.5   7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    8.0   7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    8.5   7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    9.0   7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    9.5   7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    10.0  7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    10.5  7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    11.0  7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    11.5  7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    12.0  7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    13.0  7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    14.0  7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
    15.0  7617 True 7619 True 7629 True 7...  0  0  0  0  0  0  0  0   0   0   0
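
For reference, a size-by-month count table like the one described above can probably be built without filling the cells by hand. The following is only a minimal sketch using `pd.crosstab`, under the assumption that each row of the DataFrame stands for one pair of shoes sold. The small frame below is a hypothetical stand-in for `df_sales_2016_us`: its first five rows copy the head shown earlier, the sixth row is made up for illustration, and the name `size_by_month` is likewise an assumption.

```python
import pandas as pd

# Hypothetical stand-in for df_sales_2016_us (the real frame has 2935 rows).
# The first five rows copy the head shown above; the last row is invented
# so that a second month appears in the example.
df_sales_2016_us = pd.DataFrame({
    "Gender": ["Female", "Female", "Male", "Female", "Male", "Male"],
    "Country": ["United States"] * 6,
    "Size (US)": [8.0, 9.0, 9.5, 10.5, 8.5, 8.0],
    "year": [2016] * 6,
    "Month": [1, 1, 1, 1, 1, 2],
})

# Count the rows for every (size, month) combination.
# Assumption: one row corresponds to one pair of shoes sold.
size_by_month = pd.crosstab(df_sales_2016_us["Size (US)"],
                            df_sales_2016_us["Month"])

# Ensure all twelve months show up as columns, even months without any sales.
size_by_month = size_by_month.reindex(columns=range(1, 13), fill_value=0)

print(size_by_month)
```

Sizes that were never sold would not show up as rows in this sketch; if all 16 sizes should always be listed, the rows could be reindexed in the same way with an explicit list of sizes.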