I'm have a dataframe called stores_df that contains store information such as date and sales. I have another dataframe called avg_sales_store_by_month that contains the average sales for each month of each store. I wish to get the average sales column from this to append it to stores_df. The issue i have is after my join, the order of stores_df gets changed
Below is the first few rows from stores_df.
+-----+----------+---------+----+------------+-----------+----------+-----------+------------+-----+----+---+ |Store| Date|IsHoliday|Dept|Weekly_Sales|Temperature|Fuel_Price| CPI|Unemployment|Month|Year|Day| +-----+----------+---------+----+------------+-----------+----------+-----------+------------+-----+----+---+ | 1|2010-02-05| FALSE| 1| 24924| 42.31| 2.572|211.0963582| 8.106| 2|2010| 5| | 1|2010-02-12| TRUE| 1| 46039| 38.51| 2.548|211.2421698| 8.106| 2|2010| 12| | 1|2010-02-19| FALSE| 1| 41595| 39.93| 2.514|211.2891429| 8.106| 2|2010| 19| | 1|2010-05-14| FALSE| 1| 18926| 74.78| 2.854|210.3374261| 7.808| 5|2010| 14| +-----+----------+---------+----+------------+-----------+----------+-----------+------------+-----+----+---+ Below is the first few rows of avg_sales_store_by_month, i wish to grab the last column and append it to the end of stores_df.
+-----+-----+------------------+ |Store|Month|avg_sales_by_month| +-----+-----+------------------+ | 39| 11| 23317.75| | 43| 7| 13090.84| | 10| 2| 28407.05| | 23| 6| 21265.7| | 4| 10| 28723.2| | 9| 10| 8468.2| +-----+-----+------------------+ My issue is when i use my join:
stores_df = stores_df.join( avg_sales_store_by_month, Seq("Store", "Month"), "left" ) The rows of stores_df gets reordered, I would like it to be in the same order as before the join but with the extra column. How do i achieve this?
After the join snippet, order is messed up.
+-----+-----+----------+---------+----+------------+-----------+----------+-----------+------------+----+---+------------------+ |Store|Month| Date|IsHoliday|Dept|Weekly_Sales|Temperature|Fuel_Price| CPI|Unemployment|Year|Day|avg_sales_by_month| +-----+-----+----------+---------+----+------------+-----------+----------+-----------+------------+----+---+------------------+ | 39| 11|2010-11-05| FALSE| 1| 31729| 61.62| 2.689|210.7202444| 8.476|2010| 5| 23317.75| | 39| 11|2010-11-12| FALSE| 1| 12324| 62.21| 2.728|210.7667944| 8.476|2010| 12| 23317.75| | 39| 11|2010-11-19| FALSE| 1| 15137| 55.5| 2.771| 210.65429| 8.476|2010| 19| 23317.75| | 39| 11|2011-11-11| FALSE| 2| 65758| 63.11| 3.297|216.7217373| 7.716|2011| 11| 23317.75| | 39| 11|2011-11-18| FALSE| 2| 70050| 66.09| 3.308|216.9395861| 7.716|2011| 18| 23317.75| +-----+-----+----------+---------+----+------------+-----------+----------+-----------+------------+----+---+------------------+