I have three dataframes
Dataframe df1:
date A 0 2022-04-11 1 1 2022-04-12 2 2 2022-04-14 26 3 2022-04-16 2 4 2022-04-17 1 5 2022-04-20 17 6 2022-04-21 14 7 2022-04-22 1 8 2022-04-23 9 9 2022-04-24 1 10 2022-04-25 5 11 2022-04-26 2 12 2022-04-27 21 13 2022-04-28 9 14 2022-04-29 17 15 2022-04-30 5 16 2022-05-01 8 17 2022-05-07 1241217 18 2022-05-08 211 19 2022-05-09 1002521 20 2022-05-10 488739 21 2022-05-11 12925 22 2022-05-12 57 23 2022-05-13 8515098 24 2022-05-14 1134576 Dateframe df2:
date B 0 2022-04-12 8 1 2022-04-14 7 2 2022-04-16 2 3 2022-04-19 2 4 2022-04-23 2 5 2022-05-07 2 6 2022-05-08 5 7 2022-05-09 2 8 2022-05-14 1 Dataframe df3:
date C 0 2022-04-12 6 1 2022-04-13 1 2 2022-04-14 2 3 2022-04-20 3 4 2022-04-21 9 5 2022-04-22 25 6 2022-04-23 56 7 2022-04-24 49 8 2022-04-25 68 9 2022-04-26 71 10 2022-04-27 40 11 2022-04-28 44 12 2022-04-29 27 13 2022-04-30 34 14 2022-05-01 28 15 2022-05-07 9 16 2022-05-08 20 17 2022-05-09 24 18 2022-05-10 21 19 2022-05-11 8 20 2022-05-12 8 21 2022-05-13 14 22 2022-05-14 25 23 2022-05-15 43 24 2022-05-16 36 25 2022-05-17 29 26 2022-05-18 28 27 2022-05-19 17 28 2022-05-20 6 I would like to merge df1, df2, df3 in a single dataframe with columns date, A, B, C, in such a way that date contains all dates which appeared in df1 and/or df2 and/or df3 (without repetition), and if a particular date was not in any of the dataframes, then for the respective column I put value 0.0. So, I would like to have something like that:
date A B C 0 2022-04-11 1.0 0.0 0.0 1 2022-08-12 2.0 8.0 6.0 2 2022-08-13 0.0 0.0 1.0 ... I tried to use this method
merge1 = pd.merge(df1, df2, how='outer') sorted_merge1 = merge1.sort_values(by=['date'], ascending=False) full_merge = pd.merge(sorted_merg1, df3, how='outer') However, it seems it skips the dates which are not common for all three dataframes.
onis missing?dfs = [df.set_index(['date']) for df in [df1, df2, df3]]and thenpd.concat(dfs,axis=1).reset_index().fillna(0). You could of course also create adf.merge()chain (as here, but that won't scale.