I have the following challenge: I have a dataframe called hashtags_users_grouped which has the following structure:
    hashtag_id | user_id | count
    123        | 1       | 1
    245        | 1       | 3
    123        | 2       | 5

Each row records which user mentioned which hashtag and how many times. In this example, user 1 mentioned hashtag 123 once and hashtag 245 three times, while user 2 mentioned only hashtag 123, five times.
I want to have a dataframe with the following output:
    user | 123 | 245
    1    | 1   | 3
    2    | 5   | 0

In other words, the same information as the first table, but with one column per hashtag, showing how many times each user mentioned each hashtag. I read the documentation and tried to run the following, without success:
    a = hashtags_users_joined_grouped_df.groupBy("user_id").pivot("hashtag_id")
    a.show(5)

I got the following error message:
    AttributeError: 'GroupedData' object has no attribute 'show'

Do you know of any way to do this?