I have a DataFrame like the one below:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame()
>>> df["user_id"] = [1,1,1,2,2,3,4,4,4,4]
>>> df["cate"] = ["a","b","c","b","c","a","a","b","c","d"]
>>> df["prob"] = [np.random.rand() for _ in range(len(df["user_id"]))]

I want to turn the prob of each cate into a separate column for each user (user_id), so that every user has a single row with columns such as cate_a_prob and cate_b_prob, and 0 where the user has no entry for that cate.
The only solution I have found uses nested for loops, and with tens of thousands of users it is very slow:

user_ids = list(set(df["user_id"]))
cates = list(set(df["cate"]))
user_probs = pd.DataFrame()
for uid in user_ids:
    # build a single-row frame for this user
    d = pd.DataFrame({'user_id': [uid]})
    for c in cates:
        # prob for this (user, cate) pair; empty if the user has no such cate
        ratio = df[(df["user_id"] == uid) & (df["cate"] == c)]["prob"]
        ratio = 0 if len(ratio) == 0 else float(ratio.iloc[0])
        d["cate_" + c + "_prob"] = ratio
    user_probs = pd.concat([user_probs, d])

So, does pandas have a built-in method to solve this problem? Thank you very much!
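
For reference, a long-to-wide reshape like this can probably be expressed with `DataFrame.pivot_table`. The following is only a minimal sketch, not a benchmarked solution: it assumes each (user_id, cate) pair occurs at most once (with `aggfunc="first"`, any duplicates would silently keep the first value), and it uses a seeded generator instead of `np.random.rand()` purely so the example is reproducible.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question; the seed is an assumption for reproducibility.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 4, 4, 4, 4],
    "cate": ["a", "b", "c", "b", "c", "a", "a", "b", "c", "d"],
})
rng = np.random.default_rng(0)
df["prob"] = rng.random(len(df))

# One row per user_id, one column per cate; missing pairs become 0.
wide = df.pivot_table(
    index="user_id",
    columns="cate",
    values="prob",
    aggfunc="first",
    fill_value=0,
)

# Rename the columns to match the cate_<name>_prob pattern used in the loop version.
wide.columns = [f"cate_{c}_prob" for c in wide.columns]

# Turn user_id back into an ordinary column.
user_probs = wide.reset_index()
print(user_probs)
```

Because the reshape happens in a single vectorized call rather than one boolean filter per (user, cate) pair, it should scale much better than the nested loops, though the actual speedup depends on the data.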

