I have a dataset with multiple categories, and I want to plot them all in a single figure to see how something changes. I have a list of the categories in the dataset, and I would like to see all of them plotted in the same figure.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from matplotlib import cm

    sample = [
        ['For business', 0.7616104043587437],
        ['For home and cottages', 0.6890139579274699],
        ['Consumer electronics', 0.039868871866136635],
        ['Personal things', 0.7487893699793786],
        ['Services', 0.747226678171249],
        ['Services', 0.23463661173977313],
        ['Animals', 0.6504301798258314],
        ['For home and cottages', 0.49567857024037665],
        ['For home and cottages', 0.9852681814098107],
        ['Transportation', 0.8134867587477912],
        ['Animals', 0.49988690699674654],
        ['Consumer electronics', 0.15086800344617235],
        ['For business', 0.9485494576819328],
        ['Hobbies and Leisure', 0.25766871111905243],
        ['For home and cottages', 0.31704508627659533],
        ['Animals', 0.6192114570078333],
        ['Personal things', 0.5755788287287359],
        ['Hobbies and Leisure', 0.10106922056341394],
        ['Animals', 0.16834618003738577],
        ['Consumer electronics', 0.7570803588496894]
    ]

    train = pd.DataFrame(data=sample, columns=['parent_category_name', 'deal_probability'])

    parent_categories = train['parent_category_name'].unique()
    parent_categories_size = len(parent_categories)

    fig, ax = plt.subplots(figsize=(12, 10))
    # one color per category, taken from the rainbow colormap
    colors = iter(cm.rainbow(np.linspace(0, 1, parent_categories_size)))

    for parent_category_n in range(parent_categories_size):
        # rows belonging to the current category
        parent_1 = train[train['parent_category_name'] == parent_categories[parent_category_n]]

        # sorted probabilities plotted against their position within the category
        ax.scatter(
            range(parent_1.shape[0]),
            np.sort(parent_1.deal_probability.values),
            color=next(colors)
        )
        plt.ylabel('likelihood that an ad actually sold something', fontsize=12)
        plt.title('Distribution of likelihood that an ad actually sold something')

I have no idea why I can only see the last plot instead of all of them. Alternatively, I could work with multiple scatter plots in one figure, but I'm having a hard time plotting this.
Currently I'm working with 10 categories but I'm trying to make it dynamic.
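
For reference, one way to avoid hard-coding the number of categories is to iterate over a pandas groupby and draw every group on the same Axes. The following is only a minimal sketch under stated assumptions: the six-row frame is a made-up stand-in for the sample above, the x-axis label is my own wording, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import cm

# Hypothetical stand-in data with the same two columns as the sample above
train = pd.DataFrame(
    [["Animals", 0.65], ["Services", 0.75], ["Animals", 0.50],
     ["Services", 0.23], ["Transportation", 0.81], ["Animals", 0.17]],
    columns=["parent_category_name", "deal_probability"],
)

# groupby discovers the categories, so their number never has to be fixed
groups = train.groupby("parent_category_name")
colors = cm.rainbow(np.linspace(0, 1, groups.ngroups))

fig, ax = plt.subplots(figsize=(12, 10))
for color, (name, group) in zip(colors, groups):
    # sorted probabilities against their position within the category
    values = np.sort(group["deal_probability"].to_numpy())
    ax.scatter(range(len(values)), values, color=color, label=name)

# labels and title are set once, outside the loop
ax.set_xlabel("position within category (sorted)")
ax.set_ylabel("likelihood that an ad actually sold something", fontsize=12)
ax.set_title("Distribution of likelihood that an ad actually sold something")
ax.legend()
```

Each call to ax.scatter adds one collection to the same Axes, and the legend shows which color belongs to which category. In an interactive session, plt.show() would display the figure.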


1) ylabel instead of xlabel? This, like plt.title, doesn't need to be within the loop, because you only have to set it once. 2) Why do you first retrieve parent_categories from your dataframe and then overwrite it with a predefined list? 3) Your code does not use the categorical data; instead, it plots the probabilities in ascending order against the position number within the category. Is this the intention?
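
If the intention in point 3 is to put the category itself on the x-axis, a rough sketch could look like the one below. It is an illustration, not the asker's method: the six-row frame is hypothetical stand-in data, the axis labels are my own wording, and pd.factorize is just one way to turn category names into integer positions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data with the same two columns as in the question
train = pd.DataFrame(
    [["Animals", 0.65], ["Services", 0.75], ["Animals", 0.50],
     ["Services", 0.23], ["Transportation", 0.81], ["Animals", 0.17]],
    columns=["parent_category_name", "deal_probability"],
)

# Map each category name to an integer code, in order of first appearance
codes, categories = pd.factorize(train["parent_category_name"])

fig, ax = plt.subplots(figsize=(12, 6))
# x position and color both come from the category code
ax.scatter(codes, train["deal_probability"], c=codes, cmap="rainbow")
ax.set_xticks(range(len(categories)))
ax.set_xticklabels(categories, rotation=45, ha="right")
ax.set_xlabel("parent category")
ax.set_ylabel("deal probability")
```

Here every category gets its own column of points, so the categorical information is used directly instead of the position within each group.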