58

I am plotting multiple dataframes as point plots using seaborn, all on the same axis.

How would I add a legend to the plot?

My code takes each dataframe and plots it, one after another, on the same figure.

Each dataframe has the same columns:

    date        count
    2017-01-01  35
    2017-01-02  43
    2017-01-03  12
    2017-01-04  27

My code:

    f, ax = plt.subplots(1, 1, figsize=figsize)
    x_col = 'date'
    y_col = 'count'
    sns.pointplot(ax=ax, x=x_col, y=y_col, data=df_1, color='blue')
    sns.pointplot(ax=ax, x=x_col, y=y_col, data=df_2, color='green')
    sns.pointplot(ax=ax, x=x_col, y=y_col, data=df_3, color='red')

This plots 3 lines on the same plot. However, the legend is missing. According to the documentation, pointplot does not accept a label argument.

One workaround that worked was creating a new dataframe and using the hue argument.

    df_1['region'] = 'A'
    df_2['region'] = 'B'
    df_3['region'] = 'C'
    df = pd.concat([df_1, df_2, df_3])
    sns.pointplot(ax=ax, x=x_col, y=y_col, data=df, hue='region')
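The hue workaround above can also be written without modifying the original dataframes. This is a minimal sketch, not the asker's code: the small `df_1`, `df_2`, `df_3` frames are hypothetical stand-ins with made-up counts, and the `Agg` backend is chosen only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-ins for df_1, df_2, df_3 from the question
dates = ["2017-01-01", "2017-01-02"]
df_1 = pd.DataFrame({"date": dates, "count": [35, 43]})
df_2 = pd.DataFrame({"date": dates, "count": [12, 27]})
df_3 = pd.DataFrame({"date": dates, "count": [20, 31]})

frames = {"A": df_1, "B": df_2, "C": df_3}

# assign() returns a copy, so the original frames keep their columns
combined = pd.concat(
    [frame.assign(region=name) for name, frame in frames.items()],
    ignore_index=True,
)

f, ax = plt.subplots(1, 1)
# With hue set, seaborn draws one series per region and builds the legend itself
sns.pointplot(ax=ax, x="date", y="count", data=combined, hue="region")
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```

Keeping the label-to-frame mapping in one dict means adding a fourth dataframe is a one-line change.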

But I would like to know whether there is a way to create a legend for code that first adds the point plots to the figure one by one and then adds the legend.

Sample output :

Seaborn Image

1
  • The datatype of date column can be assumed to be datetime.date. Commented Mar 13, 2017 at 15:31

4 Answers

50

I would suggest not using seaborn pointplot for this. It makes things unnecessarily complicated.
Instead, use matplotlib's plot_date. It lets you set a label on each plot and have the labels collected into a legend automatically with ax.legend().

    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    import numpy as np

    date = pd.date_range("2017-03", freq="M", periods=15)
    count = np.random.rand(15,4)
    df1 = pd.DataFrame({"date":date, "count" : count[:,0]})
    df2 = pd.DataFrame({"date":date, "count" : count[:,1]+0.7})
    df3 = pd.DataFrame({"date":date, "count" : count[:,2]+2})

    f, ax = plt.subplots(1, 1)
    x_col='date'
    y_col = 'count'

    ax.plot_date(df1.date, df1["count"], color="blue", label="A", linestyle="-")
    ax.plot_date(df2.date, df2["count"], color="red", label="B", linestyle="-")
    ax.plot_date(df3.date, df3["count"], color="green", label="C", linestyle="-")

    ax.legend()

    plt.gcf().autofmt_xdate()
    plt.show()

enter image description here
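As far as I know, recent Matplotlib documentation discourages plot_date in favor of plain plot, which accepts datetime values directly. The same labelled-lines idea could be sketched with ax.plot as follows. The data, the `frames` and `colors` dicts, and the `Agg` backend are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
date = pd.date_range("2017-03-01", periods=15, freq="D")

# Hypothetical data: three series offset from each other so they do not overlap
frames = {
    "A": pd.DataFrame({"date": date, "count": rng.random(15)}),
    "B": pd.DataFrame({"date": date, "count": rng.random(15) + 0.7}),
    "C": pd.DataFrame({"date": date, "count": rng.random(15) + 2}),
}
colors = {"A": "blue", "B": "red", "C": "green"}

f, ax = plt.subplots(1, 1)
for name, df in frames.items():
    # label= is what ax.legend() picks up later
    ax.plot(df["date"], df["count"], marker="o", linestyle="-",
            color=colors[name], label=name)

ax.legend()
f.autofmt_xdate()
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```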


If you still want a legend for pointplots, here is one way to get it:

    sns.pointplot(ax=ax, x=x_col, y=y_col, data=df1, color='blue')
    sns.pointplot(ax=ax, x=x_col, y=y_col, data=df2, color='green')
    sns.pointplot(ax=ax, x=x_col, y=y_col, data=df3, color='red')

    # Each call adds one main line plus len(df) short error lines,
    # so the step of len(df1)+1 selects only the main lines.
    ax.legend(handles=ax.lines[::len(df1)+1], labels=["A","B","C"])

    # Keep only the date part of each tick label
    ax.set_xticklabels([t.get_text().split("T")[0] for t in ax.get_xticklabels()])
    plt.gcf().autofmt_xdate()
    plt.show()
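The step in `ax.lines[::len(df1)+1]` can be illustrated without any plotting. This pure-Python sketch assumes the layout described by the answer's author in the comments (one main line followed by len(df) error lines per pointplot call); that layout may not hold on every seaborn version, and the string names are placeholders for the real Line2D objects.

```python
n_rows = 4  # stands in for len(df1); all three frames are assumed equally long

# Simulate the order in which three pointplot calls would append to ax.lines
lines = []
for name in ["A", "B", "C"]:
    lines.append(f"main_{name}")                               # the connecting line
    lines.extend(f"err_{name}_{i}" for i in range(n_rows))     # one error line per row

# Every (n_rows + 1)-th entry is a main line
main_lines = lines[::n_rows + 1]
print(main_lines)  # ['main_A', 'main_B', 'main_C']
```

If the dataframes had different lengths, a fixed step would no longer land on the main lines, which is one reason this approach is fragile.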

4 Comments

I agree with @ImportanceOfBeingErnest that using seaborn makes advanced things a bit complicated. Personally, the tradeoff is ease for simple plots and aesthetics versus complications and less documentation compared to matplotlib. I am going to wait a couple of hours to see if someone has an idea how to add a legend to seaborn plots. If not, I think this answer is correct and will accept it.
OK, so if you are really interested in using pointplots, I added a way to get the legend for those.
What is the [::len(df1)+1] good for? As far as I can tell, it copies ax.lines. But why the step argument? Could you add a comment to the code sample for that?
@exhuma The pointplot creates len(df) short error lines and one main line for every call. With [::len(df1)+1] you only select the main lines, not the error lines (which all have the same color).
29

Old question, but there's an easier way.

    sns.pointplot(x=x_col, y=y_col, data=df_1, color='blue')
    sns.pointplot(x=x_col, y=y_col, data=df_2, color='green')
    sns.pointplot(x=x_col, y=y_col, data=df_3, color='red')
    plt.legend(labels=['legendEntry1', 'legendEntry2', 'legendEntry3'])

This lets you add the plots sequentially, and not have to worry about any of the matplotlib crap besides defining the legend items.

4 Comments

however, for this solution, the legend colors are "blue" for all legend entries, instead of "blue", then "green", then "red"
Not when I use it!
AdamB, I get the desired behavior. Maybe it would help clear up some confusion as pointed out by @S.A. if you put the version of seaborn and platform information. As it stands, this solution is the simplest, given that it works ;)
@JosephWood You need the last part of the accepted answer (by Ernest), which skips all the short error lines. So, ax.legend(handles=ax.lines[::len(df_1)+1], labels=["A","B","C"]). However, if you add ci=None, there are no error bars, and no skipping is needed. In that case the simple solution here will work.
7

I tried using Adam B's answer, however, it didn't work for me. Instead, I found the following workaround for adding legends to pointplots.

    import matplotlib.patches as mpatches
    red_patch = mpatches.Patch(color='#bb3f3f', label='Label1')
    black_patch = mpatches.Patch(color='#000000', label='Label2')

In the pointplots, the color can be specified as mentioned in previous answers. Once the patches corresponding to the different plots are set up, pass them to the legend:

    plt.legend(handles=[red_patch, black_patch])

The legend should then appear on the pointplot.
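Put together, the proxy-handle approach could look like the sketch below. It uses Line2D handles instead of Patch so the legend entries resemble the plotted lines; that swap, the sample data, the `series` dict, and the `Agg` backend are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.lines import Line2D

# Hypothetical data: label -> (color, dataframe)
dates = ["2017-01-01", "2017-01-02", "2017-01-03"]
series = {
    "A": ("blue", pd.DataFrame({"date": dates, "count": [35, 43, 12]})),
    "B": ("green", pd.DataFrame({"date": dates, "count": [20, 25, 31]})),
    "C": ("red", pd.DataFrame({"date": dates, "count": [5, 18, 27]})),
}

f, ax = plt.subplots(1, 1)
handles = []
for label, (color, df) in series.items():
    sns.pointplot(ax=ax, x="date", y="count", data=df, color=color)
    # Proxy artist: never drawn on the axes, only shown in the legend
    handles.append(Line2D([0], [0], color=color, marker="o", label=label))

# The legend is built from the proxies, so it does not depend on
# how many lines each pointplot call adds to the axes
ax.legend(handles=handles)
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```

Because the legend never inspects `ax.lines`, this avoids the wrong-color problem some commenters saw with the label-only approach.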


2

This goes a bit beyond the original question, but it builds on @PSub's response to do something more general. I know some of this is easier in Matplotlib directly, but many of Seaborn's default styling options are quite nice, so I wanted to work out how to have more than one legend for a point plot (or other Seaborn plot) without dropping into Matplotlib right at the start.

Here's one solution:

    import numpy as np
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    # We will need to access some of these matplotlib classes directly
    from matplotlib.lines import Line2D    # For points and lines
    from matplotlib.patches import Patch   # For KDE and other plots
    from matplotlib.legend import Legend
    from matplotlib import cm

    # Initialise random number generator
    rng = np.random.default_rng(seed=42)

    # Generate samples of 25 numbers
    n = 25
    clusters = []
    for c in range(0,3):
        # Crude way to get different distributions
        # for each cluster
        p = rng.integers(low=1, high=6, size=4)
        df = pd.DataFrame({
            'x': rng.normal(p[0], p[1], n),
            'y': rng.normal(p[2], p[3], n),
            'name': f"Cluster {c+1}"
        })
        clusters.append(df)

    # Flatten to a single data frame
    clusters = pd.concat(clusters)

    # Now do the same for data to feed into
    # the second (scatter) plot...
    n = 8
    points = []
    for c in range(0,2):
        p = rng.integers(low=1, high=6, size=4)
        df = pd.DataFrame({
            'x': rng.normal(p[0], p[1], n),
            'y': rng.normal(p[2], p[3], n),
            'name': f"Group {c+1}"
        })
        points.append(df)

    points = pd.concat(points)

    # And create the figure
    f, ax = plt.subplots(figsize=(8,8))

    # The KDE-plot generates a Legend 'as usual'
    k = sns.kdeplot(
        data=clusters,
        x='x', y='y',
        hue='name',
        shade=True,
        thresh=0.05,
        n_levels=2,
        alpha=0.2,
        ax=ax,
    )

    # Notice that we access this legend via the
    # axis to turn off the frame, set the title,
    # and adjust the patch alpha level so that
    # it closely matches the alpha of the KDE-plot
    ax.get_legend().set_frame_on(False)
    ax.get_legend().set_title("Clusters")
    for lh in ax.get_legend().get_patches():
        lh.set_alpha(0.2)

    # You would probably want to sort your data
    # frame or set the hue and style order in order
    # to ensure consistency for your own application
    # but this works for demonstration purposes
    groups  = points.name.unique()
    markers = ['o', 'v', 's', 'X', 'D', '<', '>']
    colors  = cm.get_cmap('Dark2').colors

    # Generate the scatterplot: notice that Legend is
    # off (otherwise this legend would overwrite the
    # first one) and that we're setting the hue, style,
    # markers, and palette using the 'name' parameter
    # from the data frame and the number of groups in
    # the data.
    p = sns.scatterplot(
        data=points,
        x="x",
        y="y",
        hue='name',
        style='name',
        markers=markers[:len(groups)],
        palette=colors[:len(groups)],
        legend=False,
        s=30,
        alpha=1.0
    )

    # Here's the 'magic' -- we use zip to link together
    # the group name, the color, and the marker style. You
    # *cannot* retrieve the marker style from the scatterplot
    # since that information is lost when rendered as a
    # PathCollection (as far as I can tell). Anyway, this allows
    # us to loop over each group in the second data frame and
    # generate a 'fake' Line2D plot (with zero elements and no
    # line-width in our case) that we can add to the legend. If
    # you were overlaying a line plot or a second plot that uses
    # patches you'd have to tweak this accordingly.
    patches = []
    for x in zip(groups, colors[:len(groups)], markers[:len(groups)]):
        patches.append(Line2D([0],[0], linewidth=0.0, linestyle='',
                       color=x[1], markerfacecolor=x[1],
                       marker=x[2], label=x[0], alpha=1.0))

    # And add these patches (with their group labels) to the new
    # legend item and place it on the plot.
    leg = Legend(ax, patches, labels=groups,
                 loc='upper left', frameon=False, title='Groups')
    ax.add_artist(leg)

    # Done
    plt.show()

Here's the output: 2 Legends using Seaborn

