Setting different color for each series in scatter plot

Question

Suppose I have three data sets:

    X = [1,2,3,4]
    Y1 = [4,8,12,16]
    Y2 = [1,4,9,16]

I can scatter plot this:

    from matplotlib import pyplot as plt
    plt.scatter(X,Y1,color='red')
    plt.scatter(X,Y2,color='blue')
    plt.show()

How can I do this with 10 sets?

I searched for this and couldn't find any reference to what I'm asking.

Edit: clarifying (hopefully) my question

If I call scatter multiple times, I can only set the same color on each scatter. Also, I know I can set a color array manually, but I'm sure there is a better way to do this. My question is then: "How can I automatically scatter-plot my several data sets, each with a different color?"

If that helps, I can easily assign a unique number to each data set.

What's the question here? Color can be an array as well, but what can you not solve by just calling scatter multiple times?
– seberg, Commented Sep 2, 2012 at 14:22
If I call scatter multiple times, I get the same colors. I'll update my question.
– Yotam, Commented Sep 2, 2012 at 14:24

vvvvv · Accepted Answer · 2020-07-09 20:58:48Z

I don't know what you mean by 'manually'. You can choose a colourmap and make a colour array easily enough:

    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib.cm as cm

    x = np.arange(10)
    ys = [i+x+(i*x)**2 for i in range(10)]

    colors = cm.rainbow(np.linspace(0, 1, len(ys)))
    for y, c in zip(ys, colors):
        plt.scatter(x, y, color=c)

Or you can make your own colour cycler using itertools.cycle and specifying the colours you want to loop over, using next to get the one you want. For example, with 3 colours:

    import itertools

    colors = itertools.cycle(["r", "b", "g"])
    for y in ys:
        plt.scatter(x, y, color=next(colors))

Come to think of it, maybe it's cleaner not to use zip with the first one either:

    colors = iter(cm.rainbow(np.linspace(0, 1, len(ys))))
    for y in ys:
        plt.scatter(x, y, color=next(colors))

+1. An itertools cycle probably isn't a good idea in this situation though, since it would end up with multiple datasets having the same color.
@DavidRobinson: not if you specify all ten, although I agree cycling sort of defeats the purpose there.. :^)
@macrocosme: works for me. Adding plt.legend(['c{}'.format(i) for i in range(len(ys))], loc=2, bbox_to_anchor=(1.05, 1), borderaxespad=0., fontsize=11) to the bottom of the above gives me a legend with colours.
the itertools solution is great when you want to avoid some colours. In my case since the background is black I want to avoid black.
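
The colormap loop and the legend call from the comments can be combined into one small helper. This is only a sketch: the function name scatter_series, the "series N" labels and the Agg backend line are assumptions added for illustration, not part of the answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def scatter_series(ax, x, ys, cmap_name="rainbow"):
    """Scatter every series in ys against x, one colormap colour per series.

    Hypothetical helper; returns the collections created by ax.scatter.
    """
    cmap = plt.get_cmap(cmap_name)
    colors = cmap(np.linspace(0, 1, len(ys)))  # one RGBA row per series
    handles = []
    for i, (y, c) in enumerate(zip(ys, colors)):
        handles.append(ax.scatter(x, y, color=c, label="series {}".format(i)))
    return handles


x = np.arange(10)
ys = [i + x + (i * x) ** 2 for i in range(10)]

fig, ax = plt.subplots()
handles = scatter_series(ax, x, ys)
# Same legend placement as suggested in the comment above.
ax.legend(loc=2, bbox_to_anchor=(1.05, 1), borderaxespad=0.0, fontsize=11)
```

Passing label= to each scatter call lets the legend pick up the colours without building the label list by hand.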

Nikana Reklawyks · Accepted Answer · 2016-10-31 21:30:15Z

The normal way to plot plots with points in different colors in matplotlib is to pass a list of colors as a parameter.

E.g.:

    import matplotlib.pyplot
    matplotlib.pyplot.scatter([1,2,3],[4,5,6],color=['red','green','blue'])

When you have a list of lists and you want them colored per list, I think the most elegant way is the one suggested by @DSM: just do a loop making multiple calls to scatter.

But if for some reason you wanted to do it with just one call, you can make a big list of colors with a list comprehension and a bit of floor division:

    import matplotlib.cm
    import matplotlib.pyplot
    import numpy as np

    X = [1,2,3,4]
    Ys = np.array([[4,8,12,16],
          [1,4,9,16],
          [17, 10, 13, 18],
          [9, 10, 18, 11],
          [4, 15, 17, 6],
          [7, 10, 8, 7],
          [9, 0, 10, 11],
          [14, 1, 15, 5],
          [8, 15, 9, 14],
          [20, 7, 1, 5]])
    nCols = len(X)
    nRows = Ys.shape[0]

    colors = matplotlib.cm.rainbow(np.linspace(0, 1, len(Ys)))

    cs = [colors[i//len(X)] for i in range(len(Ys)*len(X))] #could be done with numpy's repmat
    Xs=X*nRows #use list multiplication for repetition
    matplotlib.pyplot.scatter(Xs,Ys.flatten(),color=cs)

    cs = [array([ 0.5,  0. ,  1. ,  1. ]),
     array([ 0.5,  0. ,  1. ,  1. ]),
     array([ 0.5,  0. ,  1. ,  1. ]),
     array([ 0.5,  0. ,  1. ,  1. ]),
     array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
     array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
     array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
     array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
     ...
     array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
              1.00000000e+00]),
     array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
              1.00000000e+00]),
     array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
              1.00000000e+00]),
     array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
              1.00000000e+00])]

This is really great for a scatter plot where I have daily data of how big a text file is, and if I added less than, say, 200 bytes, I make a point red, but otherwise it is green.
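
The code comment above notes that the per-point colour list could be built with NumPy instead of a list comprehension. One way to sketch that, assuming np.repeat and np.tile as the stand-ins (the variable names palette and point_colors, the shortened data and the Agg backend line are illustrative only):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

X = np.array([1, 2, 3, 4])
Ys = np.array([[4, 8, 12, 16],
               [1, 4, 9, 16],
               [17, 10, 13, 18]])
n_series, n_points = Ys.shape

# One RGBA colour per series ...
palette = plt.get_cmap("rainbow")(np.linspace(0, 1, n_series))
# ... repeated once per point, in the same row-major order as Ys.ravel().
point_colors = np.repeat(palette, n_points, axis=0)
xs = np.tile(X, n_series)

fig, ax = plt.subplots()
coll = ax.scatter(xs, Ys.ravel(), color=point_colors)
```

np.repeat keeps all points of a series adjacent, which matches the flattened order of Ys, so each row of Ys ends up with a single colour.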

jtlz2 · Accepted Answer · 2023-09-12 20:10:40Z

An easy fix

If you have only one type of collection (e.g. scatter with no error bars), you can also change the colours after you have plotted them; this is sometimes easier to do.

    import matplotlib.pyplot as plt
    from random import randint
    import numpy as np

    # Let's generate some random X, Y data: X = [[first group], [second group], ...]
    X = [ [randint(0,50) for i in range(0,5)] for i in range(0,24)]
    Y = [ [randint(0,50) for i in range(0,5)] for i in range(0,24)]
    labels = range(1,len(X)+1)

    fig = plt.figure()
    ax = fig.add_subplot(111)
    for x,y,lab in zip(X,Y,labels):
        ax.scatter(x,y,label=lab)

The only piece of code that you need:

    # Now this is actually the code that you need: an easy fix for your colors.
    # Just cut and paste; note that you need ax.
    colormap = plt.cm.gist_ncar #nipy_spectral, Set1,Paired
    colorst = [colormap(i) for i in np.linspace(0, 0.9,len(ax.collections))]
    for t,j1 in enumerate(ax.collections):
        j1.set_color(colorst[t])

    ax.legend(fontsize='small')

The output gives you different colors even when you have many different scatter plots in the same subplot.

that's great but how would you for example add errorbars with the same color with this function? @G M
Hi @PEBKAC, thanks for pointing it out, I've tried hard this afternoon to make it work also in that case but I couldn't find any solution so I edited the question and warned the other users. Thanks!
Hi @G M, sorry I posted a few comments before having finalized the solution, which is described here: stackoverflow.com/q/51444364/7541421
I used another method to assign the colors for each series in a scatter plot. Now it works, unfortunately I couldn't proceed with your elegant solution when it came to errorbars, still I'm really grateful for your super helpful post! Cheers!
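
For the error-bar case raised in these comments, one possible approach is to pick the colours up front and pass the same colour to each ax.errorbar call. This is a sketch under that assumption and is not necessarily what the linked question does; the data, variable names and the Agg backend line are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n_series = 5
x = np.arange(6)
ys = [i + x for i in range(n_series)]
yerrs = [rng.uniform(0.1, 0.5, size=x.size) for _ in range(n_series)]

# Choose one colour per series before plotting anything.
cmap = plt.get_cmap("gist_ncar")
colors = [tuple(cmap(v)) for v in np.linspace(0, 0.9, n_series)]

fig, ax = plt.subplots()
containers = []
for i, (y, yerr, c) in enumerate(zip(ys, yerrs, colors)):
    # fmt="o" draws markers only; the bars take the marker colour by default.
    containers.append(
        ax.errorbar(x, y, yerr=yerr, fmt="o", color=c, label=str(i + 1))
    )
ax.legend(fontsize="small")
```

Because the colour is fixed at plot time, the markers and their bars stay matched without recolouring ax.collections afterwards.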

MERose · Accepted Answer · 2016-11-23 09:15:18Z

You can always use the plot() function like so:

    import matplotlib.pyplot as plt
    import numpy as np

    x = np.arange(10)
    ys = [i+x+(i*x)**2 for i in range(10)]

    plt.figure()
    for y in ys:
        plt.plot(x, y, 'o')
    plt.show()

[Figure: plot() output drawn as a scatter, with a different color for each series]
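
plot() changes colour on each call because it steps through the axes' property cycle. If the default colours are not the ones you want, the cycle can be replaced before plotting. A minimal sketch, assuming a rainbow colormap sampled once per series (hex_colors and the Agg backend line are illustrative additions):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex
import numpy as np

x = np.arange(10)
ys = [i + x + (i * x) ** 2 for i in range(10)]

# One hex colour per series, sampled evenly from the colormap.
samples = plt.get_cmap("rainbow")(np.linspace(0, 1, len(ys)))
hex_colors = [to_hex(c) for c in samples]

fig, ax = plt.subplots()
ax.set_prop_cycle(color=hex_colors)  # replaces the default colour cycle
lines = [ax.plot(x, y, "o")[0] for y in ys]
```

With as many colours in the cycle as there are series, no colour repeats.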

Hualin · Accepted Answer · 2015-09-18 12:26:25Z

This question was a bit tricky before Jan 2013 and matplotlib 1.3.1 (Aug 2013), which is the oldest stable version you can find on the matplotlib website. After that it is quite trivial.

The present version of matplotlib.pylab.scatter supports assigning an array of colour name strings, an array of float numbers with a colour map, or an array of RGB or RGBA values.

This answer is dedicated to @Oxinabox's endless passion for correcting the 2013 version of myself in 2015.

You have two options for using the scatter command with multiple colours in a single call.

First, the pylab.scatter command supports an RGBA array, so you can use whatever colours you want.
Second, back in early 2013 there was no way to do so, since the command only supported a single colour for the whole scatter point collection. While working on my 10000-line project I figured out a general solution to bypass it. It is very tacky, but it works for whatever shape, colour, size and transparency I need. This trick can also be applied to draw path collections, line collections....

The code is also inspired by the source code of pyplot.scatter; I just duplicated what scatter does without triggering it to draw.

The command pyplot.scatter returns a collection object (a PathCollection in current matplotlib versions); in the file "matplotlib/collections.py" the Collection class has a private variable _facecolors and a method set_facecolors.

So whenever you have scatter points to draw you can do this:

    # rgbaArr is an N*4 array of float numbers (one RGBA row per point)
    # X is an N*2 array of coordinates
    # axx is the axes object currently being drawn; you get it from
    # axx = fig.gca()

    # also import these, to recreate the environment inside the scatter command
    import numpy as np
    import matplotlib.markers as mmarkers
    import matplotlib.transforms as mtransforms
    from matplotlib.collections import PatchCollection
    import matplotlib.patches as mpatches

    # define this function
    # m is a string of scatter marker, it could be 'o', 's' etc.
    # s is the size of the point, use 1.0
    # dpi, get it from axx.figure.dpi
    def addPatch_point(m, s, dpi):
        marker_obj = mmarkers.MarkerStyle(m)
        path = marker_obj.get_path()
        trans = mtransforms.Affine2D().scale(np.sqrt(s*5)*dpi/72.0)
        ptch = mpatches.PathPatch(path, fill = True, transform = trans)
        return ptch

    patches = []
    # markerArr is an array of marker strings, ['o', 's', 'o'...]
    # sizeArr is an array of size floats, [1.0, 1.0, 0.5...]

    for m, s in zip(markerArr, sizeArr):
        patches.append(addPatch_point(m, s, axx.figure.dpi))

    pclt = PatchCollection(
                    patches,
                    offsets = list(zip(X[:,0], X[:,1])),  # list() so it also works on Python 3
                    transOffset = axx.transData)

    pclt.set_transform(mtransforms.IdentityTransform())
    pclt.set_edgecolors('none') # it's up to you
    pclt._facecolors = rgbaArr

    # in the end, when you decide to draw
    axx.add_collection(pclt)
    # and call axx's parent to draw_idle()

So it is kind of complicated to read, and in 2013 I had only been using Python for one year. So why would people want to know how to do it? After I got it working, I never bothered to look at it again. My project was to draw a lot of visualisations, and with the above code the workflow was streamlined.
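
For comparison, the first option mentioned above (a single scatter call with an RGBA array) can be sketched as follows. The random data and the names rgba and sizes are placeholders standing in for rgbaArr and sizeArr; the Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n = 5
X = rng.uniform(0, 10, size=(n, 2))     # N*2 array of coordinates
rgba = rng.uniform(0, 1, size=(n, 4))   # N*4 array, one RGBA row per point
rgba[:, 3] = 0.8                        # shared transparency
sizes = np.linspace(20, 100, n)         # one marker size per point

fig, ax = plt.subplots()
# color= takes one colour per point; s= takes one size per point.
coll = ax.scatter(X[:, 0], X[:, 1], s=sizes, color=rgba)
```

Unlike the PatchCollection workaround, a single scatter call uses one marker shape for all points, so mixed markers would still need one call per marker.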

MdM · Accepted Answer · 2020-03-07 16:58:28Z

A MUCH faster solution for a large dataset and a limited number of colors is to use pandas and its groupby function:

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import time

    # a generic set of data with associated colors
    nsamples=1000
    x=np.random.uniform(0,10,nsamples)
    y=np.random.uniform(0,10,nsamples)
    colors={0:'r',1:'g',2:'b',3:'k'}
    c=[colors[i] for i in np.round(np.random.uniform(0,3,nsamples),0)]

    plt.close('all')

    # "Fast" Scatter plotting
    starttime=time.time()
    # 1) make a dataframe
    df=pd.DataFrame()
    df['x']=x
    df['y']=y
    df['c']=c
    plt.figure()
    # 2) group the dataframe by color and loop
    for g,b in df.groupby(by='c'):
        plt.scatter(b['x'],b['y'],color=g)
    print('Fast execution time:', time.time()-starttime)

    # "Slow" Scatter plotting
    starttime=time.time()
    plt.figure()
    # 2) plot each point with its own scatter call
    for i in range(len(x)):
        plt.scatter(x[i],y[i],color=c[i])
    print('Slow execution time:', time.time()-starttime)

    plt.show()

Thanks for the groupby tip in this context, a world of possibilities, thank you!
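
The same one-call-per-group idea can be sketched without pandas, using a boolean mask per label. The integer labels, the palette list and the Agg backend line are assumptions for illustration, not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
nsamples = 1000
x = rng.uniform(0, 10, nsamples)
y = rng.uniform(0, 10, nsamples)
palette = ["r", "g", "b", "k"]
labels = rng.integers(0, len(palette), nsamples)  # one group id per point

fig, ax = plt.subplots()
for lab in np.unique(labels):
    mask = labels == lab  # select all points belonging to this group
    ax.scatter(x[mask], y[mask], color=palette[lab],
               label="group {}".format(lab))
ax.legend()
```

As with groupby, the number of scatter calls equals the number of groups rather than the number of points.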

Pang · Accepted Answer · 2018-08-07 05:02:28Z


This works for me:

For each series, use a random RGB colour generator:

    c = color[np.random.random_sample(), np.random.random_sample(), np.random.random_sample()]

Answered by bracoo, Aug 7, 2018 at 4:33; edited by Pang, Aug 7, 2018 at 5:02.

I do not know what your color variable is, but with your approach it is possible to do something like plt.scatter(your values to the graph, color=(np.random.random_sample(), np.random.random_sample(), np.random.random_sample())). You mentioned an RGB generator but declared an RGB list; generators are written between '()'.
– Joel Carneiro

i.s. may · Accepted Answer · 2021-05-30 16:39:15Z

You can also create a list containing all the colors you need in your scatter plot and pass it as the color parameter, like this:

    colors = ["red", "blue", "green"]
    plt.scatter(X, Y, color = colors)
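
Since the question mentions that each data set can be given a unique number, another way to sketch this is to pass those numbers through c= and let a colormap turn them into colours. The "tab10" colormap, the vmin/vmax range, the variable names and the Agg backend line are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Three series of four points each, flattened into single x/y arrays.
x = np.tile([1, 2, 3, 4], 3)
y = np.concatenate([[4, 8, 12, 16], [1, 4, 9, 16], [17, 10, 13, 18]])
series_id = np.repeat(np.arange(3), 4)  # 0,0,0,0,1,1,1,1,2,2,2,2

fig, ax = plt.subplots()
# c= takes one number per point; cmap maps each number to a colour.
# vmin/vmax pin ids 0..9 to the ten tab10 colours.
coll = ax.scatter(x, y, c=series_id, cmap="tab10", vmin=0, vmax=9)
```

With this form the colour list does not have to be written out by hand, but it still needs one entry per point, so the series id is repeated for every point in the series.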
