Couldn't find an answer to my question.

I have the following code which generates the scatter plot below.

scatter_matrix(iris_ds)
plt.show()

[Image: scatter matrix plot of iris_ds]

However, I can't seem to change the colour of the points in the plots so that the different sets of data points can be told apart.

Any suggestions?

Edit: for clarity - there are 3 sets of data points in each scatter plot box. I was wondering if there is a way to:

  • Change the colour from blue?
  • Change the colours depending on where on the plots the data points appear?
  • First and foremost, you should read How to create a Minimal, Complete, and Verifiable example. Next, you should clarify which points you mean when you say "in order to distinguish the data points". There are 12 subplots with scatter points. Commented Apr 27, 2019 at 18:24
  • Thought I was quite clear, but will clarify. Commented Apr 27, 2019 at 18:27
  • seaborn.pydata.org/examples/scatterplot_matrix.html coincidentally uses the exact same dataset, plotted in multiple colours. Commented Apr 27, 2019 at 18:47
  • @warped except I'm using Pandas and Matplotlib Commented Apr 27, 2019 at 18:57
  • @Clauric just curious, why can't you use sns? It fits your need in just one line of code. Commented Apr 27, 2019 at 20:55

1 Answer

If you look at the source of pd.plotting.scatter_matrix:

def scatter_matrix(frame, alpha=0.5, figsize=None, ax=None, grid=False,
                   diagonal='hist', marker='.', density_kwds=None,
                   hist_kwds=None, range_padding=0.05, **kwds):  # <---
    [...]
            # Deal with the diagonal by drawing a histogram there.
            if diagonal == 'hist':
                ax.hist(values, **hist_kwds)  # <---
    [...]
            else:
                common = (mask[a] & mask[b]).values
                ax.scatter(df[b][common], df[a][common],
                           marker=marker, alpha=alpha, **kwds)  # <---

you can see that the function accepts **kwds and passes them on to ax.scatter.

So you can either pass the colours directly:

colors = iris['species'].replace({'setosa': 'red', 'virginica': 'green', 'versicolor': 'blue'})
pd.plotting.scatter_matrix(iris, c=colors);

Or you can convert the species to numbers and use a colormap:

colors = iris['species'].replace({'setosa': 1, 'virginica': 2, 'versicolor': 3})
pd.plotting.scatter_matrix(iris, c=colors, cmap='viridis');
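If you want something that runs without loading the iris data first, here is a minimal sketch of the same idea. The `demo` frame and its values are made up for illustration (a stand-in, not the real iris measurements), `palette` is just a name chosen here, and `.map` is used instead of `.replace` to build the colour list.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the iris data: three species, two numeric columns.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "sepal_length": rng.normal(5.8, 0.8, 30),
    "petal_length": rng.normal(3.7, 1.7, 30),
    "species": np.repeat(["setosa", "versicolor", "virginica"], 10),
})

# One colour name per row, looked up from the species label.
palette = {"setosa": "red", "versicolor": "blue", "virginica": "green"}
colors = demo["species"].map(palette).tolist()

# c= is not a scatter_matrix parameter; it travels through **kwds to ax.scatter.
axes = pd.plotting.scatter_matrix(
    demo.drop(columns="species"), c=colors, alpha=0.8
)
```

Call plt.show() afterwards to display the figure. Each off-diagonal panel should then show the three groups in red, blue and green.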

Further, the function takes density_kwds and hist_kwds and passes them to ax.plot and ax.hist, respectively. So you can change the colour of the histograms by passing a dictionary. The same goes for the KDE plots:

pd.plotting.scatter_matrix(iris, hist_kwds={'color':'red'}) 
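For the KDE case, a rough sketch might look like the following. It assumes SciPy is installed (pandas needs it to draw the density curves), and the `demo` frame is again made-up data standing in for iris.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical numeric data standing in for the iris measurements.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "sepal_length": rng.normal(5.8, 0.8, 50),
    "petal_length": rng.normal(3.7, 1.7, 50),
})

# diagonal="kde" draws density curves on the diagonal;
# density_kwds is forwarded to the ax.plot call that draws them.
axes = pd.plotting.scatter_matrix(
    demo,
    diagonal="kde",
    density_kwds={"color": "red", "linewidth": 2},
)
```

The diagonal panels should then show red density curves, while the scatter panels keep their default colour unless you also pass c= as shown earlier.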
1 Comment

Same info in the docs.