
I have an array of 500000 samples, i.e. the data has shape (500000, 3): the first two columns are the x- and y-coordinates, and the third column is the label value of the data point at (x, y).

For example: data = [[20, 10, 12.3320], [22, 13, 230.221], ....., [..]]

I tried the method below, but it is far too slow and the resulting plot is hard to interpret.

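    import matplotlib.pyplot as plt

    # labels holds one integer class index per row of data
    colors = 10 * ['r', 'g', 'b', 'c', 'k', 'y', 'm']
    for i in range(len(labels)):
        # the color must be passed as c=; the third positional argument of scatter() is the marker size
        plt.scatter(data[i][0], data[i][1], c=colors[labels[i]], marker='.')
    plt.show()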

Is there another method, such as imshow(), that is suitable for this data and gives a plot that is easier to interpret?

  • In order to use imshow, the data must be equally spaced on a grid. Is this the case? Can you tell us more about how your data is structured in the columns? Commented Feb 1, 2017 at 23:08
  • The data structure is like this array([[ 0.19975574, 0.10402092, 0.00029645], [ 0.19975574, 0.10727158, 0.00029645], [ 0.19975574, 0.11052223, 0.00029645], [ 0.19975574, 0.11377289, 0.00029645], [ 0.19975574, 0.11702354, 0.00029645], [ 0.19975574, 0.12027419, 0.00029645], [ 0.19975574, 0.12352485, 0.00029645], [ 0.19975574, 0.1267755 , 0.00029645], [ 0.19975574, 0.13002616, 0.00029645], [ 0.19975574, 0.13327681, 0.00029645],...........]) Commented Feb 2, 2017 at 10:04
  • The data is scaled to have unit variance along each axis, which is why it looks like the above. Commented Feb 2, 2017 at 10:06
  • Don't put your data into the comments. Also, you can answer questions from the comment section simply by editing your question. Showing the original data makes things a bit complicated; to show the structure, use some other data, in the sense of a minimal reproducible example. Commented Feb 2, 2017 at 10:11
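
To make the grid requirement from the comments concrete, here is a minimal sketch of how (x, y, value) rows could be rearranged into a 2D array for imshow. It assumes every point lies exactly on a regular grid; `points_to_image` and `is_evenly_spaced` are hypothetical helper names, and the demo grid is made up for illustration. Note that `np.unique` compares floats exactly, so coordinates that differ only by rounding error would end up in separate rows or columns.

```python
import numpy as np

def points_to_image(data):
    """Arrange (x, y, value) rows into a 2D array, assuming gridded points."""
    data = np.asarray(data, dtype=float)
    # Sorted unique coordinates per axis, plus each row's index into them
    xs, ix = np.unique(data[:, 0], return_inverse=True)
    ys, iy = np.unique(data[:, 1], return_inverse=True)
    image = np.full((ys.size, xs.size), np.nan)  # NaN marks cells with no sample
    image[iy.ravel(), ix.ravel()] = data[:, 2]   # rows follow y, columns follow x
    return xs, ys, image

def is_evenly_spaced(values):
    """True if consecutive sorted coordinates are (almost) equally spaced."""
    steps = np.diff(values)
    return steps.size < 2 or bool(np.allclose(steps, steps[0]))

# Made-up demo data: a 3 x 4 grid flattened into (x, y, value) rows
gx, gy = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 2, 3))
demo = np.column_stack([gx.ravel(), gy.ravel(), (gx + 10 * gy).ravel()])

xs, ys, image = points_to_image(demo)
grid_ok = is_evenly_spaced(xs) and is_evenly_spaced(ys)
```

If `grid_ok` is true, the array can be shown with something like `plt.imshow(image, origin="lower", extent=(xs[0], xs[-1], ys[0], ys[-1]), aspect="auto")`. If the points are not on a grid, this approach does not apply without resampling or binning first.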

1 Answer


The scatter function in matplotlib is quite slow. I would recommend using vispy, which uses the GPU to plot a large number of points.

This works with vispy 0.4.0, which you can install with pip or conda:

pip install vispy 

Here is the code (plotted in less than 2 seconds on my computer):

    import numpy as np
    from vispy import scene, visuals, app
    import matplotlib.pyplot as plt

    data = np.random.random((500000, 3))

    canvas = scene.SceneCanvas(keys='interactive', show=True)
    view = canvas.central_widget.add_view()

    # Create the scatter plot
    scatter = scene.visuals.Markers()
    scatter.set_data(data[:, :2], face_color=plt.cm.jet(data[:, 2]))
    view.add(scatter)

    view.camera = scene.PanZoomCamera(aspect=1)
    view.camera.set_range()

    app.run()

There is nice documentation for vispy, and you can customize your plot in the set_data function with arguments like face_color, edge_color, size, edge_width, symbol, ...
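
As one possible use of face_color, here is a minimal sketch that builds a per-point RGBA array from discrete class labels with a NumPy lookup. It assumes the labels are integer class indices (as the colors list in the question suggests); the palette values, the `label_colors` helper, and the random demo labels are all made up for illustration.

```python
import numpy as np

# Hypothetical palette: one RGBA row per class (red, green, blue, black)
palette = np.array([
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 0.6, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
])

def label_colors(labels, palette):
    """Map integer labels to RGBA rows, cycling when labels exceed the palette."""
    labels = np.asarray(labels, dtype=int)
    return palette[labels % len(palette)]  # one fancy-indexing lookup, no Python loop

# Made-up demo labels standing in for the third column of the data
rng = np.random.default_rng(0)
labels = rng.integers(0, 7, size=1000)
face_color = label_colors(labels, palette)  # shape (1000, 4)
```

The result could then be passed along the lines of `scatter.set_data(data[:, :2], face_color=face_color, edge_width=0, size=3)`. The modulo mirrors the repeated colors list in the question, so several classes share a color once the palette runs out.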

Good luck with your data visualization ;)


3 Comments

Note: if you get a black screen with no markers, there's an issue in vispy: github.com/vispy/vispy/issues/1085
Hi there, I ran this code in an IPython notebook and nothing was displayed. Do you know what the issue is?
Try launching it as a Python script rather than in an IPython notebook :)
