Issues plotting huge scatter plot in python. Is there a better way to do this?

Question

I built a large ndarray of shape (2059263, 2): one column for the x-axis and one for the y-axis. I want to draw a scatter plot with the points colored by category. The category information lives in a separate list of the same length, whose entries describe the corresponding rows of the ndarray. I tried to plot this with matplotlib.pyplot by looping over the points and choosing each color with an if/elif chain, but the loop seems to run forever. Is there another way to handle such a large coordinate array in a scatter plot? Did I make a mistake in my code, or is there a faster way to do this? Perhaps the problem is that the coordinate array and the per-point information are stored separately. My code is below:

    import re
    import matplotlib.pyplot as plt

    cdict = {0 : 'b', 1 : 'c', 2 : 'g', 3 : 'm', 4 : 'r',
             5 : 'y', 6 : 'brown', 7 : 'gold', 8 : 'lightseagreen', 9 : 'indigo',
             10 : 'maroon', 11 : 'cyan', 12 : 'olive', 13 : 'deeppink', 14 : 'sienna',
             15 : 'crimson', 16 : 'peru', 17 : 'lime', 18 : 'navy', 19 : 'orange'}

    count = 1
    for i in range(len(mapping)):
        if count != int(atr_list[i][1]):
            print("wrong sequence")
            print(count)
            print(atr_list[i])
            break
        else:
            attri = re.search('^\d{3}[0-9]', atr_list[i][0])
            if int(attri.group()) < 2001:
                colo = 0
            elif int(attri.group()) > 2000 and int(attri.group()) < 2006:
                colo = 1
            elif int(attri.group()) > 2005 and int(attri.group()) < 2011:
                colo = 2
            elif int(attri.group()) > 2010 and int(attri.group()) < 2016:
                colo = 3
            elif int(attri.group()) > 2015:
                colo = 4
            plt.scatter(mapping[i, 0], mapping[i, 1], c=cdict[colo])
            count += 1

    plt.xlim(mapping[:, 0].min(), mapping[:, 0].max())
    plt.ylim(mapping[:, 1].min(), mapping[:, 1].max())
    plt.xlabel('t-SNE_x')
    plt.ylabel('t-SNE_y')
    plt.show()

Take a look at seaborn hexbin.

– Pierre D, Commented Apr 24, 2021 at 12:51
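For reference, here is a minimal sketch of the hexagonal binning suggested in the comment, using matplotlib's own `Axes.hexbin` (seaborn offers the same idea through `jointplot(kind="hex")`). The synthetic `mapping` array and the `gridsize` value are assumptions standing in for the question's data. Note that binning shows point density, so the per-point category colors are lost.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the question's (N, 2) coordinate array (an assumption).
rng = np.random.default_rng(0)
mapping = rng.normal(size=(200_000, 2))

fig, ax = plt.subplots()
# Aggregate the points into hexagonal bins; mincnt=1 hides empty bins.
hb = ax.hexbin(mapping[:, 0], mapping[:, 1], gridsize=80, mincnt=1, cmap="viridis")
fig.colorbar(hb, ax=ax, label="points per bin")
ax.set_xlabel("t-SNE_x")
ax.set_ylabel("t-SNE_y")
```

The cost of drawing depends on the number of bins rather than the number of points, which is why this kind of plot stays responsive for millions of rows.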

유승환 · Accepted Answer · 2021-04-26 05:41:36Z

I figured out that it was my own coding mistake. I should have called plt.scatter() once, outside the loop, rather than inside it. Because I was calling it about two million times, one point per call, it only looked like an infinite loop. I was stupid.
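A minimal sketch of that fix, assuming `mapping` is an (N, 2) array and each `atr_list` entry is a (label, sequence number) pair whose label starts with a four-digit year, as in the question. The synthetic data and the helper name `year_to_color_index` are made up for illustration. The loop now only computes a color index per point, and `scatter` is called once for all points.

```python
import re

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap

# Synthetic stand-ins for the question's data (assumptions, not the real data).
rng = np.random.default_rng(0)
n_points = 10_000
mapping = rng.normal(size=(n_points, 2))
years = rng.integers(1995, 2021, size=n_points)
atr_list = [(f"{y}-sample", str(i + 1)) for i, y in enumerate(years)]

# The five colors the question actually uses (keys 0-4 of its cdict).
palette = ListedColormap(['b', 'c', 'g', 'm', 'r'])
# Bin edges matching the if/elif chain:
# <2001, 2001-2005, 2006-2010, 2011-2015, >2015
edges = np.array([2001, 2006, 2011, 2016])


def year_to_color_index(labels):
    """Hypothetical helper: map labels that start with a 4-digit year to 0..4."""
    parsed = np.array([int(re.match(r'\d{4}', s).group()) for s in labels])
    return np.digitize(parsed, edges)


color_idx = year_to_color_index([entry[0] for entry in atr_list])

fig, ax = plt.subplots()
# One scatter call for every point; vmin/vmax center each integer on its color.
ax.scatter(mapping[:, 0], mapping[:, 1], c=color_idx, cmap=palette,
           vmin=-0.5, vmax=4.5, s=1)
ax.set_xlim(mapping[:, 0].min(), mapping[:, 0].max())
ax.set_ylim(mapping[:, 1].min(), mapping[:, 1].max())
ax.set_xlabel('t-SNE_x')
ax.set_ylabel('t-SNE_y')
```

In an interactive session, drop the `matplotlib.use("Agg")` line and finish with `plt.show()`. A small marker size such as `s=1` is a judgment call that tends to help when millions of points overlap.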
