1

First time poster, couldn't find anything that totally solved my issue.

I'm working on a simulation for galactic colonisation for my masters project. A thing I'm trying to do is look a the voids of uncolonised stars left over after the simulation ends and to see if there is clustering behaviour past statistical fluctuations. As it is a monte-carlo numerical problem, a correlation function is no really appropriate so I am using the counts-in-cells method usually employed to look at galaxy clusters.

So I am working in cartesians

data = np.genfromtxt('counts.csv') # positions of uncolonsed stars x = data[:,0] y = data[:,1] z = data[:,2] 

What I want to do is use boxes of varying sizes to count the number of stars inside the box and compare to what the mean should be and do statistics on the results.

The direction I'm going in is looking at some sort of 3D histogram such as the bubble plot seen here. I tried this out and it doesn't seem to be binning all my data and I'm not sure why, i.e, the 'floor' of the cube has 'bubbles' but much of the 'roof' has nothing:

3D Bubble Histogram

This is clearly wrong when you look at the plotted raw star field:

Plotted star field

It looks like the bins at higher z values aren't holding any data. This is probably a pretty straightforward problemm but I am in need of some fresh eyes and minds that are better at python than I am.

Can anyone think as to how this is can be fixed? Also I'd like to find a way to count the number of points per box, i.e per bin.

I'm sorry if I'm being a bit dim but I appreciate any help any of you fine fellows can offer me.

Thanks chums!

4
  • The code in the answer you linked looks overly complicated to me, maybe I just don't understand what they are trying to do. To compute histograms on any dimension just use numpy.histogramdd. Commented Dec 9, 2016 at 13:49
  • You can run a KDE on your sample and visualize it through volume rendering. See, for example, here. Commented Dec 9, 2016 at 14:45
  • The KDE treatment looks really good, thank you Vlas. Only thing left for me to do is count number of points per bin. Confused as to why scipy.stats.binned_statistic_dd is coming up with errors like: TypeError: binned_statistic_dd() takes at least 2 arguments (5 given) Commented Dec 9, 2016 at 22:32
  • 1
    @Skippeh Difficult to tell what's wrong with code we can't see. As I already said you can count points per bin using numpy.histogramdd. Commented Dec 10, 2016 at 22:27

1 Answer 1

1

In the comments you had some alternatives to solve your problem and its difficult to say what is wrong with your code without seeing it. In any case this kind of problem is typically solved by counting data inside a regular grid (which is, nevertheless, a general approach to do an histogram).

The advantage in building your own grid is that you know immediately where every "sector" is, where it starts and where it ends. As so I would suggest the following approach as an alternative if you want to try it.

import numpy as np from mpl_toolkits.mplot3d import Axes3D import matplotlib.pyplot as plt # Generating some random data. data = np.random.randint(0, 100, (1000,3)) x, y, z = data[:, 0], data[:, 1], data[:, 2] # Generating raw view fig = plt.figure() ax = fig.add_subplot(111, projection='3d') ax.scatter(x, y, z, marker='+', s=25, c='r') plt.show() # Generating some grid with origin, cell size, and number of cells 10 10 10 numx, numy, numz = 5, 5, 5 origx, origy, origz = 0, 0, 0 sizex, sizey, sizez = 20, 20, 20 grid = np.vstack(np.meshgrid(range(numx), range(numy), range(numz))).reshape(3, -1).T gx, gy, gz = grid[:, 0]*sizex + origx, grid[:, 1]*sizey + origy, grid[:, 2]*sizez + origz # Calculating the number of stars in each cell: ix = ((x - origx)/sizex).astype(int) iy = ((y - origy)/sizey).astype(int) iz = ((z - origz)/sizez).astype(int) s = np.zeros((numx, numy, numz)) for i in range(ix.shape[0]): s[ix[i], iy[i], iz[i]] = s[ix[i], iy[i], iz[i]] + 1 s = s.flatten() mask = s > 0 # Plotting the result fig = plt.figure() ax = fig.add_subplot(111, projection='3d') ax.scatter(gx[mask], gy[mask], gz[mask], marker='o', s=s[mask]*100, c='b', edgecolor ="r") plt.show() 

The result for randomized data is this:

bubble histogram in matplotlib

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.