I am trying to generate random points for an N×M dataset (N rows, M dimensions), using the minimum of each of the M dimensions as the lower bound and the maximum of each dimension as the upper bound.
Here is the code:

    import numpy as np

    def generate_random_points(dataset, dimension_based=False):
        n_rows, dimension = dataset.shape
        if not dimension_based:
            # Use the smaller of floor(sqrt(dimensions)) and floor(sqrt(rows))
            row_size = int(np.floor(np.sqrt(min(dimension, n_rows))))
        else:
            row_size = int(np.floor(np.sqrt(dimension)))
        # Sample each dimension uniformly between its own min and max
        generated_spikes = np.random.uniform(low=np.min(dataset, axis=0),
                                             high=np.max(dataset, axis=0),
                                             size=(row_size, dimension))
        return generated_spikes

But the problem is that most of the random points lie on the boundaries or edges of the dataset's space rather than being uniformly and evenly distributed.
Here is a plot of one example; the random points are the black ones:
I have also tried running PCA first, taking the low and high ranges in PCA space, and mapping those two range vectors back with inverse_transform. As I half expected, the random points are still not uniformly and evenly distributed:

    def generate_random_points(dataset, dimension_based=False):
        n_rows, dimension = dataset.shape
        # PCA cannot have more components than min(rows, columns)
        dimension_pca = min(n_rows, dimension)
        # perform_PCA and perform_PCA_inverse are helper functions not shown here
        pca, dataset_pca = perform_PCA(dimension_pca, dataset)
        low_pca = np.min(dataset_pca, axis=0)
        high_pca = np.max(dataset_pca, axis=0)
        low = perform_PCA_inverse(pca, low_pca)
        high = perform_PCA_inverse(pca, high_pca)
        if not dimension_based:
            row_size = int(np.floor(np.sqrt(min(dimension, n_rows))))
            generated_spikes = np.random.uniform(low=low, high=high,
                                                 size=(row_size, dimension))
        else:
            row_size = int(np.floor(np.sqrt(dimension)))
            generated_spikes = np.random.uniform(low=np.min(dataset, axis=0),
                                                 high=np.max(dataset, axis=0),
                                                 size=(row_size, dimension))
        return generated_spikes

How can I solve this so that the randomly generated points are more evenly distributed, instead of piling up on two edges, and also do not overlap?
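One variation that might be worth trying is to draw the uniform samples inside the PCA space and inverse-transform the sampled points themselves, rather than inverse-transforming only the low and high vectors. A minimal sketch, assuming a plain NumPy SVD in place of perform_PCA (the function name sample_in_pca_space and the n_points and seed parameters are hypothetical, not part of the code above):

```python
import numpy as np

def sample_in_pca_space(dataset, n_points, seed=None):
    """Sample uniformly inside the data's bounding box in PCA coordinates,
    then map the samples back to the original feature space."""
    rng = np.random.default_rng(seed)
    mean = dataset.mean(axis=0)
    centered = dataset - mean
    # Thin SVD: the rows of vt are the principal axes (orthonormal)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s  # dataset expressed in PCA coordinates
    low = scores.min(axis=0)
    high = scores.max(axis=0)
    # Uniform samples inside the per-component [low, high] box
    samples_pca = rng.uniform(low, high, size=(n_points, scores.shape[1]))
    # Inverse transform of the sampled points, not of the bounds
    return samples_pca @ vt + mean
```

By construction these samples are uniform over the bounding box of the data in PCA coordinates, so in a plot of the first two components (from the same PCA fit) they should be spread over that rectangle. Whether that box matches the actual shape of the data is a separate question.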
I need something like this:
The red points mark the positions where I want the crossed-out black points to be.
P.S.:
Both images are PCA representations of a dataset with shape (46, 2730), i.e. 46 rows and 2730 dimensions.
I was thinking of using the 2nd answer to this question: "algorithm for generating uniformly distributed random points on the N-sphere". But I am not sure how to calculate the radius (R) for an N-dimensional dataset, or whether doing so even makes sense, so I do not know if I can apply that answer here.
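For reference, a rough sketch of sampling uniformly inside an N-dimensional ball using the standard technique of normalized Gaussian directions with radii scaled by U^(1/N). The choice of R here is only an assumption (the largest distance from the dataset centroid to any data point), and the function name sample_in_ball and its parameters are hypothetical:

```python
import numpy as np

def sample_in_ball(dataset, n_points, seed=None):
    """Sample points uniformly inside a ball centered on the data centroid."""
    rng = np.random.default_rng(seed)
    n_dims = dataset.shape[1]
    center = dataset.mean(axis=0)
    # Assumption: R is the largest centroid-to-point distance in the data
    radius = np.linalg.norm(dataset - center, axis=1).max()
    # Normalized Gaussian vectors give uniformly distributed directions
    directions = rng.normal(size=(n_points, n_dims))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Scaling by U**(1/n) makes the points uniform in volume, not in radius
    radii = radius * rng.uniform(size=(n_points, 1)) ** (1.0 / n_dims)
    return center + radii * directions
```

One caveat: with 2730 dimensions almost all of a ball's volume sits close to its surface, so points that are uniform in the full space may still not look evenly spread in a 2-D PCA plot.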
Please help!

[-50, 85] x [-50, 85]. Though maybe you meant distributed like the points you have in your picture. In that case you would have to give me coordinates of the points.