minor formatting
Matt

I have several thousand points. I'd like to group them into clusters, either a user-defined number of clusters or some kind of "best" number of clusters, based for example on their relative distances. Which algorithms are available in PostGIS and/or Shapely (preferred) to do that? I could simply buffer the points and union the buffers, but I wonder whether there are more advanced techniques that are computationally efficient on large datasets.


Edit:

I've found the DBSCAN algorithm [1] interesting, and it seems rather simple to integrate into existing code. There is also the OPTICS algorithm [2], but a ready-made solution for it seems harder to find.

[1] https://en.wikipedia.org/wiki/DBSCAN
[2] https://en.wikipedia.org/wiki/OPTICS_algorithm

Anyway, I'm stuck with DBSCAN now.

How can I store the label of the cluster each point falls in, in a new column cluster of a PostgreSQL table? Actually, the point IDs are not stored along with their coordinates within the clusters...
Matching coordinates back against the original array to retrieve the point IDs would be overkill, in my opinion.
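
To make the ID problem concrete, here is a minimal sketch (not a definitive implementation) of a naive DBSCAN that returns one label per input point in input order, so the IDs can be kept in a parallel list and no coordinate matching is needed. Everything named here is an assumption for illustration: the sample rows, the `eps` and `min_pts` values, and the hypothetical table `points` with columns `id` and `cluster`. The neighbour search is a plain O(n²) scan, so it only shows the bookkeeping and is not tuned for large datasets.

```python
from math import hypot

NOISE = -1  # label used for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Naive DBSCAN: returns one label per point, in input order."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        xi, yi = points[i]
        return [j for j in range(n)
                if hypot(points[j][0] - xi, points[j][1] - yi) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical rows, e.g. the result of selecting id, x, y from the table.
rows = [(101, 0.0, 0.0), (102, 0.5, 0.0), (103, 0.0, 0.5),
        (201, 10.0, 10.0), (202, 10.5, 10.0), (203, 10.0, 10.5),
        (301, 50.0, 50.0)]

ids = [r[0] for r in rows]
coords = [(r[1], r[2]) for r in rows]
labels = dbscan(coords, eps=1.0, min_pts=3)

# labels[k] belongs to ids[k], because the input order is preserved.
id_to_cluster = dict(zip(ids, labels))

# Parameter tuples for a parameterised statement such as
#   UPDATE points SET cluster = %s WHERE id = %s   (hypothetical table/columns)
update_params = [(label, pid) for pid, label in id_to_cluster.items()]
```

The only design point that matters here is that the ID list and the coordinate list are built from the same rows in the same order; the resulting `(label, id)` tuples could then be handed to whatever database driver is in use.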

precisions about DBSCAN algo
swiss_knight

I have several thousand points. I'd like to group them into clusters, either a user-defined number of clusters or some kind of "best" number of clusters, based for example on their relative distances. Which algorithms are available in PostGIS and/or Shapely (preferred) to do that? I could simply buffer the points and union the buffers, but I wonder whether there are more advanced techniques that are computationally efficient on large datasets.


Edit:

I've found the DBSCAN algorithm [1] interesting, and it seems rather simple to integrate into existing code. There is also the OPTICS algorithm [2], but a ready-made solution for it seems harder to find.

[1] https://en.wikipedia.org/wiki/DBSCAN
[2] https://en.wikipedia.org/wiki/OPTICS_algorithm

Anyway, I'm stuck with DBSCAN now:
how can I store the label of the cluster each point falls in, in a new column "cluster" of a PostgreSQL table? Actually, the point IDs are not stored along with their coordinates within the clusters...
Matching coordinates back against the original array to retrieve the point IDs would be overkill, in my opinion.

swiss_knight

Clustering geographic points

I have several thousand points. I'd like to group them into clusters, either a user-defined number of clusters or some kind of "best" number of clusters, based for example on their relative distances. Which algorithms are available in PostGIS and/or Shapely (preferred) to do that? I could simply buffer the points and union the buffers, but I wonder whether there are more advanced techniques that are computationally efficient on large datasets.