7

I have a PostGIS (in postgres 9.3) table which has Points. I want to figure out the "clusters" in the data. If 2 points are < X distance apart, they should be added to a cluster, and this should be done recursively. In essense, a cluster is a subset of the points, where each point is within X distance of at least one other point in the cluster.

However I cannot think of any way to do this with a PostgreSQL query. Is there any single query to do this, or do I need to write a programme that will externally query the data, and build up the clusters, and continue to query for who knows how many steps?

Amandasaurus
  • 892
  • 3
  • 10
  • 20
  • I think you should look at this question which look way similar :http://gis.stackexchange.com/questions/11567/spatial-clustering-with-postgis. However I'm not sure if this must be tagged as duplicate as all answer are pretty old and maybe with nowadays PostgreSQL one might develop a new approach and overcome limitations mentioned about the accepted answer. – MarHoff Jan 14 '16 at 14:10
  • Whhooohooo thx for all answers bellow can't wait to upgrade to 2.2, new functions look so great! – MarHoff Jan 14 '16 at 22:57

2 Answers2

6

If you can updgrade to postgis 2.2.0 you might be lucky because that feature has just been introduced. From the doc: "ST_ClusterWithin is an aggregate function that returns an array of GeometryCollections, where each GeometryCollection represents a set of geometries separated by no more than the specified distance."

Check this: http://postgis.net/docs/manual-2.2/ST_ClusterWithin.html

tilt
  • 3,269
  • 13
  • 24
  • 3
    Go for PostGIS 2.2.1 if at all possible; the performance of the clustering functions in this release is far better than in 2.2.0. – dbaston Jan 14 '16 at 19:06
5

This technique will work in versions of PostGIS < 2.2.0. Create cluster polygons by buffering the points by half your cluster distance, assign a cluster ID, then transfer the cluster ID to the points the polygons intersect:

--Buffer all points by the half the cluster distance (in this case 40 metres),
--merge into a multipolygon using st_union(), 
--then separate into single parts with st_dump() and .geom

create table clusterpolys1 as
select (st_dump(st_union(st_buffer(geom, 40)))).geom from points1;

--Add a unique cluster ID to the cluster polygons
alter table clusterpolys1 add column cluster_id serial;

--Transfer the cluster ID to the points
alter table points1 add column cluster_id integer;

update points1 a
set cluster_id = b.cluster_id
from  clusterpolys1 b
where st_intersects(a.geom, b.geom);

enter image description here

chill
  • 81
  • 2