Only union/dissolve intersecting or adjacent features to speed up query

Question

I have a table with about 300k polygons, some are overlapping or adjacent to each other and some are islands of their own. I need to dissolve the overlapping/adjacent ones and also keep the non-overlapping ones. I only need the geometries, no attributes. I can do this with:

create table polygons123_dissolved as
(select (st_dump(st_union(polygons123.wkb_geometry))).geom as geom
from polygons123)

Example:

But this is very slow, more than an hour. How can i speed it up?

Explain output:

Result  (cost=37684.32..37949.59 rows=1000 width=32)
  ->  ProjectSet  (cost=37684.32..37689.59 rows=1000 width=32)
        ->  Aggregate  (cost=37684.32..37684.33 rows=1 width=32)
              ->  Seq Scan on polygons123  (cost=0.00..36538.54 rows=229155 width=849)
                    Filter: ((group_id = 1) AND (type_id = ANY ('{1,2,3}'::integer[])))

You could do a sub-query to find all polygons that intersect another one and then use this in a ST_Union(geom) WHERE id IN (ids) where your ids come from the subquery. However, I think ST_Union already uses spatial indexing under the hood, see this article from Paul Ramsey, so it is not clear to me that a subquery will actually be any quicker. Depending on the polygon complexity and how many intersect, 20 minutes for 300k is not necessarily that awful. Perhaps post your EXPLAIN output. — John Powell, Nov 07 '18 at 08:42
the first step of the dissolve consists in finding overlapping polygons, so I don't think that you will gain much by splitting the steps, except if you reuse the dissolve — radouxju, Nov 07 '18 at 08:42
Check your source data for self-intersections, this is often a mistake... — Cyril Mikhalchenko, Nov 07 '18 at 18:12
And show your EXPLAIN. Runtimes are not that helpful, as they vary so much depending on hardware and polygon complexity. — John Powell, Nov 07 '18 at 18:25

score 11 · Accepted Answer · edited Nov 20 '20 at 13:25

I would look into using ST_ClusterDBSCAN. I have had tremendous success using this function to solve many cluster like geometric problems.

WITH clusters 
  AS(select st_clusterdbscan(geom, 0, 2) over() cluster_id, geom from table
  )
select st_union(geom) geom from clusters
where cluster_id is not null
group by cluster_id
union
select geom from clusters where cluster_id is null

In this example the clusters cte will assign an id to each polygon that intersects another polygon with a distance of 0. Essentially any polygons that intersect another polygon will be added to a unique cluster group. the third parameter specifies how many intersecting polygons there needs to be for the cluster to be created. I chose 2 for this example. so a polygon that is alone will result in a null value, you could wrap COALESCE around this whole function and assign a 0 to the single polygons.

after the cte you should simply st_union the geometries that are not null and make sure to group by the cluster id number. then merge/union with the rest of the single polygons.

Only union/dissolve intersecting or adjacent features to speed up query

1 Answers1

Linked