Dissolve/Union a large number (1.2m) of polygons in PostGIS

Question

I am very new to PostGIS. I read that it should be faster than QGIS, so I thought I'd give it a go. I want to completely dissolve a shapefile of 1.2m polygons, in the same way that the built-in QGIS Dissolve function works.

This is my current code, which is very basic (filetodissolve is the table):

SELECT ST_Union(geom)
FROM filetodissolve f;

I've been running this for 1h 30m now and it is showing no sign of stopping. Is there any way to speed this up?

@BERA What code do I need to change in that answer? Just where it says 'table'? — Melanie Baker, Feb 02 '23 at 14:44
Yes, change it to the name of your table, including the schema, for example public.yourtablename — BERA, Feb 02 '23 at 14:45
If you do not need an extremely accurate result, try this trick: https://gis.stackexchange.com/questions/222976/cleaning-large-shapefile-using-v-clean-in-order-to-dissolve-features. — user30184, Feb 02 '23 at 14:48
"PostGIS is faster than QGIS" is fake news, or at least sufficiently stark and unnuanced to not address reality. — Vince, Feb 02 '23 at 14:54
@Vince Just looking for a quicker method/a method that actually works. Happy to hear other options. — Melanie Baker, Feb 02 '23 at 14:56
@BERA will the code make much difference if most of my polygons overlap? — Melanie Baker, Feb 02 '23 at 15:31
I found a suggestion to use ST_Buffer(ST_Collect(wkb_geometry), 0) in some old comment. That might also be worth trying. Using SET work_mem=50000; to give the query more memory was also suggested. If you test with 10000 or 100000 features you will get preliminary results faster. — user30184, Feb 02 '23 at 15:55
Before adding a spatial index it might be good to check if it is missing https://gis.stackexchange.com/questions/241599/finding-postgis-tables-that-are-missing-indexes. — user30184, Feb 02 '23 at 16:37
You haven't talked about the complexity of the polygons to be unioned, nor about their connectivity. If processing in one go is too much, you have to break it down into smaller batches. This could be via clusters as shown by BERA, or by using a grid and computing the union for each quadrant. The more complex the polygons, the smaller the quadrant should be. Once done, do it again using the previously unioned polygons and a bigger quadrant. But in any case it is very important to work on nearby geometries. — JGH, Feb 03 '23 at 14:31
What version of PostGIS are you using? The more recent versions of PostGIS/GEOS might provide faster unioning, due to some improvements in the implementation. — dr_jts, Feb 03 '23 at 16:39
Is there any way you can share the data? I'm curious to see what it looks like, and experiment with the union. — dr_jts, Feb 03 '23 at 16:43
@dr_jts I don't think I can share the data due to its owner's data agreements unfortunately, but thank you! — Melanie Baker, Feb 06 '23 at 08:14
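
The grid-based batching suggested in the comments above could look roughly like the sketch below. It does not come from any of the commenters. The output table names, the cell size of 50000 and the use of ST_Centroid to assign each polygon to exactly one cell are all assumptions, and the cell size presumes a projected coordinate system with metre units.

```sql
-- Sketch only: filetodissolve_pass1 and filetodissolve_dissolved are
-- hypothetical table names, and 50000 is an arbitrary starting cell size.

-- Pass 1: tag every polygon with the grid cell that contains its centroid,
-- then union the polygons cell by cell so each ST_Union call sees only
-- nearby geometries.
CREATE TABLE filetodissolve_pass1 AS
SELECT ST_Union(geom) AS geom
FROM (
    SELECT geom,
           floor(ST_X(ST_Centroid(geom)) / 50000) AS cell_x,
           floor(ST_Y(ST_Centroid(geom)) / 50000) AS cell_y
    FROM filetodissolve
) AS tagged
GROUP BY cell_x, cell_y;

-- Pass 2: union the per-cell results, which are far fewer rows.
CREATE TABLE filetodissolve_dissolved AS
SELECT ST_Union(geom) AS geom
FROM filetodissolve_pass1;
```

If pass 2 is still too slow, the same grouping could be repeated on filetodissolve_pass1 with a larger cell size before the final union, as the comment describes.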

BERA · Accepted Answer · 2023-02-02T18:52:05.507

5

This is basically the same answer as the one to this question. It uses ST_ClusterDBSCAN to assign an id to each cluster of intersecting/adjacent polygons, then unions the polygons grouped by that id:

--Make sure you have a spatial index
create index table123_index on test.table123 using GIST(geom);

create table test.table123_dissolved as
with clusters
    --eps = 0 and minpoints = 2: polygons that touch or intersect share a cluster id
    as (select st_clusterdbscan(geom,0,2) over() cluster_id, geom from test.table123)
select st_union(geom) geom
from clusters
where cluster_id is not null --Where there are adjacent polygons that have been assigned a cluster id
group by cluster_id
union all --union all skips the duplicate check that a plain union would run on the geometries
select geom from clusters where cluster_id is null --Polygons that are separate from all others get no cluster id
;
--Add an id column to the result
alter table test.table123_dissolved add column id serial;

With my test data it finishes in 120 s for 3 million features

(I canceled Dissolve in QGIS after 20 min/35 % finished.)
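
To check that this approach copes with your data before committing to a full run, the same query can be timed on a spatial subset first. The following is a minimal sketch, not part of the tested code above: the bounding box coordinates and SRID 27700 are placeholder assumptions and must be replaced with values that match your own table.

```sql
-- Sketch only: the envelope and SRID below are hypothetical. The SRID passed
-- to ST_MakeEnvelope must be the same as the SRID of the geom column.
EXPLAIN ANALYZE
WITH subset AS (
    -- && is a bounding-box test, so it can use the GIST index
    SELECT geom
    FROM test.table123
    WHERE geom && ST_MakeEnvelope(400000, 300000, 450000, 350000, 27700)
),
clusters AS (
    SELECT ST_ClusterDBSCAN(geom, 0, 2) OVER () AS cluster_id, geom
    FROM subset
)
SELECT ST_Union(geom) AS geom
FROM clusters
WHERE cluster_id IS NOT NULL
GROUP BY cluster_id
UNION ALL
SELECT geom
FROM clusters
WHERE cluster_id IS NULL;
```

EXPLAIN ANALYZE runs the query and reports the actual time spent, so widening the envelope step by step gives a rough idea of how the run time grows with the number of polygons.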

edited Feb 02 '23 at 18:52

answered Feb 02 '23 at 18:14


Thank you. I have been running the code for 18 hours now. I have almost 1.3m polygons. The polygon layer is 1.2GB and covers the extent of the UK. Will any of this change the speed of the union? – Melanie Baker Feb 03 '23 at 08:52
Do you have large complex multipolygons? Try converting them to single parts using ST_Dump. But first try my answer on a subset of your 1.3 million polygons to make sure it is working – BERA Feb 03 '23 at 09:05
No, I did a multipart to single part conversion in QGIS before I loaded the file into PostGIS/pgAdmin 4 – Melanie Baker Feb 03 '23 at 09:11
I don't know if it makes any difference, but you could try subdividing the geometries – BERA Feb 03 '23 at 09:14
This is a good approach if the data contains disjoint clumps of polygons. Probably won't help if all/many of the polygons are touching, though. – dr_jts Feb 03 '23 at 16:42
In addition to the previous comments, clustering the spatial index could also be tested - postgis.net/workshops/postgis-intro/clusterindex.html. – Brent Edwards Jun 09 '23 at 16:38
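
The subdividing and index-clustering ideas from the comments above might be tried along the following lines. This is a sketch under assumptions: the table and index names are hypothetical, 256 is simply the default vertex limit of ST_Subdivide, and whether either step shortens the union depends on the data.

```sql
-- Sketch only: test.table123_subdivided and its index name are hypothetical.

-- Split each polygon into pieces of at most 256 vertices.
-- ST_Subdivide returns one row per piece.
CREATE TABLE test.table123_subdivided AS
SELECT ST_Subdivide(geom, 256) AS geom
FROM test.table123;

-- The new table needs its own spatial index.
CREATE INDEX table123_subdivided_geom_idx
    ON test.table123_subdivided USING GIST (geom);

-- Physically reorder the rows to follow the spatial index, so that
-- neighbouring geometries tend to be stored close together on disk.
CLUSTER test.table123_subdivided USING table123_subdivided_geom_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE test.table123_subdivided;
```

The clustering query from the answer can then be pointed at test.table123_subdivided instead of test.table123.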
