2

I am trying to understand SQL commands in PostGIS.

I am trying to perform an operation which would be completed with the use of the Difference Operator in QGIS but I have large numbers of polygons so I need to manage this in PostGIS instead.

Essentially, I have some circular polygons. Let's call those buffers and I want to cut out some areas from those buffers. Let's call the areas to be cut out "blanks". The output should be the buffers minus any areas where they overlap with the blanks.

I've looked around for some suitable code to re-use and have found the code below. This works, after a fashion, BUT it doesn't give me the buffers which do not intersect with any blanks, which I also need. The problem is, I don't understand how this is working.

Why does JOIN use the expression FROM buffers b JOIN blanks a i.e. why isn't it FROM buffers JOIN blanks?

Why do we need an ST_INTERSECTS as well an ST_DIFFERENCE?

Shouldn't ST_DIFFERENCE on its own be enough?

Why is it so complicated to create what is presumably a regularly used operation?

I am new to SQL and these may be obvious questions. I've worked my way through as many tutorials as I can but they all seem to be quite basic or too advanced for me to follow. The second things progress beyond simple expressions I get lost.

CREATE TABLE buffers_minus_blanks AS
WITH temp AS 
(
  SELECT   b.id, st_union(a.geom) AS geom
  FROM     buffers b JOIN blanks a ON st_intersects(a.geom, b.geom)
  GROUP BY b.id
)
SELECT b.buffer_id, st_difference(b.geom,t.geom) AS newgeom
FROM buffers b LEFT JOIN temp t ON b.id = t.id;
menes
  • 1,421
  • 2
  • 7
  • 24
Ddee
  • 345
  • 1
  • 6
  • This question is at the ragged edge of GIS relevancy, and probably more relevant in [dba.se]. Suffice it to say that any time you have a join, you need to limit it to only the logically matching features, lest it match to every feature, returning a product of the row counts of mostly trash. I suggest you spend some time with relational algebra theory to better understand why your code needs to be structured as it is. Working with subsets should help at first. – Vince May 18 '21 at 13:00
  • 1
    I'm sorry but I'm new to all this. How is asking how to do a difference operation between two vector datasets in PostGIS not a GIS question? Lots of other people ask PostGIS questions on here. I'm just a new starter who's asking for guidance. Thx – Ddee May 18 '21 at 13:43
  • 1
  • 1
    Hi there, yes thank you that does help. But I don't understand it. I'm trying to not just copy solutions but understand what I'm doing so I can learn to do this properly. I have no idea what coalesce does in the solution you identified (thank you) or why an intersect is still needed because there's already a difference. – Ddee May 18 '21 at 14:35
  • 4
    Your questions aren't about the ST_Difference, they're about the JOIN constraint, and structuring SQL queries to return the rows you need, and now about Coalesce(). If you want to understand spatial SQL, learning about SQL first is critical. – Vince May 18 '21 at 14:39
  • 1
    When you do a left join, you keep all rows from the "left" layer and only matching rows from the "right" layer. If there is no match, the values for the "right" columns are all null. If you do an st_intersection between a valid geometry and a null one, the output is null. By using coalesce, you instruct to keep either the result of the intersection, or, if it is null, the original geometry. – JGH May 18 '21 at 14:59
  • 1
    st_intersection is an expensive function. You don't want to run it if it is not needed. By applying a join condition on intersecting geometries, you are ensuring that the intersection is only computed when needed. (try it without, it will probably work... but it may takes days instead of seconds to complete) – JGH May 18 '21 at 15:00
  • 2
    @JGH ST_Intersection isn't used in this query, ST_Intersects is; the latter can leverage the spatial index, and isn't nearly as expensive as ST_Intersection. – Vince May 19 '21 at 02:32
  • Thank you for all the advice and explanations. I shall work my way through it trying to unpick it. I note the difference between ST_Intersection and ST_Intersects. I hadn't spotted that. – Ddee May 19 '21 at 07:43
  • yeah, typo between st_difference and st_intersection. The idea remains the same – JGH May 19 '21 at 13:01

1 Answers1

2

With the help of @JGH, here is the solution I have. This seems to work nicely. Thank you

SELECT a.field_i_need, ST_Multi(COALESCE(ST_Difference(a.geom, output.geom),a.geom)) AS geom
FROM buffers AS a
CROSS JOIN LATERAL (
  SELECT ST_Union(b.geom) AS geom
  FROM blanks AS b
  WHERE ST_Intersects(a.geom, b.geom) 
) AS output ;
Ddee
  • 345
  • 1
  • 6