8

I'm trying to create a seamless vector dataset from an integer raster. "Seamless" as in: not gaps between features and no overlapping features.

The tools gdal_polygonize, pkpolygonize, terra::as.polygons (in R) all do pretty much exactly that, but they have a major flaw: They trace the raster cells precisely, resulting in overly complex features as soon as they include diagonals, curves etc.

Original raster (middle), vector data with gdal_polygonize

Cleaning these features in a post processing step (eg. ogr2ogr -simplify leads to topological errors (overlaps and gaps).

I'm very surprised that solving this turns out to be so hard. I imagine that this is a common problem in GIS workflows, e.g. the CORINE Landcover dataset is based on satellite imagery, but they also produce a vector data product with the attributes I desire: no gaps, no overlaps and simplified. How do they generate this vector data? Regrettably, I could not find information on this.

This answer Extract polygons from an image talks about using scikit-image for this. Would this be the way to go?

I'm looking for ways to solve this using python, R or the command line. The output should be something gdal compatible, preferably .gpkg or .shp.

The data I'm using can be downloaded here (4kb):

Ratnanil
  • 983
  • 8
  • 20
  • https://gis.stackexchange.com/questions/423272/smoothing-out-lines-created-by-raster-layers/423284#423284 – BERA Feb 16 '22 at 10:25
  • @BERA: These are are all feature-by-feature approaches where simplification results in overlapping polygons and gaps between polygons. I'm looking for somethings that guarantees that the resulting data set does not have gaps or overlaps, but is still simple. – Ratnanil Feb 16 '22 at 11:49
  • You can fix the gaps. For example create a big rectangular poygon covering your entire data set. Difference this and your converted to get small polygons where there are holes. Select these and Eliminate them – BERA Feb 16 '22 at 11:54
  • 1
    you need a simplification algorithm that keeps topology intact, try mapshaper – bugmenot123 Feb 17 '22 at 07:36
  • There must be a way to create seamless polygons from the get-go. On twitter, @mdsumner suggested using gdal_contour, but this approach is for continuous variables and cannot handle skipping classes. My data is huge and I really don't want to create topological errors in the first place. – Ratnanil Feb 17 '22 at 08:38
  • @bugmenot123: Thanks for the tip: I will try this and report how it goes. – Ratnanil Feb 17 '22 at 08:45
  • Being quite familiar with the production process of the corine landcover dataset, it is unfortunately not as easy as you think, and there is no one-liner solution ;) – Jascha Muller Mar 11 '22 at 15:10
  • oh thats a bummer.. I've scanned through the CLC Documentation and did not find specifics about their approach. Mind sharing your knowledge? – Ratnanil Mar 14 '22 at 06:37

1 Answers1

1

There is no simple answer to your question. Usually simplification and generalization of data makes the most sense. You can use positive and negative buffer, data union/intersection etc, depending on the needs.

There are several approaches to simplify the data, a good overview you can find in Generalization of spatial data: Principles and selected algorithms

A very simple approach could be:

r <- terra::rast("/home/sapi/Downloads/sample.tif")
terra::plot(r)

Let's take only one layer:

p <- terra::as.polygons(r)
terra::plot(p[2])


p[2] |>
  terra::buffer(width = 3) |>
  terra::simplifyGeom(tolerance = 1) |>
  terra::union() |>
  terra::buffer(width = -3) |>  
  terra::fillHoles() |>
  terra::plot()

But in practice, it depends what do you expect from your output data. What will be purpose on it? What's the desired scale? Shall be the less important features removed prior to generalization? Which one are the "less important"?

Let's remove the polygons with small area first:

p[2] |>
  sf::st_as_sf() |>
  sf::st_cast(to = "POLYGON") |>
  dplyr::mutate(area = as.numeric(sf::st_area(geometry))) |>
  subset(area >= 100) |>
  sf::st_buffer(dist = 3) |>
  sf::st_simplify() |>
  sf::st_buffer(dist = -3) |>
  subset(select = "area") |>
  terra::vect() |>
  terra::fillHoles() |>
  terra::plot()

Created on 2024-01-04 with reprex v2.0.2

There is smoothr package for R as well.

Created on 2024-01-04 with reprex v2.0.2