Growing several concave hulls until they completely fill larger area without overlap using QGIS?

Question

This is my problem: I have a very large point layer with polling station locations across one country; each point belongs to a specific constituency (900k points, 4k constituencies). I wish to use this data to create polygon shapefiles of constituencies, as illustrated in the following image (no overlap between constituencies, and no blank space either). On the left, you see three point clouds (points that represent the green, blue and orange constituencies). You also see the country boundary, three blue points out of place, and one orange point out of place (raw data are not always correct, but the dataset is too large to manually correct such things). On the right hand, you see the expected result.

from point clouds to adjacent polygons

So far, I have explored two methods. The first is to voronoi all points, subsequently dissolve those with the same constituency variable into one polygon, and cut everything away that is outside country boundaries. Two problems: one, the orange outlier. Second, the voronois do not necessarily extend until the country boundary line, leaving blank space. This is roughly the option chosen in reply to this question: Converting point sets to polygon boundaries?, but it still leaves me with the problem of outliers and blank space at some of the edges.

The second method was to construct concave hulls (convex ones would overlap). The problem here is again both the orange outlier and even more blank space between the various constituencies. The latter could be solved through some sort of growing algorithm, I suppose - but have no idea of how those work.

Overall, my intuition is that the problem could be resolved through yet another strategy, namely some sort of clustering algorithm, which calculates the most likely constituency for every coordinate in the country, on the condition that each constituency needs to be one single area. This would both eliminate the orange outlier and grow each concave hull to maximum extent. I have just no idea how to implement something like this.

I would be interested in any automated, open source solution (or strategy) to my problem, be it in QGIS itself, via scripting (python, SQL in PostGIS), custom software.

I split my point data into one file per constituency, and transformed each into a heatmap. I then overlayed these heatmaps in QGIS, with the intention of using Raster Calculator to create a unified heatmap reflecting constituency codes, later to be re-vectorized. So far, all looks promising - until I use Raster Calculator, which only calculates values for overlaps of the original heatmaps - presumably because of NULL values.

Any idea how to get around this?

The image illustrates the problem. Dark green and red were integrated in raster calculator with "(green@1 >= red@1) * 1 + (red@1 > green@1) * 2" - dark grey is 2 and light grey is 1. The boundary on the left was drawn OK-ish, and more importantly, the single green point to the right was correctly subsumed in the dark grey (=red) constituency.

But what about all the areas where either green@1 or red@1 are NULL?

enter image description here

This can be solved through gdal_calculate.py which does not suffer from the NULL value problem. The final and additional question is whether anybody knows how to automate the heatmap plugin in QGIS (or whether there is a gdal-based equivalent function)?

I would hate to do this manually for 4k constituencies... Since this touches on python and qgis qnd the like, the question is separate: Can one batch-automate the creation of QGIS heatmaps?

radouxju · Accepted Answer · 2014-03-09T19:31:20.443

The use of Thiessen (= Voronoï) came first to my mind when I read your title. I think that you go a few step further to remove your inclusions and clip to the desired extent once you have dissolved your polygons.

But your need to handle outliers makes me think of a classification problem, so most classification algorithms could help you to draw the boundaries. Have a look at the SVM classifier (with soft margins and radial basis function kernel) where you would consider X and Y as the input feature. More info on libSVM (more details would be out of scope, have a look in cross validated if you have problems to implement).

For a more "GIS like" method, you could assign your zones based on the highest point density. You can use the QGIS heatmap plugin to compute the density for each "color" in your example (note that the key is the selection of the radius). Then you look for the color with the largest density with QGIS raster calculator. (by the way, this would be a kind of maximum likelihood classifier)

Growing several concave hulls until they completely fill larger area without overlap using QGIS?

1 Answers1