This is my problem: I have a very large point layer with polling station locations across one country; each point belongs to a specific constituency (900k points, 4k constituencies). I wish to use this data to create polygon shapefiles of constituencies, as illustrated in the following image (no overlap between constituencies, and no blank space either). On the left, you see three point clouds (points that represent the green, blue and orange constituencies). You also see the country boundary, three blue points out of place, and one orange point out of place (raw data are not always correct, but the dataset is too large to manually correct such things). On the right hand, you see the expected result.

So far, I have explored two methods. The first is to voronoi all points, subsequently dissolve those with the same constituency variable into one polygon, and cut everything away that is outside country boundaries. Two problems: one, the orange outlier. Second, the voronois do not necessarily extend until the country boundary line, leaving blank space. This is roughly the option chosen in reply to this question: Converting point sets to polygon boundaries?, but it still leaves me with the problem of outliers and blank space at some of the edges.
The second method was to construct concave hulls (convex ones would overlap). The problem here is again both the orange outlier and even more blank space between the various constituencies. The latter could be solved through some sort of growing algorithm, I suppose - but have no idea of how those work.
Overall, my intuition is that the problem could be resolved through yet another strategy, namely some sort of clustering algorithm, which calculates the most likely constituency for every coordinate in the country, on the condition that each constituency needs to be one single area. This would both eliminate the orange outlier and grow each concave hull to maximum extent. I have just no idea how to implement something like this.
I would be interested in any automated, open source solution (or strategy) to my problem, be it in QGIS itself, via scripting (python, SQL in PostGIS), custom software.
I split my point data into one file per constituency, and transformed each into a heatmap. I then overlayed these heatmaps in QGIS, with the intention of using Raster Calculator to create a unified heatmap reflecting constituency codes, later to be re-vectorized. So far, all looks promising - until I use Raster Calculator, which only calculates values for overlaps of the original heatmaps - presumably because of NULL values.
Any idea how to get around this?
The image illustrates the problem. Dark green and red were integrated in raster calculator with "(green@1 >= red@1) * 1 + (red@1 > green@1) * 2" - dark grey is 2 and light grey is 1. The boundary on the left was drawn OK-ish, and more importantly, the single green point to the right was correctly subsumed in the dark grey (=red) constituency.
But what about all the areas where either green@1 or red@1 are NULL?

This can be solved through gdal_calculate.py which does not suffer from the NULL value problem. The final and additional question is whether anybody knows how to automate the heatmap plugin in QGIS (or whether there is a gdal-based equivalent function)?
I would hate to do this manually for 4k constituencies... Since this touches on python and qgis qnd the like, the question is separate: Can one batch-automate the creation of QGIS heatmaps?