Calculating zonal statistics in R with the extract function in the raster package, I found unexpected behaviour. I have a raster layer of rather large cell size and a polygon layer of relatively small, quadratic polygons. As an example, I want to use these nine polygons 19 - 27 with the raster plotted as background:
extract(raster, polygon, weights=TRUE, normalizeWeights=FALSE)
yields the following result:
[[19]]
value weight
[1,] -102.39999 0.06
[2,] -92.79999 0.03
[[20]]
value weight
-92.79999 0.06
[[21]]
value weight
-92.79999 0.09
[[22]]
value weight
[1,] -102.39999 0.04
[2,] -92.79999 0.02
[[23]]
value weight
-92.79999 0.04
[[24]]
value weight
-92.79999 0.06
[[25]]
value weight
[1,] -88 0.06
[2,] -86 0.03
[[26]]
value weight
-86 0.06
[[27]]
value weight
-86 0.09
Apparently, the function collects the values of the raster cells overlapping with each polygon and assigns each value a weight corresponding to the area covered by the respective coverage within a polygon. If that worked perfectly, I'd be absolutely satisfied. However, there are some oddities:
- Why is weight different for polygons 20 and 21, 26 and 27?
- Why are weights ≠ 1 when a polygon falls completely within a raster cell?
- Why does polygon 22 only get 2 values instead of 4 (and polygon 23 and 24 1 value each instead of 2)?
The help page states that weights "returns, for each polygon, a matrix with the cell values and the approximate fraction of each cell that is covered by the polygon(rounded to 1/100)". I suppose that weight rather uses a distance measure of the polygon to the raster cell centroid than an actual coverage fraction (?). When using normalizeWeigths=TRUE, this might become irrelevant. However, partly ignoring raster cells that fall within a polygon doesn't seem irrelevant at all.
Does anybody understand why this happens and how to solve this problem?

I interpolated the input raster (bicubic interpolation) so to obtain a reasonable cell-size.
– yenats Oct 27 '17 at 15:09extractthen was able to calculate correct mean values for each polygon. However - I'd really like to understand why the function is not able to deal with large raster cell sizes correctly.exactextractrpackage for these operations. It returns a precise intersection between the polygon and each given cell and returns the fraction of the cell covered by the polygon. Since it is written in C++ it also has the advantage of being quite a bit faster thatraster::extract. You may want to also look at the beta version ofterrathat just came out, it is intended as a replacement for therasterpackage and is written in C++ (same developer asraster). – Jeffrey Evans Apr 27 '20 at 22:48