I have a raster image and I want to build a new vector layer containing the boundary of the raster, using Python. Below is a sample of the image.
The problem is that the raster's nodata value is None, while the white pixels that make up the empty space are coded as 255. As a result, the dataset mask does not flag the whitespace as nodata.
To confirm, I checked with rasterio and found the following:
import rasterio
src = rasterio.open('../image.tif', mode='r')
data_mask = src.read_masks(1)
data = src.read(1)
data looks like this, showing that the upper-left region of the image is white and empty, with the value 255:
array([[255, 255, 255, ..., 254, 255, 255],
[255, 255, 255, ..., 254, 255, 255],
[255, 255, 255, ..., 254, 255, 255],
...,
[255, 255, 255, ..., 248, 252, 248],
[254, 254, 254, ..., 255, 235, 253],
[255, 255, 255, ..., 255, 255, 245]], dtype=uint8)
And data_mask looks like this, which says that the white in the upper left of the image is valid. These values should read as 0, but instead they are 255, meaning they are valid:
array([[255, 255, 255, ..., 255, 255, 255],
[255, 255, 255, ..., 255, 255, 255],
[255, 255, 255, ..., 255, 255, 255],
...,
[255, 255, 255, ..., 255, 255, 255],
[255, 255, 255, ..., 255, 255, 255],
[255, 255, 255, ..., 255, 255, 255]], dtype=uint8)
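For illustration, here is a minimal NumPy sketch of the mask I would expect instead: 0 where a pixel is white and 255 elsewhere. The helper name `white_to_mask`, its `threshold` parameter, and the tiny sample array are assumptions made up for this example, not part of rasterio or GDAL.

```python
import numpy as np

def white_to_mask(band, threshold=255):
    """Return a GDAL-style mask: 0 where band >= threshold (treated as empty), 255 elsewhere."""
    return np.where(band >= threshold, 0, 255).astype(np.uint8)

# Tiny stand-in for `data` (hypothetical values, not the real image)
data = np.array([[255, 255, 254],
                 [255, 248, 252],
                 [254, 255, 245]], dtype=np.uint8)

mask = white_to_mask(data)             # only exact 255 counts as empty
loose_mask = white_to_mask(data, 250)  # near-white values also count as empty
print(mask)
print(loose_mask)
```

Since the real array contains values such as 254 in what looks like the white area, an exact match on 255 may leave speckles; the lower threshold is one possible way to absorb them. A mask like this could then be polygonized, for example with `rasterio.features.shapes`.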
So to extract the boundary polygon I tried to use the code from another SE post.
gdal_translate -b mask -of vrt -a_nodata 0 test.tif test.vrt
gdal_translate -b 1 -of vrt -a_nodata 0 test.vrt test2.vrt
gdal_polygonize.py -q -8 test2.vrt -b 1 -f "ESRI Shapefile" testdata.shp
as well as the alternative code:
gdalwarp -dstnodata 0 -dstalpha -of GTiff foo1 foo2
gdal_polygonize.py foo2 -b 2 -f "ESRI Shapefile" foo3
But both of these approaches produce a boundary polygon that reflects the mask, which includes the whitespace.
So I need to know how to adjust this code to first mark the pixels that are 255,255,255 as nodata; after that I should be able to run the code above, right? The other issue is that the resulting polygon will probably have little holes in it, corresponding to areas of white inside the image.
Is there a good way to filter those out when creating the polygon?
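One possible way to handle the holes, sketched here in plain Python under the assumption that the polygon is available as a GeoJSON-like dict (the form `rasterio.features.shapes` yields): measure each interior ring with the shoelace formula and drop those below an area threshold. The function names, the `min_area` value, and the sample polygon are hypothetical.

```python
def ring_area(ring):
    """Absolute area of a closed ring [(x, y), ...] via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def drop_small_holes(geometry, min_area):
    """Return a copy of a GeoJSON-like Polygon without holes smaller than min_area."""
    exterior, *holes = geometry["coordinates"]
    kept = [hole for hole in holes if ring_area(hole) >= min_area]
    return {"type": "Polygon", "coordinates": [exterior, *kept]}

# Hypothetical polygon: a 10x10 square with a 1x1 hole and a 4x4 hole
poly = {
    "type": "Polygon",
    "coordinates": [
        [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
        [(1, 1), (1, 2), (2, 2), (2, 1), (1, 1)],
        [(4, 4), (4, 8), (8, 8), (8, 4), (4, 4)],
    ],
}

cleaned = drop_small_holes(poly, min_area=2.0)
print(len(cleaned["coordinates"]) - 1)  # number of holes kept
```

Areas are in the units of the polygon's coordinates, so the threshold has to be chosen accordingly. Keeping only the exterior ring (the first element of `coordinates`) would remove every hole regardless of size.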

[...] A==255 to write out a mask as its own raster, then polygonize that – mikewatt Jul 17 '19 at 17:49
[...] gdal_calc before, but just looked at the documentation. So would I do something like gdal_calc -A image.tif --calc="A==255" or something like that? – krishnab Jul 17 '19 at 17:55
[...] --outfile=<filename> of course – mikewatt Jul 17 '19 at 18:10
[...] A==255 will that mark the image region as the new valid region, or will it set the white region as the valid region? I am running the code now, so we will see shortly, but just wanted to understand how the equation acts on the output raster. – krishnab Jul 17 '19 at 19:40
[...] True for pixels that satisfy that expression, and when converting to integer True is equivalent to 1. So white areas will be 1 and anything else will be 0. Obviously != will flip that if necessary – mikewatt Jul 17 '19 at 22:00
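The point made in the last comment can be checked with a small NumPy sketch, on the understanding that gdal_calc evaluates its --calc expression on NumPy arrays. The array `A` below is made up for illustration and only mimics a single band.

```python
import numpy as np

# Hypothetical 2x2 band: left column is white (255), right column is image content
A = np.array([[255, 254],
              [255, 10]], dtype=np.uint8)

white = (A == 255).astype(np.uint8)      # True -> 1 where the pixel is white
not_white = (A != 255).astype(np.uint8)  # flipped: 1 where the pixel is not white

print(white)
print(not_white)
```

So `A==255` marks the white region with 1, and `A!=255` marks everything else, which is probably the version wanted when the goal is a polygon around the image content.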