0

I have a polygon and filelist of data I need to convert my polygon to raster and mask my filelist with this mask and campute the statistics in the area in polygon. I'm using GDAL Python.

I can do it using R like this :

maskraster <- rasterize(polygon, raster, mask = TRUE) 
#calcul mean of area in polygons 
Rasternew <- cellStats(maskraster,stat=mean)

How can I do the same using GDAL in Python ?

Vince
  • 20,017
  • 15
  • 45
  • 64

1 Answers1

1

I have a script which allows me to clip a raster to the polygon of a shapefile, and then compute statistics on it (including, as in this case) the mean. This example includes the modules pillow (https://pypi.org/project/Pillow/) and shapefile (https://pypi.org/project/pyshp/) on Python as well.

from osgeo import gdal, osr
import shapefile
from PIL import Image, ImageDraw

def world_to_image(lat_value,lon_value,dataset):
    '''
    This function is required for converting from lat/lon values to the corresponding values in the image coordinates.
    '''
    metadata = dataset.GetGeoTransform()
    # metadata is an array with six values: ulx, xres, xskew, uly, yskew, yres

    # The source projection (lat, long coords)
    source = osr.SpatialReference()
    source.ImportFromEPSG(4326)

    # The target projection (projection of the image)
    target = osr.SpatialReference()
    target.ImportFromWkt(dataset.GetProjection())

    # Create the transform - this can be used repeatedly
    transform = osr.CoordinateTransformation(source,target)
    new_point = transform.TransformPoint(long_value,lat_value)

    x_pixel = int( round((new_point[0] - metadata[0]) / metadata[1],0)) #matrix[:,*value*]
    y_pixel = int( round((new_point[1] - metadata[3]) / metadata[5],0)) #matrix[*value*,:]
    return y_pixel,x_pixel

dataset = gdal.Open("path/to/file")
band = dataset.GetRasterBand("band number")
band = band.ReadAsArray()
band = band.astype(float) #usually ensures no weirdness later when doing maths steps

sf = shapefile.Reader("path/to/shapefile")
minX, minY, maxX, maxY = sf.bbox #Get corner coordinates of shapefile

ulX,ulY = world_to_image(maxY,minX,dataset) #Convert shapefile corners to image 
lrX,lrY = world_to_image(minY,maxX,dataset)

pxwidth = int(lrX-ulX) #number of pixels for width of image
pxheight = int(lrY-ulY) #number of pixels for height of image
clip = band[ulY:lrY,ulX:lrX] #clipping original image to shapefile bounds

pixels = []
for p in sf.shape(0).points:
    temp = world_to_image(p[1],p[0],dataset)
    pixels.append((temp[1],temp[0])) #convert the points of the polygon to image coordinates 

rasterpoly = Image.new('L',(band.shape),1) #set up new image in the same shape as clipped raster image
rasterize = ImageDraw.Draw(rasterpoly) 
rasterize.polygon(pixels,0) #cuts empty image to exact shape of shapefile
mask = np.array(rasterpoly) #converts it to numpy array

'''
function which cuts the original raster image to the shape of the shapefile, 
leaving nodata points elsewhere.
'''
clip = gdal_array.numpy.choose(mask, (band, np.nan)).astype(gdal_array.numpy.float)

Apologies for the rather lengthy code snippet - it is cut from a longer body of code but I have tried to cut out everything unnecessary. This code should produce a 2D numpy array which you can then perform your statistical functions on. Apologies if this is not exactly the sort of thing you were looking for - it was quite hard to know exactly what you wanted from your post (as someone not familiar with R and therfore your example).

JackLidge
  • 338
  • 1
  • 2
  • 8
  • that what i need calcul cellstat that is my new question https://gis.stackexchange.com/questions/308733/calcul-calcstats-with-gdal-python-or-rasterio-for-each-bands – loula melyacou Jan 15 '19 at 18:06
  • as far as I can see that is a completely different question... If my answer above is correct please can you accept the answer :) I will have a look at the other question when I can. – JackLidge Jan 16 '19 at 10:30
  • that is good thank you and i have some questions cause i don't undersand why you are doing # The target projection (UTM zone WGS 84) , and your exemple is good for one raster with one band but my how i can do that for all 10 bands in my raster ? – loula melyacou Jan 16 '19 at 11:36
  • Ahh, that isn't necessarily true that comment, in truth that line of code will convert to whatever coordinate system is used by the input image, I will edit my answer for clarity. I haven't tried with multiple bands, but you could try putting the .GetRasterBand command into a for loop and loading them all into one numpy array. I don't know how this would affect the code below though. – JackLidge Jan 16 '19 at 11:57
  • in this line clip = gdal_array.numpy.choose(mask, (band, np.nan)).astype(gdal_array.numpy.float) this fonction is for clipping image with mask without nan value ( 0 value ) ? – loula melyacou Jan 16 '19 at 15:18
  • if i undersand your code will be reroject the shapfile with same project of raster and creat rastersize and mask and clip raster after with the mask without value nan ! and how can i calcul statistic ( mean of value without nan and 0 value ) – loula melyacou Jan 16 '19 at 15:37
  • i need to install shapfile with anaconda ? – loula melyacou Jan 16 '19 at 15:39
  • i m install fonction pyshp with anaconda but i have this error import shapefile no module named shapfile – loula melyacou Jan 16 '19 at 23:24
  • so the queried line is to clip the image to the shapefile projection, setting all values outside the shapefile area to be np.nan (this will not affect your values instead the shapefile area). Your should be able to do something like np.nanmean to get the mean of the band ignoring nan values. – JackLidge Jan 17 '19 at 10:13
  • I would suggest installing shapefile using "pip install shapefile" as I don't think it has a conda install set up yet. It's not the best practise to install programs to run on anaconda with pip, but it doesn't appear to have broken anything in my install. You'll need to install shapefile from the command line – JackLidge Jan 17 '19 at 10:15
  • i'm installing ppyshp with pip and that good – loula melyacou Jan 17 '19 at 12:17
  • sf = shapefile.Reader("path/to/shapefile") minX, minY, maxX, maxY = sf.bbox #Get corner coordinates of shapefileUnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128) with this line i had error xUnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128) – loula melyacou Jan 18 '19 at 10:34
  • I'm not sure what that problem might be caused by, probably something wrong with the shapefile you are using. Perhaps might be better served by starting a new question to address that error? – JackLidge Jan 18 '19 at 11:01
  • i can't undersant the script realy if u have any exemple with raster and shapfile or other script to create rastersize and mask and clip raster ?! thank u – loula melyacou Jan 18 '19 at 11:25