Fairly new to geoprocessing in general and within Python.
I am using Python geoprocessing packages to extract meteorological information from a raster file within a watershed shapefile (the blue outline in the below image).
I followed along with the example here to determine the average raster value within the watershed successfully.
Now I want to adapt that script to extract the average raster value within the elevation band area (EBA) (the brown outlined area within the image). The format of the EBA is a multipolygon. In adapting the same script, I'm only getting a few points within the EBA. I think this might be because the resolution of the raster is too low, and so only a few of the raster squares (or their centroids?) actually lie within the EBA. Maybe the rest are getting cut out because the raster resolution is so much larger than the multipolygons.
I tried adding some lines of code to re-sample the raster at up to 100x greater resolution (which was extremely slow). This unfortunately did not improve the result.
My script is shown below.
import numpy as np
import rasterio
from rasterio import Affine # or from affine import Affine
from rasterio.mask import mask
from rasterio.enums import Resampling
import geopandas as gpd
import shapely; shapely.speedups.disable()
from shapely.geometry import Point
import os, os.path
rasterpath='C:/.../'
ebapath='C:/.../'
shapefile = gpd.read_file(str(ebapath+"EBA_ID_2.shp"), crs="EPSG:3005")
#Transform to WGS84 to match coordinate system of raster data
shapefile=shapefile.to_crs(epsg=4326)
extract the geometry in GeoJSON format
geoms = shapefile.geometry.values # list of shapely geometries
geometry = geoms[0] # shapely geometry
transform to GeJSON format
from shapely.geometry import mapping
geoms = [mapping(geoms[0])]
upscale_factor=100
filename='metfile.asc'
extract the raster values values within the polygon
Coordinate system for the raster data is WGS 84 is EPSG:4326
with rasterio.open(str(rasterpath+filename), crs="EPSG:4326") as src:
data = src.read(
out_shape=(
src.count,
int(src.height * upscale_factor),
int(src.width * upscale_factor)
),
resampling=Resampling.bilinear)
scale image transform
transform = src.transform * src.transform.scale(
(src.width / data.shape[-1]),
(src.height / data.shape[-2])
)
out_image, out_transform = mask(src, geoms, crop=True)
no data values of the original raster
no_data=src.nodata
extract the values of the masked array
data = np.array(out_image.data)[0,:,:]
extract the row, columns of the valid values
row, col = np.where(data != no_data)
ppt = np.extract(data != no_data, data)
T1 = out_transform * Affine.translation(0.5, 0.5) #0.5, 0.5 # reference the pixel centre
rc2xy = lambda r, c: (c, r) * T1
d = gpd.GeoDataFrame({'col':col,'row':row,'ppt':ppt})
coordinate transformation
d['x'] = d.apply(lambda row: rc2xy(row.row,row.col)[0], axis=1)
d['y'] = d.apply(lambda row: rc2xy(row.row,row.col)[1], axis=1)
geometry
d['geometry'] =d.apply(lambda row: Point(row['x'], row['y']), axis=1)
d.plot(column='ppt')
d.plot should return a plot with several points located within the EBA area. Instead, I get the following:
Any suggestions??? I will be doing this for 40 elevation band areas so need to automate in Python...


