Context:
I have a geopandas dataframe, with two columns
'geometry': full ofshapely.geometry.point.Point.'h_mean': gathering the height of a surface at those points
I want to create a 2D spatial grid which spatial footprint englobes the points, and where the lower left and upper right corners are the min and max points in the 'geometry' column. Then, I want to populate in this grid (or array) the 'h_mean' values of each point based on its coordinates.
Problem: How to code efficiently the grid generation and the population of the grid ?
How I do it:
My code is divided in 3 parts:
- Generating the geopandas dataframe (just for the example, it is different in my real code)
- Generating the grid (by using a Haversine function in order to calculate the shape of my grid)
- Populating the grid (reusing the grid's shape to populate rather than
xarray's.sel()function that populates the grid by comparing to the closest coordinates.
Code:
import random
import geopandas as gpd
from shapely.geometry import Point
import numpy as np
----- GENERATE GEOPANDAS DATAFRAME -----
Lower left and upper right coordinates of the AOI
x_min = -140.51050737448583
x_max = -140.2348344860644
y_min = 59.7891540534345
y_max = 60.5585355551882
Resolution of 30m
res = 30
Number of random points to generate
num_random_points = 20 # Adjust as needed
Lists to store generated heights and points
heights = []
points = []
Generate random heights and points within the specified range
for _ in range(num_random_points):
# Generate a random height between 0 and 1000 (adjust as needed)
height = random.uniform(0, 1000)
heights.append(height)
# Generate random coordinates within the specified range
x = random.uniform(x_min, x_max)
y = random.uniform(y_min, y_max)
# Create a Shapely Point object
point = Point(x, y)
points.append(point)
Include the minimum and maximum points
min_point = Point(x_min, y_min)
max_point = Point(x_max, y_max)
Add the minimum and maximum points to the lists
points.extend([min_point, max_point])
heights.extend([0, 0]) # Assign heights of 0 to the min and max points
Create a GeoDataFrame from the generated data
data = {'geometry': points, 'h_mean': heights}
gdf = gpd.GeoDataFrame(data, crs='EPSG:4326')
Print the generated GeoDataFrame
print(gdf)
----- GENERATE GRID -----
import math
Function to calculate the distance between two points
def haversine_distance(lon1, lon2):
# Convert longitude from degrees to radians
lon1 = math.radians(lon1)
lon2 = math.radians(lon2)
# Radius of the Earth in meters
earth_radius = 6371000 # Approximately 6,371 km
# Haversine formula
dlon = lon2 - lon1
a = math.sin(dlon / 2)**2
c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
distance = earth_radius * c
return distance
Calculate the amount of cells of 30m spanning the x and y dimensions
dx = int(np.ceil(haversine_distance(x_max, x_min)/res))
dy = int(np.ceil(haversine_distance(y_max, y_min)/res))
Create an array with dimensions dx by dy
grid_values = np.full((dx, dy), np.nan)
Calculate the x and y steps between each 30m cell
step_x = np.abs((x_max - x_min))/dx
step_y = np.abs((y_max - y_min))/dy
Create the x and y coordinates arrays
x_coords = np.array([x_min + i * step_x for i in range(dx)])
y_coords = np.array([y_min + i * step_y for i in range(dy)])
----- POPULATE GRID -----
Iterate through points and populate the grid
for point, height in zip(points, heights):
x_idx = int((point.x - x_min) / step_x)
y_idx = int((point.y - y_min) / step_y)
# Check if the indices are within the grid dimensions
if 0 <= x_idx < dx and 0 <= y_idx < dy:
grid_values[x_idx, y_idx] = height