Extracting raster outline to a vector geometry

Question

I have a raster data set with an irregular outline (every value outside is NaN) in Python (loaded from rasterio v.1.3a3 ):

I'd like to extract its outline to a vector shape, such as a Shapely geometry or a WKT/WKB string (red in the figure below) so that, if I use that shape to clip the larger original region which is entirely containing the raster here above, I will end up on the exact same raster data set.

Is there a smarter and more direct way than extracting all the indices of the NaN values to a numpy array (but, hmm, this would be a bunch of useless data as only the ones 'at the border of where the values are' are of interest), figuring a way to apply the affine transform from the raster metadata to that numpy array (how? I can't find a natural way to do so here: https://rasterio.readthedocs.io/en/latest/api/rasterio.transform.html), and finally converting this to a shape so that's it's correctly georeferenced?

I was also thinking using OpenCV but this would probably be overkill...

@NathanThomas : would it be possible to stay inside a Python environment? The raster only exists as a numpy array, if I can avoid file system I/O it could be nice. — swiss_knight, Apr 25 '22 at 17:44
There are GDAL python bindings so you should technically be able to run the same commands but via the api: https://gdal.org/python/ — GeoMonkey, Apr 25 '22 at 19:50

user2856 · Answer 1 · 2023-02-13T05:50:29.697

2

You can use rasterio.features.dataset_features

import rasterio as rio
from rasterio.features import dataset_features
import geopandas as gpd
with rio.open(raster_path) as ds:
    shapes = list(dataset_features(ds, bidx=1, as_mask=True, geographic=False, band=False))
for shape in shapes:
    print(shape['geometry'])


#Or as a GeoDataFrame
with rio.open(raster_path) as ds:
    gdf = gpd.GeoDataFrame.from_features(dataset_features(ds, bidx=1, as_mask=True, geographic=False, band=False))
    gdf.to_file(output_path)

edited Feb 13 '23 at 05:50

answered Apr 25 '22 at 07:16

user2856

65,736
6
115
196

Thanks for you answer, unfortunately, the vector.geojson file that was generated this way is not recognized as valid when I try to load it into QGIS for example. Would it be possible to keep it in memory, e.g. as a Shapely geometry or a WKT string, or a dict (JSON-like structure) in order to, see what's happening within the Python console itself, prior to saving it to the disk, or better, to a PostGIS database? – swiss_knight Apr 25 '22 at 17:34
1

Yes, just access the shapes. Each of the shape elements is a GeoJSON feature. – user2856 Apr 25 '22 at 21:45
Did this solve your question, @swiss_knight? If so, please mark it as solution :) – bugmenot123 Feb 12 '23 at 09:16
No, actually, it does not really answer the question. First because of what I said in a previous comment: "the raster only exists as a numpy array, if I can avoid file system I/O it could be nice." and then because the output is not a vector object but a for loop on a list that prints some dictionaries whereas I really need a vector object. I ended up using a solution based on GeoPandas that works very well. If I find the time to post it here... I will. – swiss_knight Feb 12 '23 at 15:27
Actually, it does answer your question as written. You then commented that you wanted to keep the vector objects in memory, so I edited my answer to show you how to do that, i.e accessing the features/shapes. If you want a geodataframe, just put the features into one, see edit. Nowhere in your question does it specify the raster "only exists as a numpy array". In fact you actually say that "I have a raster data set" which is not a numpy array. Comments are not for adding additional information to your question. – user2856 Feb 13 '23 at 04:45

Extracting raster outline to a vector geometry

1 Answers1