
I have a satellite image that I want to convert into a NumPy array (and then into a pandas DataFrame). I already know how to do that, but the conversion does not preserve the coordinate data.

This is how I do it now:

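import pandas as pd
import rasterio

# read all bands into an array of shape (bands, rows, cols)
src = rasterio.open('14072020.tif')
array = src.read()

# one row per pixel, one column per band (13 bands)
pd.DataFrame(array.reshape([13,-1]).T)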

I have also tried to follow this answer ("Keeping the coordinate system of raster files in the resulting raster file after operation with numpy"), but I always get this error:

NameError: name 'gdal_array' is not defined

so I cannot even open the image.

My end goal is a pandas table that contains the band values together with the coordinates of each pixel.

PolyGeo
ReutKeller

2 Answers


You can use the open_rasterio and to_dataframe methods to accomplish that.

import rioxarray

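rds = rioxarray.open_rasterio("file.tif")
# to_dataframe needs a named DataArray; "value" is just an illustrative column name
df = rds.to_dataframe(name="value")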

See also: https://gis.stackexchange.com/a/358057/144357

snowman2

I would read the image with GDAL and dump it to a NumPy array using ReadAsArray().

The coordinates and projection are read and written with GetGeoTransform(), SetGeoTransform(), GetProjection() and SetProjection(), all called on the dataset.

Example here: https://pcjericks.github.io/py-gdalogr-cookbook/raster_layers.html
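
To illustrate how the geotransform turns pixel positions into coordinates, here is a minimal sketch. It assumes you already have the array from ReadAsArray() in (bands, rows, cols) order and the six-value tuple from GetGeoTransform(). The function name bands_to_dataframe, the column names, and the demo values are made up for this example, and it uses a small synthetic array instead of a real file so it runs without GDAL.

```python
import numpy as np
import pandas as pd


def bands_to_dataframe(array, geotransform):
    """Flatten a (bands, rows, cols) array into one row per pixel,
    with x/y columns computed from a GDAL-style geotransform.

    Hypothetical helper for illustration, not part of GDAL.
    """
    bands, rows, cols = array.shape
    # Row and column index of every pixel, each with shape (rows, cols)
    row_idx, col_idx = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")

    # Shift by half a pixel so the coordinates refer to pixel centres
    col_c = col_idx + 0.5
    row_c = row_idx + 0.5

    # GDAL order: x origin, pixel width, row rotation,
    #             y origin, column rotation, pixel height (usually negative)
    x0, dx, rx, y0, ry, dy = geotransform
    x = x0 + col_c * dx + row_c * rx
    y = y0 + col_c * ry + row_c * dy

    data = {"x": x.ravel(), "y": y.ravel()}
    for b in range(bands):
        data[f"band_{b + 1}"] = array[b].ravel()
    return pd.DataFrame(data)


# Synthetic stand-ins for ds.ReadAsArray() and ds.GetGeoTransform()
demo_array = np.arange(2 * 2 * 3).reshape(2, 2, 3)   # 2 bands, 2 rows, 3 cols
demo_gt = (100.0, 10.0, 0.0, 200.0, 0.0, -10.0)      # north-up, 10-unit pixels

df = bands_to_dataframe(demo_array, demo_gt)
print(df)
```

With a real file, you would pass the results of ReadAsArray() and GetGeoTransform() instead of the demo values. A single-band raster comes back as a 2-D array, so it would need a leading axis added (for example array[np.newaxis]) before calling the helper.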