"Copy" Image coordinates to another image that is nd.array

Question

I have an image with three bands. My script predicts the pixel values of the first band from the other two bands and generates an image at the end. The generated image starts out as a pandas table, which is converted to a numpy.ndarray, then displayed and saved as a TIFF image using imageio.

My problem is that I lose the coordinates during this processing, so the result image needs to be georeferenced. During the process I use reshape to get one table of all the pixels and their 3 values, and I believe this is where I lose the coordinates. I don't know how to keep them: should they become a column in the new table, or are they stored in some other way?

My script:

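import pandas as pd
import matplotlib.pyplot as plt
import rasterio
from rasterio.plot import show
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# open the raster I downloaded before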
img=rasterio.open("img_new.tif")
show(img,0)
# create a pandas df where each pixel is a row and the columns are the bands
# this is probably where I'm losing the coordinates
df_all=pd.DataFrame(array.reshape([3,-1]).T)
df_all
#use random forest regressor to predict the first band by bands 2,3:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
rf=RandomForestRegressor()
y_train=y_train.values.ravel()
rf.fit(X_train,y_train)
rf_pred=rf.predict(X_test)
rf_pred
# apply the prediction to all the data
pred_all=rf.predict(data)
#.....creating new df table with the prediction value for all....
df_join=df_all.merge(df,how='left',left_on='index', right_on='index')
#convert back to image that doesn't have the coordinates:
#convert to numpy
rf_array=df_join['Prediction'].values
rf_array
#reshape
rf_array=rf_array.reshape(869,1202)
plt.imshow(rf_array)

I thought that maybe if I could tell Python to pick some pixels from the first image, save their values, and somehow paste them later into the generated result image, it could work.

My end goal: to "copy" the coordinates from the first image to the result image, so that the two images overlap when I open them in QGIS.

Edit: just to clarify: the result image is constructed from a numpy.ndarray that I save as an image with imageio.

Edit2: I have found img.transform and could get it from the original image:

img.transform
>>>Affine(10.0, 0.0, 208810.0,
       0.0, -10.0, 7583530.0)

but now I don't know how to paste these coordinates onto the result image.

Edit3: definition of X, y, df, array:

#Definition of df
# df is the pandas dataframe, constructed from the original tiff:
img=rasterio.open("image_original.tif")
#array
#shape
array=img.read()
#create pandas df
df=pd.DataFrame(array.reshape([3,-1]).T)
df.columns=['band1','band2','band3']
df=df.reset_index()
df
# define X and y; y holds the values to predict (I wanted to predict y from the columns in X with Random Forest)
X = df.iloc[:, 2:]
y = df.iloc[:,1:2]

Kadir Şahbaz · Accepted Answer · 2020-06-06T16:00:19.203

11

Save the result array (rf_array) as in the following lines:

rf_array = df_join['Prediction'].values # returns numpy.ndarray
rf_array = rf_array.reshape(img.shape[0], img.shape[1])
# rf_array = rf_array.reshape(869, 1202)

with rasterio.open('path/to/new.tif', 
                   'w',
                   driver='GTiff',
                   height=rf_array.shape[0],
                   width=rf_array.shape[1],
                   count=1,
                   dtype=rf_array.dtype,
                   crs=img.crs,
                   nodata=None, # change if data has nodata value
                   transform=img.transform) as new_file:

    new_file.write(rf_array, 1)
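As a minimal sketch of why reusing img.transform is valid here: flattening with reshape([3,-1]).T and reshaping back with the original height and width keeps every pixel in its original position, so the original transform still describes the result. The small array below is a made-up stand-in for img.read(), and the "prediction" is simply band 1 copied through; both are assumptions for illustration only.

```python
import numpy as np

# Hypothetical small raster: 3 bands, 4 rows, 5 columns (stand-in for img.read()).
bands, height, width = 3, 4, 5
array = np.arange(bands * height * width, dtype="float32").reshape(bands, height, width)

# Same flattening as in the question: one row per pixel, one column per band.
table = array.reshape(bands, -1).T  # shape (height * width, 3)

# Stand-in for the prediction column: here just band 1, unchanged.
prediction = table[:, 0]

# Reshaping with the original (height, width) puts every pixel back in place.
restored = prediction.reshape(height, width)

# Row-major order means table row i is the pixel at (i // width, i % width).
i = 13
row, col = divmod(i, width)
```

This only holds as long as the rows of the table are not reordered or dropped between the flattening and the final reshape.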

edited Jun 06 '20 at 16:00

answered Jun 06 '20 at 14:02


Thank you for your answer. It didn't work, probably because I didn't understand a few things: what is the 'w'? Why don't we do the reshape, and how does the reshape work in the height and width part? And if I have nodata, how does it work? When I used this code I got the error IndexError: tuple index out of range. – ReutKeller Jun 07 '20 at 15:33
'w' means 'write'. If you have no data, no problem. I cannot test your code because parts are missing: X, y, df, array are not defined. Where did you define them? df_join['Prediction'].values returns a numpy array. rasterio.open(..) with 'w' creates a new empty image based on the parameters, and the write function writes rf_array to that image. – Kadir Şahbaz Jun 07 '20 at 19:05
When you run img.crs, what do you get? Which line gives IndexError: tuple index out of range? When I use these lines, my rf_array is written to file with the crs. – Kadir Şahbaz Jun 08 '20 at 08:05
I found the problem: I had skipped rf_array = rf_array.reshape(img.shape[0], img.shape[1]). But now I have another problem, an error of "no error": CPLE_OpenFailedError: Attempt to create new tiff file 'some\ath\image.tif' failed: No error – ReutKeller Jun 11 '20 at 07:45
Also, if I know that I have pixels with no data, is changing it to True the right step? – ReutKeller Jun 11 '20 at 07:47
And what does "count" stand for? – ReutKeller Jun 11 '20 at 07:52
count stands for band count. If you know the nodata value (for example -9999), specify it; otherwise, keep it None. You can set it in QGIS later. As I mentioned before, your script has missing parts, so I cannot test it. The solution works on my own toy data. – Kadir Şahbaz Jun 11 '20 at 09:01
I figured out the problem and it worked, thank you – ReutKeller Jun 11 '20 at 10:28

radouxju · Answer 2 · 2020-06-06T11:29:05.417

You must define 2 elements in order to have a geolocated image

The first is the geotransform, which converts row/column coordinates into X/Y coordinates. With GDAL this is set using SetGeoTransform. The geotransform is a vector of six values: the X coordinate of the origin, the change in X per column (pixel width), the change in X per row, the Y coordinate of the origin, the change in Y per column, and the change in Y per row (pixel height). As you can see, this is not the same order as in the affine transform, which is:

a = width of a pixel
b = row rotation (typically zero)
c = x-coordinate of the upper-left corner of the upper-left pixel
d = column rotation (typically zero)
e = height of a pixel (typically negative)
f = y-coordinate of the upper-left corner of the upper-left pixel

So in your case the geotransform will be (here dataset must be a GDAL dataset object, not a NumPy array):

dataset.SetGeoTransform([208810, 10, 0, 7583530, 0, -10])

so that

Xgeo = GT(0) + colval*GT(1) + rowval*GT(2) 
Ygeo = GT(3) + colval*GT(4) + rowval*GT(5)
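The reordering and the two formulas above can be sketched in plain Python. The helper names affine_to_gdal and pixel_to_geo are hypothetical, introduced only for this illustration; the coefficients are the ones printed by img.transform in the question, and the 869 x 1202 size comes from the reshape in the question's script.

```python
def affine_to_gdal(a, b, c, d, e, f):
    """Reorder affine coefficients (a, b, c, d, e, f) into GDAL geotransform order."""
    return (c, a, b, f, d, e)


def pixel_to_geo(gt, col, row):
    """Apply the geotransform to a column/row position, as in the formulas above."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Coefficients from Affine(10.0, 0.0, 208810.0, 0.0, -10.0, 7583530.0).
gt = affine_to_gdal(10.0, 0.0, 208810.0, 0.0, -10.0, 7583530.0)

# Upper-left corner of the image (column 0, row 0).
upper_left = pixel_to_geo(gt, 0, 0)

# Lower-right corner of an image with 1202 columns and 869 rows.
lower_right = pixel_to_geo(gt, 1202, 869)
```

With these values the upper-left corner is (208810.0, 7583530.0) and the lower-right corner is (220830.0, 7574840.0), which is the extent the result image should have in QGIS.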

The second is the coordinate system corresponding to your image

You could define it based on the EPSG code, e.g.

from osgeo import osr  # osr is the spatial reference module shipped with GDAL

srs = osr.SpatialReference()
srs.ImportFromEPSG(your_EPSG_code) 
dataset.SetProjection(srs.ExportToWkt())

or get it from another dataset

dataset.SetProjection(inputdataset.GetProjection())

Thank you for your answer. It didn't work for me, probably because I'm doing something wrong. I tried to use SetGeoTransform before the reshape of the dataset I want to reproject and also without the reshape, but I keep getting the error "AttributeError: 'numpy.ndarray' object has no attribute 'SetGeoTransform'". Also, can you please elaborate on the osr.SpatialReference part? What is OSR? (I couldn't try it yet because I couldn't do the transform.) – ReutKeller, Jun 11 '20 at 07:38