I have image with three bands. I have script predicts the first band pixels values by using the two other bands and in the end generates an image.The generated image is at the beginning pandas table, that transformed into nd.array and then displayed and saves as tif image using imageio.
My problem is that during this processing I lose the coordinates so the result image needs to be georeferencing. During the process I use reshape in order to get one table of all the pixels and their 3 values and I believe this is where I lose the coordinates but I don't know how can I keep them , if they should become a column in the new table? or they stores somehow different?
My script:
#open the raster I have download before
img=rasterio.open("img_new.tif")
show(img,0)
#create pandas df where each pixel is a row, the column are the bands
#probably here i'm losing the coordinates
df_all=pd.DataFrame(array.reshape([3,-1]).T)
df_all
#use random forest regressor to predict the first band by bands 2,3:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
rf=RandomForestRegressor()
y_train=y_train.values.ravel()
rf.fit(X_train,y_train)
rf_pred=rf.predict(X_test)
rf_pred
#apply rpediction for all the data
pred_all=rf.predict(data)
#.....creating new df table with the prediction value for all....
df_join=df_all.merge(df,how='left',left_on='index', right_on='index')
#convert back to image that doesn't have the coordinates:
#convert to numpy
rf_array=df_join['Prediction'].values
rf_array
#reshape
rf_array=rf_array.reshape(869,1202)
plt.imshow(rf_array)
I thought maybe if I could tell python to choose some pixels from the first image, save their value and paste it later somehow in the generates result image it could work .
My end goal: to "copy" the coordinates from the first image to the result image , so when I open the two images in QGIS they will overlap.
Edit: just to clarify: the result image is constructed from numpy.ndarray that I save with imageio as image.
Edit2: I have c found the img.transform and could get it from the original image:
img.transform
>>>Affine(10.0, 0.0, 208810.0,
0.0, -10.0, 7583530.0)
but now I don't know how to get this coordinates to be pasted on the result image.
Edit3: definition of X, y, df, array:
#Definition of df
#df is the pandas dataframe, constructued from the original tiff:
img=rasterio.open("image_original.tif")
#array
#shape
array=img.read()
#create pandas df
df=pd.DataFrame(array.reshape([3,-1]).T)
df.columns=['band1','band2','band3']
df=df.reset_index()
df
#define X and y, y is the predicted values (I wanted to rpedict y using columns X #with Random Forest)
X = df.iloc[:, 2:]
y = df.iloc[:,1:2]
X,y,df,arrayare missing. Where did you define them?df_join['Prediction'].valuesreturns numpy array. Andwith rasterio.open(..)withwcreates a new empty image based on parameters.writefunction writesrf_arrayto the image. – Kadir Şahbaz Jun 07 '20 at 19:05img.crs, what do you get? Which line givesIndexError: tuple index out of range.? When I use this lines, myrf_arrayis written to file with crs. – Kadir Şahbaz Jun 08 '20 at 08:05countstands for band count. if you know nodata value(for example -9999), specify it. otherwise, keep itNone. you can set it in QGIS later. I mentioned before, you script has missing parts, so, I cannot test your script. the solution works on my own toy data – Kadir Şahbaz Jun 11 '20 at 09:01