This could be seen as a continuation of this question where the top accepted answer uses the spedp library in R in order to generate contiguous clusters of a raster image with multivariate data. In the code they provided (also below, with example dataset here, a raster un UTM Zone 35S containing the results of a supervised classification in 11 classes based on LANDSAT 30m imagery), as far as I understand it, a neighbor list is created with the dimensions of the raster, and then weights are calculated based on the euclidean distance and used to generate a Minimum Spanning Tree (MST) which then runs the SKATER algorithm in order to prune this tree based on feature distance. The user, unfortunately, did not provide a working example, nor the results of the code.
This is used effectively in tutorials for this package with polygons, but the problem with applying this to a raster seems to be that the MST for a raster, at least calculated with the methods available in spdep, will always provide a link where the next connected cell is the one below it until it hits an edge, so any possible pruning of the MST will result in your clusters being in a straight line rather than in a more natural shape.
Finally to the question: Is there a way to circumvent this? Or is SKATER and its associated methods simply not suited for clustering raster data?
# 1. Load packages
packs <- list("tidyverse", "raster", "spdep", "parallel")
lapply(packs, require, character.only = T)
2. Load raster layers using raster()
layers <- brick("sample.img")
r1 <- layers[[1]]
r2 <- layers[[2]]
r3 <- layers[[3]]
And so on...
3. Define the number of regions (k + 1)
k <- 10
4. Merge explanatory variables in data frame
dat <- lapply(list(r1, r2, r3), values) %>%
do.call(cbind, .) %>%
as.data.frame(., stringsAsFactors = F) %>%
magrittr::set_colnames(c("1984", "1993", "1997"))
5. Set up parallel framework for faster computation (optional)
For a non-parallel, single core execution skip steps 5 and 13
ncores <- detectCores() - 1
cl <- parallel::makeCluster(ncores, type = "PSOCK")
set.coresOption(ncores)
set.ClusterOption(cl)
6. Standardize variables
sdat <- scale(dat) %>%
as.data.frame(., stringsAsFactors = F)
7. Create neighbor list object
raster_nb <- cell2nb(nrow(r1), ncol(r1), type = "queen") # you can alternatively set contiguity to "rook"
8. Subset cells
There are various reasons for which you might need to exclude some pixels to avoid errors in subsequent functions
One example is missing values in your raster layers (e.g. due to water bodies)
complete_pixels <- which(complete.cases(sdat))
raster_nb <- subset.nb(raster_nb, 1:length(raster_nb) %in% complete_pixels)
sdat <- sdat[complete_pixels,]
9. Calculate dissimilarity between neighboring cells
lcosts <- nbcosts(raster_nb, sdat)
10. Calculate spatial weights based on dissimilarity between neighbors
raster_w <- nb2listw(raster_nb, lcosts, style = "B")
11. Obtain minimum spanning tree
raster_mst <- mstree(raster_w)
12. Run skater clustering algorithm
skater_clusters <- skater(raster_mst[,1:2], sdat, k)
13. Close parallel framework
stopCluster(cl)
14. Visualize results
Initialize an empty raster with the same dimensions
cluster_raster <- raster(layers)
cluster_raster[] <- NA
Assign cluster IDs from SKATER results
cluster_raster[complete_pixels] <- skater_clusters$groups
Visualize
plot(cluster_raster, main="Cluster Assignments")


As for the plot, It would be amazing if I was simply generating that wrong.
– Phil Mar 01 '24 at 13:24