Given a large spatial database of trees (600,000 records), find the tree whose circle of radius r encloses the greatest variety of tree types.

Following an answer by andyteucher, I've simulated some tree data below, using random trees with types A to I. I've then used st_buffer to draw a circle around each tree and intersected those circles with all the trees. I'm not sure what the next step is, or whether this direction is appropriate.

library(sf)
library(tmap)
library(tidyverse)
library(leaflet)
library(stringi)

mean_lat <- 43.9
sd_lat <- 0.1
mean_long <- -79.384293
sd_long <-  0.1

set.seed(42)
trees <- data.frame(lat = rnorm(500, mean = mean_lat, sd = sd_lat),
                      long = rnorm(500, mean = mean_long, sd = sd_long))

trees <- mutate(trees, tree_type = stri_rand_strings(500, 1, pattern = "[A-I]"))

radius = 500

# Convert to sf, set the crs to EPSG:4326 (lat/long), 
# and transform CRS to Lambert conformal conic projection - North America
tree_sf <- st_as_sf(trees, coords = c("long", "lat"), crs = 4326) %>% 
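  # 102009 is an ESRI code, not an EPSG one; recent sf/PROJ versions
  # may need the authority prefix
  st_transform("ESRI:102009")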

# Buffer each tree by `radius` metres (500 m here)
tree_circles <- st_buffer(tree_sf, dist = radius)
enclosed_trees <- st_intersection(tree_sf, tree_circles)

# Transform back to 4326 to make Leaflet happy
tree_circles <- st_transform(tree_circles, 4326)
enclosed_trees <- st_transform(enclosed_trees,4326)

leaflet() %>% 
  addTiles() %>%
  addPolygons(data = tree_circles) %>% 
  addCircles(data = enclosed_trees)
ixodid

1 Answer

This uses dplyr, although I could give you a solution that doesn't require dplyr if necessary.

First, find the number of unique tree types within the radius of each tree:

library(dplyr)

tree_sf$number_unique <- lapply(1:nrow(tree_sf), function(x){
  st_intersection(tree_sf[x,] %>% 
                    st_buffer(radius) %>% 
                    select(-tree_type), tree_sf)$tree_type %>%
    unique %>% 
    length
}) %>% unlist

Or with purrr:

library(purrr)
tree_sf$number_unique <- map_dbl(1:nrow(tree_sf), ~ {
  st_intersection(tree_sf[.,] %>% 
                    st_buffer(radius) %>% 
                    select(-tree_type), tree_sf)$tree_type %>%
    unique %>% 
    length
}) 

Now get the tree with the most variety (sort in descending order so the highest count comes first):

tree_sf %>% arrange(desc(number_unique)) %>% slice(1)
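
With 600,000 trees, buffering and intersecting one row at a time may be slow. As a rough sketch of an alternative, st_is_within_distance() can find every tree's neighbours in a single call, which should give the same counts without building buffer polygons. The toy data, the use of UTM zone 17N (EPSG:32617) as the metre-based CRS, and the name best_tree are assumptions for illustration only, not part of the question:

```r
library(sf)

# Toy data standing in for the question's trees (hypothetical values)
set.seed(42)
n <- 200
trees <- data.frame(
  long = rnorm(n, mean = -79.384293, sd = 0.01),
  lat = rnorm(n, mean = 43.9, sd = 0.01),
  tree_type = sample(LETTERS[1:9], n, replace = TRUE)
)

# Assumption: UTM zone 17N (EPSG:32617) as a metre-based CRS for this area
tree_sf <- st_transform(
  st_as_sf(trees, coords = c("long", "lat"), crs = 4326),
  32617
)
radius <- 500

# For every tree, the row indices of all trees within `radius` metres.
# Each tree is its own neighbour, so its own type is always counted.
neighbours <- st_is_within_distance(tree_sf, tree_sf, dist = radius)

# Count the distinct types among each tree's neighbours
tree_sf$number_unique <- vapply(
  neighbours,
  function(idx) length(unique(tree_sf$tree_type[idx])),
  integer(1)
)

# Tree with the greatest variety (the first one if there are ties)
best_tree <- tree_sf[which.max(tree_sf$number_unique), ]
best_tree
```

I haven't timed this on data of the real size, so treat it as a starting point rather than a benchmarked solution.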
sebdalgarno
  • Looks like an elegant answer. Time for me to learn about lapply I guess. – ixodid May 29 '18 at 20:55
  • there is also the purrr package with lapply-like functions...I find it a bit easier to use actually, but prefer to reduce number of dependencies in SO answers – sebdalgarno May 30 '18 at 16:35
  • If you're interested, I'd be interested in seeing the purrr way of doing things as a teaching example. – ixodid May 31 '18 at 20:36
  • See edits. It's almost identical, but there is no need for unlist, and you can use ~ . instead of function(x) x, which I like. – sebdalgarno May 31 '18 at 23:00