
Geohashes are widely used in products such as Lucene and MongoDB, and have become one of the most important technologies in use today.

Have geohashes replaced the good old R-trees, or do R-trees still have advantages over geohashes?

PolyGeo
Jannat Arora
  • Here's an open source library for SQL Server which optimally exploits the Geohash as both integer (BIGINT) and string (VARCHAR). With the proper schema design and indexes, I have been able to maximally leverage the Geohash without having to resort to R-Trees, specialized GIS/Spatial engines, and expensive GIS/Spatial expertise. https://github.com/qalocate/qalgeohash-tsql – chaotic3quilibrium Mar 14 '21 at 21:42

1 Answer


A geohash is a very simple and effective way of indexing spatial features, particularly point features. Line and polygon features are a little harder to index, but it can be done. Geohash is a static, hierarchical, fixed-size grid overlaid on the earth's surface; grid cells at the same hierarchical level do not overlap. An R-tree, by contrast, is a dynamic structure whose cells change location and size depending on the features they index. An R-tree indexes the bounding boxes of features, and its cells can change whenever you insert or update data. Geohash cells do not change on insert or update, and they do not adapt to the features the way R-tree cells do.

Some advantages of geohash (compared to an R-tree):

  • easy implementation
  • no performance degradation as the number of features grows
  • proximity searches (only partially true: nearby points often share a prefix, but not always)

Some disadvantages of geohash (compared to an R-tree):

  • arbitrary grid precision (the cell size is set by the hash length, not by the data)
  • line and polygon features are harder to index (and query)
  • the index can become large with some methods of line and polygon indexing
  • by specification, it can only be used with a longitude/latitude coordinate system, although the same method could be applied to other coordinate systems
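To make the grid-precision point concrete, here is a minimal sketch of how cell size follows from hash length. It assumes the standard geohash layout (5 bits per character, bits alternating between longitude and latitude, longitude first) and a rough figure of 111.32 km per degree; the function name `geohash_cell_size` is made up for this illustration.

```python
import math

KM_PER_DEGREE = 111.32  # rough approximation, assumed for illustration


def geohash_cell_size(length, latitude_deg=0.0):
    """Return (width_deg, height_deg, width_km, height_km) of one cell."""
    total_bits = 5 * length            # each base-32 character carries 5 bits
    lon_bits = (total_bits + 1) // 2   # longitude gets the extra bit when odd
    lat_bits = total_bits // 2
    width_deg = 360.0 / (1 << lon_bits)
    height_deg = 180.0 / (1 << lat_bits)
    # East-west extent shrinks with the cosine of the latitude.
    width_km = width_deg * KM_PER_DEGREE * math.cos(math.radians(latitude_deg))
    height_km = height_deg * KM_PER_DEGREE
    return width_deg, height_deg, width_km, height_km


for n in range(1, 9):
    w_deg, h_deg, w_km, h_km = geohash_cell_size(n)
    print(f"length {n}: {w_deg:.6f} x {h_deg:.6f} deg, "
          f"about {w_km:.3f} x {h_km:.3f} km at the equator")
```

You can only pick among these fixed steps; there is no way to get a cell size between two consecutive lengths, which is what makes the precision "arbitrary" relative to the data.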

The products (databases) you mentioned use geohash because it is mainly used for indexing points, and a lot of applications need exactly that. Lines and polygons are used less often (except in GIS applications, of course), so those products have little reason to bother with them. The other reason is ease of implementation. Geohash converts a two-dimensional coordinate into a one-dimensional value; this is called dimensionality reduction. A one-dimensional value is easy to index with a standard B-tree, which is the index type most of those products use.
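The two-dimensional to one-dimensional conversion can be sketched in a few lines. This is a plain illustration of the public geohash encoding, not the code any of those products actually run; the function name `geohash_encode` and the sample coordinates are chosen for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


# Points become plain strings that any ordered index (e.g. a B-tree) can sort.
keys = sorted(geohash_encode(lat, lon, 6)
              for lat, lon in [(42.6, -5.6), (42.61, -5.59), (57.64911, 10.40744)])
print(keys)

# Caveat: two points a few metres apart on opposite sides of the equator and
# prime meridian share no prefix at all.
print(geohash_encode(0.0001, 0.0001, 6), geohash_encode(-0.0001, -0.0001, 6))
```

A prefix query such as `key LIKE 'ezs4%'` then selects everything inside one cell. The last two hashes show why proximity search by prefix is only partially reliable: neighbouring cells usually have to be queried as well.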

I should mention that there are algorithms similar to geohash, but most of them are proprietary and require licensing. Geohash is in the public domain, which could also be a reason for its wide adoption in recent years.

There are probably other advantages and disadvantages, but these are the first that come to mind. I hope this explanation helps a little.

Mario Miler
  • I did not understand why geohashes give arbitrary grid precision. Can you please explain with an example? I would be thankful. – Jannat Arora Jul 29 '14 at 15:58
  • Geohash converts a longitude and latitude coordinate into a one-dimensional string. The length of this string is directly tied to the precision of the converted coordinate. Please look at http://unterbahn.com/2009/11/metric-dimensions-of-geohash-partitions-at-the-equator/, where you can see how the length of a geohash string relates to precision. Basically, geohash converts a point into a polygon area (one geohash grid cell). The size of this area depends on the length of the geohash string and on the latitude at which you calculate the geohash. – Mario Miler Aug 01 '14 at 06:25
  • I agree with your assessment. There are use-cases where each clearly has an advantage over the other. Because I wasn't able to choose an R-Tree solution, I used this open-source library for SQL Server which optimally exploits the Geohash as both integer (BIGINT) and string (VARCHAR). With the proper schema design and indexes, I have been able to maximally leverage the Geohash without having to resort to R-Trees, specialized GIS/Spatial engines, and expensive GIS/Spatial expertise. github.com/qalocate/qalgeohash-tsql – chaotic3quilibrium Mar 14 '21 at 21:53