
I'm collecting POI data from several sources and plotting it on a map. I'm having difficulty merging the POIs because some of them are duplicated across the data sources I'm using. How can I assign a unique identifier to each POI in a way that lets the implementation logic detect the redundancy? For example, I have a Starbucks in two of the sources, both for the same branch, and there is a slight difference in the long/lat values due to accuracy. How can I automatically identify the redundancy?

Street names are not enough, as they are not unique and might be updated differently in different sources. I was thinking of using the lat/long in addition to the first three letters of the POI name, something like STA-31,444-30,122, but again the long/lat are not always the same between the sources because of accuracy; they might be off by a few meters.

Thanks, Yehia

underdark
  • Did you mean to say "Street name is not enough" instead of "Street name are enough as they are not unique"? – Brad Nesom Jun 01 '11 at 13:36
  • @Yenia A.Salam, what GIS software are you using? – artwork21 Jun 01 '11 at 13:44
  • Intergraph GeoMedia. And yes, because the street name might change over time from one source to another. – Yehia A.Salam Jun 01 '11 at 15:35
  • You could create an ID based on the grid cell that each POI is within. The problem is that they could be in neighboring grid squares, but still quite close. Ultimately this is not an issue with the unique ID, it is a matter of identifying redundancy. Better to rephrase the question to consider identifying redundancy. – Matthew Snape Jun 02 '11 at 14:12

3 Answers


Normally the address of a Starbucks would be the same in every source.
If it were me, I would, as you suggest, calculate a field with the POI name, street number, and street name all concatenated together.
Then you can do a dissolve on that calculated ID field.
No need to use the lat/lon.
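
The concatenated field could be sketched in Python roughly as follows. This is only an illustration outside any GIS package: the field names (`name`, `street_number`, `street_name`) and the normalization rule (lowercase, keep only letters and digits) are assumptions, not something the sources are known to provide.

```python
import re


def address_key(name: str, street_number: str, street_name: str) -> str:
    """Build a comparison key from POI name, street number, and street name."""
    parts = (name, street_number, street_name)
    # Assumed normalization: lowercase and keep only letters and digits,
    # so "Starbucks" and "STARBUCKS " produce the same key part.
    cleaned = [re.sub(r"[^a-z0-9]", "", part.lower()) for part in parts]
    return "|".join(cleaned)


def dedupe(pois):
    """Keep the first POI seen for each address key (a crude 'dissolve')."""
    seen = {}
    for poi in pois:
        key = address_key(poi["name"], poi["street_number"], poi["street_name"])
        seen.setdefault(key, poi)
    return list(seen.values())
```

Note that this only matches records whose address text differs in case, spacing, or punctuation; "Main St." and "Main Street" would still produce different keys.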

Brad Nesom

The obvious solution appears to be to degrade the lat/long precision to a point that negates the difference in accuracy.

For example, if you have:

  • 123.123456 / 45.6789
  • 123.123478 / 45.6782

Then truncating them both to 3 decimal places gives the same result (123.123 / 45.678). Rounding instead of truncating would not match here, since 45.6789 rounds to 45.679.

The trick is just to figure out how much to degrade them so that the same location matches, without accidentally merging in neighbouring POIs (5-10 m would seem to me to be OK, but how that equates to decimal degrees I'm not sure).
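
A minimal Python sketch of the truncation idea, using the two coordinate pairs above. The function names are made up for illustration, and the metres figure relies on the usual approximation of about 111.32 km per degree of latitude; a degree of longitude is shorter by a factor of cos(latitude).

```python
import math

# Approximate length of one degree of latitude in metres (assumption:
# spherical approximation, good enough for choosing a precision).
METRES_PER_DEGREE_LAT = 111_320.0


def coord_key(lon: float, lat: float, places: int = 3):
    """Truncate both coordinates to `places` decimals, returned as integers."""
    factor = 10 ** places
    return (math.trunc(lon * factor), math.trunc(lat * factor))


def metres_per_step(places: int) -> float:
    """North-south size in metres of one step at the given decimal precision."""
    return METRES_PER_DEGREE_LAT / 10 ** places


a = coord_key(123.123456, 45.6789)
b = coord_key(123.123478, 45.6782)
print(a == b)              # both fall in the same 3-decimal cell
print(metres_per_step(3))  # roughly 111 m north-south
print(metres_per_step(4))  # roughly 11 m north-south
```

By this estimate, 5-10 m corresponds to about the fourth decimal place. As noted in the comments on the question, two points that are very close can still straddle a cell boundary and end up with different keys, so truncation alone will miss some duplicates.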

Mark Ireland

I use SQL Server, and I build the POI key as: abbreviation of the name (TIM HURTON = TH) + LAT (45.6 -> 456) + LNG (107.9 -> 1079).

But it varies a lot depending on the source. Good luck!
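
For illustration, the same key could be built in Python instead of SQL Server along these lines. The answer does not say how the abbreviation is derived or whether the coordinates are rounded or truncated, so taking the first letter of each word and rounding to one decimal place are both assumptions, and `poi_key` is a hypothetical name.

```python
def poi_key(name: str, lat: float, lng: float) -> str:
    """Build a key such as TH4561079 from a POI name and its coordinates."""
    # Assumption: the abbreviation is the first letter of each word.
    initials = "".join(word[0] for word in name.upper().split())
    # Assumption: one decimal place (rounded), with the decimal point removed.
    lat_part = f"{lat:.1f}".replace(".", "")
    lng_part = f"{lng:.1f}".replace(".", "")
    return initials + lat_part + lng_part


print(poi_key("TIM HURTON", 45.6, 107.9))
```

At one decimal place each key covers an area several kilometres across, so two different branches of the same chain in one neighbourhood would collide.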