0

I had a shapefile that I needed to integrate into my SQL Server DB. To do so, I used the very useful Shape2Sql.exe with which I chose 'Set SRID 4326' in the settings.

I now have a table that contains a geometry column while another table is showing to me the latitude and longitude of different addresses along with other useful information. This can be summed up as follows:

CREATE TABLE dbo.myTable1 (
       FULL_ADDRESS varchar(120) NULL,
       LONGITUDE varchar(50) NULL,
       LATITUDE varchar(50) NULL
                           )

CREATE TABLE dbo.myTable2 (
       POLYGON_NAME nvarchar(255) NULL,
       geom geometry NULL
                           )

What I would like is to join the two tables based on the sets of coordinates that match a certain polygon. I took some inspiration from here and I basically did this:

SELECT DISTINCT
      mt1.FULL_ADDRESS
    , mt1.LONGITUDE
    , mt1.LATITUDE
    , mt2.POLYGON_NAME
INTO #TEMP_TABLE
FROM myTable1 mt1
LEFT JOIN myTable2 mt2 
  ON geom.STIntersects(geometry::STGeomFromWKB((geography::Point(mt2.LATITUDE, mt2.LONGITUDE, 4326)).STAsBinary(), 4326)) = 1

(the DISTINCT is useful because, for some reason that I can't explain yet, I get several matches for a single set)

The results thereof are exactly what I expect but, for 10,000 sets of coordinates, it almost took 3 hours. Problem is, I'm supposed to find the polygons of 25,000,000 sets (which means 300 days of computation...). I therefore need to find a trick to improve that query performance by several orders of magnitude.

BONUS QUESTION:

When I don't store the results of my query inside a temp table (i.e. when I want to see the results), I get the following error message:

A .NET Framework error occurred during execution of user-defined routine or aggregate "No extended information available.": .

But everything is fine when using a SELECT... INTO #TEMP_TABLE instead of a plain and simple SELECT. I couldn't find any piece of information about this message. Any idea to solve it?

Guillaume
  • 63
  • 2
  • 11

1 Answers1

2

The best way to increase spatial join is to create spatial index.

CREATE INDEX table_gist ON table USING GIST(geom);

Since your first table has coordinates but no geometry you can't create a spatial index. That's why it takes so much time.

You should consider adding a geometry column on your first table based on your coordinate.

ALTER TABLE mytable1
ADD COLUMN geom geometry;

UPDATE mytable1
SET geom=ST_MakePoint(longitude, latitude);

Then you can create an index on mytable1. Your ST_Intersects between your two geometries will be much faster.

PS : I haven't talk about geography/geometry and srid but you should handle it :)

Busu
  • 136
  • 6
  • I'm using SQL Server, which means I wrote the following: ALTER TABLE mytable1 ADD geom geometry; UPDATE mytable1 SET geom=geometry::STGeomFromWKB((geography::Point(LATITUDE, LONGITUDE, 4326)).STAsBinary(), 4326); CREATE NONCLUSTERED INDEX table_gist ON mytable1(geom);. But I got the following message: Column 'geom' in table 'mytable1' is of a type that is invalid for use as a key column in an index or statistics. – Guillaume Oct 19 '18 at 09:04
  • Obviously on SQL server spatial index are created as following :
    CREATE SPATIAL INDEX table_index ON table(geom);
    
    – Busu Oct 19 '18 at 10:03
  • Yep sorry, I didn't know spatial indexes were a thing in SQL Server. I ended up writing CREATE SPATIAL INDEX table_gist ON myTable1(geom) WITH (BOUNDING_BOX = (minLong, minLat, maxLong, maxLat)) ON [PRIMARY]; and the same query took 2h30 instead of the initial 3h. That's an improvement but not quite sufficient to be fully satisfactory :( – Guillaume Oct 22 '18 at 13:25
  • Have you created an index on both tables ? – Busu Oct 23 '18 at 09:03
  • I haven't. I made a very quick test and it looks way better! I'll test it with a larger subset and will let you know whether it is OK or not. – Guillaume Oct 26 '18 at 14:31
  • 1
    I can confirm the perf was improved by several orders of magnitude, which is exactly what I wanted. Thank you so much! – Guillaume Oct 27 '18 at 13:19