2

I am executing this query ST_ClusterDBSCAN with PostGIS extension and PostgreSQL. I have indexed point_data but an EXPLAIN ANALYZE shows me that the index is not being picked up. How can I rearrange my query so that it is indeed picked up?

Index:

CREATE INDEX index_point_data ON point_data USING GiST(geom)

Query:

SELECT id, ST_ClusterDBSCAN(geom, eps := 350, minPoints := 3)
OVER () AS cluster_id, geom
FROM point_data

Plan:

WindowAgg  (cost=0.00..148.64 rows=3984 width=44) (actual time=195.029..197.227 rows=3984 loops=1)
  ->  Seq Scan on point_data  (cost=0.00..98.84 rows=3984 width=40) (actual time=0.022..0.578 rows=3984 loops=1)
Planning time: 0.136 ms
Execution time: 197.593 ms
audlift-bit
  • 233
  • 2
  • 7
  • 2
    Note that the slow part is the evaluation of the function, not the Seq Scan. The Seq Scan takes only half a millisecond (0.5ms) –  Apr 20 '17 at 05:38

1 Answers1

2

A query reading the whole table will not profit from the index. It's cheaper to read the whole table sequentially.

The spatial GiST index you display can be useful for other types of queries filtering a small percentage from the table, like:

SELECT * FROM point_data WHERE ST_DWithin($input_geometry, geom, 5000);

Understanding the decisions of the Postgres query planner requires intimate knowledge of Postgres. Start by reading this chapter of the manual and follow links from there:

And more particularly, about spatial indexes in the PostGIS manual:

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
  • 1
    Plus: for only 3984 rows, The index wouldn't probably used either. –  Apr 20 '17 at 05:37
  • Thanks both. Indeed it is not picked up for only 3984 rows but for something of a little larger scale it is. Can you please elaborate a bit on some indications on whether a spatial index will be picked up? I know it has to do with costs but seems like there are some even more obvious indications before that. – audlift-bit Apr 20 '17 at 11:11
  • @audlift-bit: For a query like I display, it already pays to use the index with 4k rows while only one or very few rows are retrieved. And I added a link for deeper understanding above. – Erwin Brandstetter Apr 20 '17 at 13:38
  • @ErwinBrandstetter the PostGIS link discussing how to take advantage of index is especially interesting to me but quite dry! Would you by any chance know if there is something providing a list of indexable (spatial) PostGIS functions? – audlift-bit Apr 26 '17 at 17:50
  • @audlift-bit: Whether indexes are applicable is decided by operators, not functions (related: http://stackoverflow.com/a/29245753/939860). Some functions use operators under the hood, like many PostGis functions. The PostGIS manual mentions index support for each function. Example: ST_DWithin – Erwin Brandstetter Apr 27 '17 at 02:33