I am running into some curious cases in my Postgres 9.6 database, mainly with queries that follow this format:
SELECT a.*
FROM a
INNER JOIN b ON a.id = b.a_id
WHERE
a.field1 = '' AND
b.field2 > ..
ORDER BY a.timestamp DESC
LIMIT 10
EXPLAIN ANALYZE on this query returns:
Limit (cost=0.86..13968.24 rows=10 width=24) (actual time=14933.751..14933.792 rows=10 loops=1)
-> Nested Loop (cost=0.86..6309063.89 rows=4517 width=24) (actual time=14933.751..14933.791 rows=10 loops=1)
-> Index Scan Backward using idx_timestamp on stac (cost=0.43..5229344.35 rows=988799 width=24) (actual time=0.007..2885.512 rows=3535779 loops=1)
Filter: ((field2 > '2020-01-01 18:59:59.999999-05'::timestamp with time zone) AND (field1 = 4))
Rows Removed by Filter: 168
-> Index Scan using id_idx on scene_data (cost=0.43..1.08 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=3535779)
Index Cond: (a.id = b.a_id)
Filter: ((another_field >= '14000'::double precision) AND (another_field = 'Some string'::text))
Rows Removed by Filter: 1
Planning time: 0.322 ms
Execution time: 14933.836 ms
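As a side note on this plan: the planner expects to find the 10 matching rows very early in the backward walk of idx_timestamp, but in reality it reads ~3.5 million rows from a because almost none of the probed b rows survive their filter. A rough way to compare each predicate's real selectivity with the planner's estimates (just a sketch, using the same obfuscated column names as the query above; the actual counts are in the edit at the bottom):

SELECT count(*) FROM a WHERE field1 = '...';
SELECT count(*) FROM b WHERE field2 > '...';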
Three ways this query became significantly faster:
- Fencing with a subquery (see also the OFFSET 0 variant sketched after this list):
SELECT * FROM (SELECT a.*
FROM a
INNER JOIN b ON a.id = b.a_id
WHERE
b.field_a = '...'
AND b.field_b >= 14000
AND a.field_a = 4
AND a.field_b > '2020-01-01T23:59:59.999999Z'
ORDER BY a.timestamp DESC) a
ORDER BY a.timestamp DESC LIMIT 10
EXPLAIN ANALYZE:
Limit (cost=300441.32..300442.57 rows=10 width=353) (actual time=323.171..325.616 rows=10 loops=1)
-> Gather Merge (cost=300441.32..300731.00 rows=2519 width=353) (actual time=323.169..325.610 rows=10 loops=1)
Workers Planned: 1
Workers Launched: 1
-> Sort (cost=299441.31..299447.60 rows=2519 width=353) (actual time=314.241..314.244 rows=10 loops=2)
Sort Key: table_a.timestamp DESC
Sort Method: top-N heapsort Memory: 27kB
Worker 0: Sort Method: top-N heapsort Memory: 35kB
-> Nested Loop (cost=0.99..299299.00 rows=2519 width=353) (actual time=0.117..297.406 rows=41558 loops=2)
-> Parallel Index Scan using idx_table_b_field_a on table_b (cost=0.43..28730.77 rows=37301 width=16) (actual time=0.067..24.940 rows=41558 loops=2)
Index Cond: (b.field_a = '....'::text)
Filter: (b.field_b >= '14000'::double precision)
Rows Removed by Filter: 174
-> Index Scan using table_a_pkey on table_a (cost=0.56..7.25 rows=1 width=353) (actual time=0.006..0.006 rows=1 loops=83116)
Index Cond: (table_a.id = table_b.a_id)
Filter: ((a.field_b > '2020-01-01 18:59:59.999999-05'::timestamp with time zone) AND (a.field_a = 4))
Planning Time: 0.728 ms
Execution Time: 325.703 ms
- Adding a second field to the ORDER BY:
SELECT a.*
FROM a
INNER JOIN b ON a.id = b.a_id
WHERE
a.field1 = '' AND
b.field2 > ..
ORDER BY a.timestamp, a.id DESC LIMIT 10
- Increasing the limit to 500, which produces a different sort method/key:
Limit (cost=511622.95..511624.20 rows=500 width=24) (actual time=385.317..385.383 rows=500 loops=1)
-> Sort (cost=511622.95..511634.24 rows=4517 width=24) (actual time=385.315..385.348 rows=500 loops=1)
Sort Key: a.timestamp DESC
Sort Method: top-N heapsort Memory: 64kB
-> Nested Loop (cost=1.12..511397.87 rows=4517 width=24) (actual time=0.028..374.960 rows=83116 loops=1)
-> Index Scan using index_1 on b (cost=0.56..37701.50 rows=68004 width=16) (actual time=0.019..35.351 rows=83116 loops=1)
Index Cond: (field_1 = 'some string'::text)
Filter: (field_3 >= '14000'::double precision)
Rows Removed by Filter: 349
-> Index Scan using a_pkey on stac (cost=0.56..6.96 rows=1 width=24) (actual time=0.004..0.004 rows=1 loops=83116)
Index Cond: (a.id = b.a_id)
Filter: (( > '2020-01-01 18:59:59.999999-05'::timestamp with time zone) AND (some_fild = 4))
Planning time: 0.382 ms
Execution time: 385.440 ms
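As mentioned under "Fencing" above, the more common spelling of this kind of fence (a sketch only, not my production query; the table and column names are the obfuscated ones used earlier) is an OFFSET 0 subquery. OFFSET 0 keeps the subquery from being flattened into the outer query, which, as far as I understand, is also why the inner ORDER BY acts as a fence here:

SELECT *
FROM (SELECT a.*
      FROM a
      INNER JOIN b ON a.id = b.a_id
      WHERE b.field_a = '...'
        AND b.field_b >= 14000
        AND a.field_a = 4
        AND a.field_b > '2020-01-01T23:59:59.999999Z'
      OFFSET 0) a  -- long-standing, if undocumented, planner fence
ORDER BY a.timestamp DESC
LIMIT 10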
But why? How does forcing "more complexity" into the query make the query planner decide "let's use a top-N heapsort", and why does that end up being quicker? I would love to understand why these variants produce faster plans before implementing any of them. My team and I also intend to migrate to a newer Postgres version, but we are uncertain where to focus our efforts first to get concrete and understandable results.
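If I read the first plan correctly, the Limit node's cost is roughly the nested loop's total cost scaled by the fraction of rows the planner thinks it will need:

0.86 + (10 / 4517) * 6309063.89 ≈ 13968.24

so the planner believes the backward walk of idx_timestamp will produce 10 qualifying rows almost immediately, while in practice it has to probe ~3.5 million rows. I assume the faster variants win simply because they stop the planner from making that bet, but I would like to confirm that reading.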
Edit:
SELECT *
FROM a
INNER JOIN b
ON b.a_id = a.id --> This is a 1:1 Relationship
WHERE b.field_a = '...' --> ~83465 rows matching in b
AND b.field_b >= 14000 --> ~2938319 rows matching in b
AND a.field_a = 4 --> ~3868810 rows matching in a
AND a.field_b > '2020-01-01T23:59:59.999999Z' --> ~3818223 rows matching in a
ORDER BY observed desc LIMIT 10 OFFSET 0
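Given those counts, b.field_a is by far the most selective predicate. One idea I am considering (a sketch only; the index name is made up and the columns follow the obfuscated names above) is a multicolumn index on b covering the filter columns and the join key, so a plan driving from b could potentially use an index-only scan, visibility map permitting:

CREATE INDEX idx_b_field_a_field_b_a_id ON b (field_a, field_b, a_id);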
I also noticed a significant loss of performance when running simple counts on these fields as soon as the two tables are joined:
select count(*) from a where a.field_a = 4 [0.5s]
select count(*) from a inner join b on a.id = b.a_id where a.field_a = 4 [15s]
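Comparing the two with buffer statistics should show where the extra time goes (no output yet; this is just the diagnostic step, using the same names as above):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM a WHERE a.field_a = 4;

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM a INNER JOIN b ON a.id = b.a_id WHERE a.field_a = 4;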
Also note that table_a.id and table_b.a_id (the FK) are UUIDs.
You SELECT a.* after joining to b, which can multiply rows. Is it your intention to get duplicate rows from a? I suspect the query is not exactly what you want to begin with. Also, depending on your other filters (which are constant? which are variable, and how? how selective are they?) there may be much faster alternatives. – Erwin Brandstetter Apr 01 '22 at 15:40