I am running into some curious cases in my Postgres 9.6 database, mainly with queries that follow this format:
SELECT a.*
FROM a
INNER JOIN b ON a.id = b.a_id
WHERE
a.field1 = '' AND
b.field2 > ..
ORDER BY a.timestamp DESC
LIMIT 10
EXPLAIN ANALYZE on this query returns:
Limit (cost=0.86..13968.24 rows=10 width=24) (actual time=14933.751..14933.792 rows=10 loops=1)
-> Nested Loop (cost=0.86..6309063.89 rows=4517 width=24) (actual time=14933.751..14933.791 rows=10 loops=1)
-> Index Scan Backward using idx_timestamp on stac (cost=0.43..5229344.35 rows=988799 width=24) (actual time=0.007..2885.512 rows=3535779 loops=1)
Filter: ((field2 > '2020-01-01 18:59:59.999999-05'::timestamp with time zone) AND (field1 = 4))
Rows Removed by Filter: 168
-> Index Scan using id_idx on scene_data (cost=0.43..1.08 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=3535779)
Index Cond: (a.id = b.a_id)
Filter: ((another_field >= '14000'::double precision) AND (another_field = 'Some string'::text))
Rows Removed by Filter: 1
Planning time: 0.322 ms
Execution time: 14933.836 ms
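As a side note on this plan: the planner expects to find the 10 matching rows very early in the backward walk of idx_timestamp, but in reality it reads ~3.5 million rows from a because almost none of the probed b rows survive their filter. A rough way to compare each predicate's real selectivity with the planner's estimates (just a sketch, using the same obfuscated column names as the query above; the actual counts are in the edit at the bottom):

SELECT count(*) FROM a WHERE field1 = '...';
SELECT count(*) FROM b WHERE field2 > '...';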
Three ways this query became significantly faster:
- Fencing with a subquery (see also the OFFSET 0 variant sketched after this list):
SELECT * FROM (SELECT a.*
FROM a
INNER JOIN b ON a.id = b.a_id
WHERE
b.field_a = '...'
AND b.field_b >= 14000
AND a.field_a = 4
AND a.field_b > '2020-01-01T23:59:59.999999Z'
ORDER BY a.timestamp DESC) a
ORDER BY a.timestamp DESC LIMIT 10
EXPLAIN ANALYZE:
Limit (cost=300441.32..300442.57 rows=10 width=353) (actual time=323.171..325.616 rows=10 loops=1)
-> Gather Merge (cost=300441.32..300731.00 rows=2519 width=353) (actual time=323.169..325.610 rows=10 loops=1)
Workers Planned: 1
Workers Launched: 1
-> Sort (cost=299441.31..299447.60 rows=2519 width=353) (actual time=314.241..314.244 rows=10 loops=2)
Sort Key: table_a.timestamp DESC
Sort Method: top-N heapsort Memory: 27kB
Worker 0: Sort Method: top-N heapsort Memory: 35kB
-> Nested Loop (cost=0.99..299299.00 rows=2519 width=353) (actual time=0.117..297.406 rows=41558 loops=2)
-> Parallel Index Scan using idx_table_b_field_a on table_b (cost=0.43..28730.77 rows=37301 width=16) (actual time=0.067..24.940 rows=41558 loops=2)
Index Cond: (b.field_a = '....'::text)
Filter: (b.field_b >= '14000'::double precision)
Rows Removed by Filter: 174
-> Index Scan using table_a_pkey on table_a (cost=0.56..7.25 rows=1 width=353) (actual time=0.006..0.006 rows=1 loops=83116)
Index Cond: (table_a.id = table_b.a_id)
Filter: ((a.field_b > '2020-01-01 18:59:59.999999-05'::timestamp with time zone) AND (a.field_a = 4))
Planning Time: 0.728 ms
Execution Time: 325.703 ms
- Adding a second field to the ORDER BY:
SELECT a.*
FROM a
INNER JOIN b ON a.id = b.a_id
WHERE
a.field1 = '' AND
b.field2 > ..
ORDER BY a.timestamp, a.id DESC LIMIT 10
- Increasing the limit to 500, which produces a different sort method/key:
Limit (cost=511622.95..511624.20 rows=500 width=24) (actual time=385.317..385.383 rows=500 loops=1)
-> Sort (cost=511622.95..511634.24 rows=4517 width=24) (actual time=385.315..385.348 rows=500 loops=1)
Sort Key: a.timestamp DESC
Sort Method: top-N heapsort Memory: 64kB
-> Nested Loop (cost=1.12..511397.87 rows=4517 width=24) (actual time=0.028..374.960 rows=83116 loops=1)
-> Index Scan using index_1 on b (cost=0.56..37701.50 rows=68004 width=16) (actual time=0.019..35.351 rows=83116 loops=1)
Index Cond: (field_1 = 'some string'::text)
Filter: (field_3 >= '14000'::double precision)
Rows Removed by Filter: 349
-> Index Scan using a_pkey on stac (cost=0.56..6.96 rows=1 width=24) (actual time=0.004..0.004 rows=1 loops=83116)
Index Cond: (a.id = b.a_id)
Filter: (( > '2020-01-01 18:59:59.999999-05'::timestamp with time zone) AND (some_fild = 4))
Planning time: 0.382 ms
Execution time: 385.440 ms
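As mentioned under "Fencing" above, the more common spelling of this kind of fence (a sketch only, not my production query; the table and column names are the obfuscated ones used earlier) is an OFFSET 0 subquery. OFFSET 0 keeps the subquery from being flattened into the outer query, which, as far as I understand, is also why the inner ORDER BY acts as a fence here:

SELECT *
FROM (SELECT a.*
      FROM a
      INNER JOIN b ON a.id = b.a_id
      WHERE b.field_a = '...'
        AND b.field_b >= 14000
        AND a.field_a = 4
        AND a.field_b > '2020-01-01T23:59:59.999999Z'
      OFFSET 0) a  -- long-standing, if undocumented, planner fence
ORDER BY a.timestamp DESC
LIMIT 10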
But why? How does forcing "more complexity" into the query make the query planner decide "let's use a top-N heapsort", and why does that end up being quicker? I would love to understand why these variants produce faster plans before implementing any of them. My team and I also intend to migrate to a newer Postgres version, but we are uncertain where to focus our efforts first to get concrete and understandable results.
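If I read the first plan correctly, the Limit node's cost is roughly the nested loop's total cost scaled by the fraction of rows the planner thinks it will need:

0.86 + (10 / 4517) * 6309063.89 ≈ 13968.24

so the planner believes the backward walk of idx_timestamp will produce 10 qualifying rows almost immediately, while in practice it has to probe ~3.5 million rows. I assume the faster variants win simply because they stop the planner from making that bet, but I would like to confirm that reading.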
Edit:
SELECT *
FROM a
INNER JOIN b
ON b.a_id = a.id --> This is a 1:1 Relationship
WHERE b.field_a = '...' --> ~83465 rows matching in b
AND b.field_b >= 14000 --> ~2938319 rows matching in b
AND a.field_a = 4 --> ~3868810 rows matching in a
AND a.field_b > '2020-01-01T23:59:59.999999Z' --> ~3818223 rows matching in a
ORDER BY observed desc LIMIT 10 OFFSET 0
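Given those counts, b.field_a is by far the most selective predicate. One idea I am considering (a sketch only; the index name is made up and the columns follow the obfuscated names above) is a multicolumn index on b covering the filter columns and the join key, so a plan driving from b could potentially use an index-only scan, visibility map permitting:

CREATE INDEX idx_b_field_a_field_b_a_id ON b (field_a, field_b, a_id);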
I also noticed a significant loss of performance when running simple counts on these fields as soon as the two tables are joined:
select count(*) from a where a.field_a = 4 [0.5s]
select count(*) from a inner join b on a.id = b.a_id where a.field_a = 4 [15s]
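Comparing the two with buffer statistics should show where the extra time goes (no output yet; this is just the diagnostic step, using the same names as above):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM a WHERE a.field_a = 4;

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM a INNER JOIN b ON a.id = b.a_id WHERE a.field_a = 4;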
Also note that table_a.id and table_b.a_id (the FK) are UUIDs.
You SELECT a.* after joining to b, which can multiply rows. Is it your intention to get duplicate rows from a? I suspect the query is not exactly what you want to begin with. Also, depending on your other filters (which are constant? which are variable, and how? how selective are they?) there may be much faster alternatives. – Erwin Brandstetter Apr 01 '22 at 15:40