Efficient predictable ordering in PostgreSQL

Question

I have a large table of locations and would like to paginate through it efficiently. I had previously been using an OFFSET approach, but the size of the table made that approach unusable, so I am now trying a cursor approach using the location id.

To ensure consistent ordering in cases where a user has two rows with an identical timestamp, I am also ordering by id.

SELECT  *
FROM locations
WHERE
  user_id = 1
ORDER BY timestamp desc, id
LIMIT 100;

However, after adding id to the ORDER BY, the query has become slow. It now does a Seq Scan, which takes ~20 seconds.

QUERY PLAN
Limit  (cost=502534.86..502535.11 rows=100 width=152) (actual time=22822.113..22822.142 rows=100 loops=1)
  ->  Sort  (cost=502534.86..515512.80 rows=5191175 width=152) (actual time=22822.110..22822.133 rows=100 loops=1)
        Sort Key: "timestamp" DESC, id
        Sort Method: top-N heapsort  Memory: 51kB
        ->  Seq Scan on locations  (cost=0.00..304131.89 rows=5191175 width=152) (actual time=1.603..21284.908 rows=5169237 loops=1)
              Filter: (user_id = 1)
              Rows Removed by Filter: 3048468
Planning time: 0.204 ms
Execution time: 22822.194 ms

Timestamp collisions are edge cases and id is a primary key. So why does the execution plan require a Seq Scan?

For context

SELECT indexdef
FROM pg_indexes
WHERE tablename = 'locations'

results

CREATE UNIQUE INDEX locations_pkey ON locations USING btree (id)
CREATE INDEX index_locations_on_user_id_and_timestamp ON locations USING btree (user_id, "timestamp")
CREATE INDEX index_locations_on_user_id_and_point ON locations USING gist (user_id, point)
CREATE INDEX index_locations_on_user_id ON locations USING btree (user_id)
CREATE INDEX index_locations_on_user_id_and_timestamp ON locations USING btree (user_id, "timestamp")
CREATE INDEX index_locations_on_user_id_and_timestamp_and_id ON locations USING btree (user_id, "timestamp", id)

Care to add the table and index definitions? You're probably missing an appropriate index. And do you really need SELECT *? — mustaccio, Nov 17 '18 at 15:33
Thanks @mustaccio, I added the table indexes for context. And yes, the actual query only retrieves the columns needed. It's fast with ORDER BY timestamp desc but super slow with ORDER BY timestamp desc, id, as it requires a Seq Scan. I'm not sure why it requires a Seq Scan when id is a primary key. I even tried adding an extra index which includes (user_id, timestamp, id). Any ideas would be appreciated — Gregology, Nov 17 '18 at 22:54
Can you show the EXPLAIN of the fast version of the query, with only the ORDER BY timestamp desc? I assume that one is using one of your indexes which the slow query is not, not sure exactly why yet. — AdamKG, Nov 17 '18 at 23:03
Oh, I just recognized what your problem likely is. An index on (user_id asc, timestamp asc, id asc) (all are asc implicitly if not specified) can't handle a WHERE user_id=1 and then an ORDER BY timestamp desc, id; for that you need (user_id asc, timestamp desc, id asc). I'm doing some checking to confirm, but try creating the index with desc on the timestamp column. — AdamKG, Nov 17 '18 at 23:07
Just to clarify something about the use case, is it normal for a single user's id to match so much of the table? It appears to be a significant fraction of the table in the EXPLAIN, which is changing the query plan relative to what I'd normally expect. — AdamKG, Nov 17 '18 at 23:17
@AdamKG: You nailed it, I am pretty sure. Add an answer instead of just a comment. This related answer has detailed explanation: https://dba.stackexchange.com/a/39599/3684 — Erwin Brandstetter, Nov 18 '18 at 02:25
@AdamKG, spot on! It was the direction of the index. I added DESC to the id order and it finished in 43ms :D Thank you! — Gregology, Nov 18 '18 at 03:22
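
A minimal sketch of the fix discussed in the comments above, assuming the table and columns shown in the question. The index name is hypothetical. The idea is that the index column directions line up with the ORDER BY, so the planner can read the first 100 matching rows straight from the index instead of sorting every row for the user.

```sql
-- Hypothetical index name; table and column names are taken from the question.
-- "timestamp" is DESC and id is ASC (the default), matching
-- ORDER BY "timestamp" DESC, id.
CREATE INDEX index_locations_on_user_id_timestamp_desc_id
    ON locations USING btree (user_id, "timestamp" DESC, id);

-- Re-check the plan afterwards; the hope is an Index Scan on the new index
-- with no separate Sort node.
EXPLAIN (ANALYZE)
SELECT *
FROM locations
WHERE user_id = 1
ORDER BY "timestamp" DESC, id
LIMIT 100;
```

On a large, busy table, CREATE INDEX CONCURRENTLY builds the index without blocking writes, at the cost of a slower build.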

score 1 · Answer 1 · answered Nov 27 '18 at 03:52

Do you have INDEX(timestamp DESC, id ASC)?

Or, try ORDER BY timestamp DESC, id DESC.
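
The second suggestion should need no new index: a btree index on (user_id, "timestamp", id), like the one already listed in the question, can be scanned backward to return rows in "timestamp" DESC, id DESC order. A rough sketch follows, which also shows how the next page could be fetched with a row comparison. The cursor values are placeholders, and it assumes "timestamp" is a timestamp column and id is an integer.

```sql
-- First page: both sort columns descend, so a backward scan of the
-- existing (user_id, "timestamp", id) index can supply the order.
SELECT *
FROM locations
WHERE user_id = 1
ORDER BY "timestamp" DESC, id DESC
LIMIT 100;

-- Next page: pass the "timestamp" and id of the last row from the previous
-- page (placeholder values shown here). The row comparison means
-- "strictly after that row in the descending order".
SELECT *
FROM locations
WHERE user_id = 1
  AND ("timestamp", id) < ('2018-11-17 12:00:00', 12345)
ORDER BY "timestamp" DESC, id DESC
LIMIT 100;
```

The row comparison form only fits when both columns sort in the same direction. With "timestamp" DESC, id ASC the predicate would have to be spelled out as separate conditions.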

Rick James
