
I have a large (~100 million rows) time-series table t_16 in Postgres 11.5 where the primary key is a column abs_date_time of type timestamp.

This is a follow-up of this question:

Initially I thought the issue was related to a CTE, but the query is slow even without a CTE.

How can I make the following query use the primary key index, to avoid a full table scan?

tsrange

This query takes ~20sec on my dev PC:

SELECT t_16_gen.*
FROM t_16_gen,
     (VALUES (tsrange('["2019-11-26 12:00:00","2019-11-26 12:00:15")'))
           , (tsrange('["2019-11-26 13:00:00","2019-11-26 13:00:15")'))) as ranges (time_range)
WHERE (abs_date_time >= LOWER(ranges.time_range)
    AND abs_date_time <  UPPER(ranges.time_range));

Explain plan:

Gather  (cost=1000.00..6185287.15 rows=20571433 width=80)
  Workers Planned: 2
  ->  Nested Loop  (cost=0.00..4127143.85 rows=8571430 width=80)
        Join Filter: ((t_16_gen.abs_date_time >= lower("*VALUES*".column1)) AND (t_16_gen.abs_date_time < upper("*VALUES*".column1)))
        ->  Parallel Seq Scan on t_16_gen  (cost=0.00..1620000.38 rows=38571438 width=80)
        ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=32)

In production the set of tsranges comes from a UDF, but there will always be only a few ranges (< 200), each range will match fewer than 1500 rows, and the ranges will not overlap.
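
The production UDF is not shown here; for illustration only, a minimal stand-in could look like this (the name get_time_ranges and the hard-coded ranges are made up):

CREATE OR REPLACE FUNCTION get_time_ranges()
    RETURNS SETOF tsrange
    LANGUAGE sql STABLE
AS $$
    -- hypothetical stand-in: the real UDF computes these ranges
    VALUES (tsrange('["2019-11-26 12:00:00","2019-11-26 12:00:15")'))
         , (tsrange('["2019-11-26 13:00:00","2019-11-26 13:00:15")'));
$$;

-- the slow query then joins against the function result instead of a VALUES list
SELECT t_16_gen.*
FROM t_16_gen, get_time_ranges() AS ranges (time_range)
WHERE abs_date_time >= LOWER(ranges.time_range)
  AND abs_date_time <  UPPER(ranges.time_range);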

Simple timestamps instead of tsrange

When we use timestamps directly (i.e. not using tsrange, LOWER() and UPPER()), the query is already faster. This query takes ~7sec on my dev PC:

SELECT t_16_gen.*
FROM t_16_gen,
     (VALUES ('2019-11-26 12:00:00'::timestamp,'2019-11-26 12:00:15'::timestamp)
           , ('2019-11-26 13:00:00','2019-11-26 13:00:15')) as ranges (start_incl, end_excl)
WHERE (abs_date_time >= ranges.start_incl
    AND abs_date_time <  ranges.end_excl);

Explain plan:

Nested Loop  (cost=0.00..5400001.28 rows=20571433 width=80)
  Join Filter: ((t_16_gen.abs_date_time >= "*VALUES*".column1) AND (t_16_gen.abs_date_time < "*VALUES*".column2))
  ->  Seq Scan on t_16_gen  (cost=0.00..2160000.50 rows=92571450 width=80)
  ->  Materialize  (cost=0.00..0.04 rows=2 width=16)
        ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=16)

OR conditions = FAST

When I rewrite the query to use OR conditions, it is fast. This query takes ~200ms on my dev PC:

SELECT t_16_gen.*
FROM t_16_gen
WHERE (abs_date_time >= '2019-11-26 12:00:00' AND abs_date_time < '2019-11-26 12:00:15')
   OR (abs_date_time >= '2019-11-26 13:00:00' AND abs_date_time < '2019-11-26 13:00:15');

Explain plan:

Gather  (cost=13326.98..1533350.92 rows=923400 width=80)
  Workers Planned: 2
  ->  Parallel Bitmap Heap Scan on t_16_gen  (cost=12326.98..1440010.92 rows=384750 width=80)
        Recheck Cond: (((abs_date_time >= '2019-11-26 12:00:00'::timestamp without time zone) AND (abs_date_time < '2019-11-26 12:00:15'::timestamp without time zone)) OR ((abs_date_time >= '2019-11-26 13:00:00'::timestamp without time zone) AND (abs_date_time < '2019-11-26 13:00:15'::timestamp without time zone)))
        ->  BitmapOr  (cost=12326.98..12326.98 rows=925714 width=0)
              ->  Bitmap Index Scan on t_16_pkey  (cost=0.00..5932.64 rows=462857 width=0)
                    Index Cond: ((abs_date_time >= '2019-11-26 12:00:00'::timestamp without time zone) AND (abs_date_time < '2019-11-26 12:00:15'::timestamp without time zone))
              ->  Bitmap Index Scan on t_16_pkey  (cost=0.00..5932.64 rows=462857 width=0)
                    Index Cond: ((abs_date_time >= '2019-11-26 13:00:00'::timestamp without time zone) AND (abs_date_time < '2019-11-26 13:00:15'::timestamp without time zone))

UNION = FAST

When I rewrite the query to use UNION, it is also fast. This query takes ~220ms on my dev PC:

SELECT t_16_gen.*
FROM t_16_gen
WHERE (abs_date_time >= '2019-11-26 12:00:00' AND abs_date_time < '2019-11-26 12:00:15')
UNION
SELECT t_16_gen.*
FROM t_16_gen
WHERE (abs_date_time >= '2019-11-26 13:00:00' AND abs_date_time < '2019-11-26 13:00:15');

Explain plan:

Unique  (cost=1032439.64..1069468.20 rows=925714 width=80)
  ->  Sort  (cost=1032439.64..1034753.93 rows=925714 width=80)
"        Sort Key: t_16_gen.abs_date_time, t_16_gen.c_422, t_16_gen.c_423, t_16_gen.c_424, t_16_gen.c_425, t_16_gen.c_426, t_16_gen.c_427, t_16_gen.c_428, t_16_gen.c_429, t_16_gen.c_430, t_16_gen.c_431, t_16_gen.c_432, t_16_gen.c_433, t_16_gen.c_434, t_16_gen.c_435"
        ->  Append  (cost=0.57..892513.13 rows=925714 width=80)
              ->  Index Scan using t_16_pkey on t_16_gen  (cost=0.57..439313.71 rows=462857 width=80)
                    Index Cond: ((abs_date_time >= '2019-11-26 12:00:00'::timestamp without time zone) AND (abs_date_time < '2019-11-26 12:00:15'::timestamp without time zone))
              ->  Index Scan using t_16_pkey on t_16_gen t_16_gen_1  (cost=0.57..439313.71 rows=462857 width=80)
                    Index Cond: ((abs_date_time >= '2019-11-26 13:00:00'::timestamp without time zone) AND (abs_date_time < '2019-11-26 13:00:15'::timestamp without time zone))
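
Since the ranges are guaranteed not to overlap, a UNION ALL variant should return the same rows and should let the planner drop the Unique/Sort on top of the Append (sketch only, not timed here):

SELECT t_16_gen.*
FROM t_16_gen
WHERE (abs_date_time >= '2019-11-26 12:00:00' AND abs_date_time < '2019-11-26 12:00:15')
UNION ALL
SELECT t_16_gen.*
FROM t_16_gen
WHERE (abs_date_time >= '2019-11-26 13:00:00' AND abs_date_time < '2019-11-26 13:00:15');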

Reproducing the issue

To reproduce the issue, I create a new table and fill it with dummy data, then restart the database before each test so that the data is not cached.
Note: the insert query may run for several minutes!

create table if not exists t_16_gen (
    abs_date_time timestamp constraint t_16_pkey primary key,
    c_422 bigint,
    c_423 bigint,
    c_424 real,
    c_425 real,
    c_426 real,
    c_427 real,
    c_428 real,
    c_429 real,
    c_430 bigint,
    c_431 real,
    c_432 real,
    c_433 real,
    c_434 bigint,
    c_435 real
);

INSERT INTO t_16_gen
SELECT ts, 1,2,3,4,5,6,7,8,9,10,11,12,13,14
FROM (SELECT generate_series('2019-11-26'::timestamp, '2019-11-27', '1 millisecond') as ts) as gs;
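
As a quick sanity check on the generated data: with one row per millisecond, each 15-second test window should contain about 15,000 rows.

-- count the rows in the first test range
SELECT count(*)
FROM t_16_gen
WHERE abs_date_time >= '2019-11-26 12:00:00'
  AND abs_date_time <  '2019-11-26 12:00:15';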

Comments

  • Are ranges always 10 minutes long? – SQLRaptor Nov 27 '19 at 18:12
  • @SQLRaptor unfortunately not: the start/ends can be arbitrary – TmTron Nov 27 '19 at 18:13
  • When I implement my own t_16, it uses the index as expected. Please include a full test case, showing us the table creation, population, index creation, etc. (and which still reproduces the issue). – jjanes Nov 27 '19 at 18:16
  • In that case, be sure to check 'fast' vs 'slow' for longer ranges as well... what works well for 10 minutes might prove to be a catastrophe for 10 days. 2. Why do you have 2 identical range rows in the VALUES clause? Why do you cross join the filter values instead of parameters? Did you try without the OR, just `SELECT t_16.* FROM t_16 WHERE (abs_date_time >= '2019-11-26 12:00:00' AND abs_date_time < '2019-11-26 12:10:00')`? – SQLRaptor Nov 27 '19 at 18:46
  • .. and please start by disclosing the Postgres version in use. Always. – Erwin Brandstetter Nov 27 '19 at 20:23
  • @SQLRaptor Thanks for the good points. 2. was by accident, I've updated the queries. "Why do you cross join the filter values instead of parameters?": because this query is part of a larger query in production, so the time ranges are not a direct input from the application but an output from a former UDF. – TmTron Nov 28 '19 at 07:35
  • @jjanes Good point. I've added the statements to create a similar table and add dummy data. – TmTron Nov 28 '19 at 08:14
  • With version 11, it only does the seq scan if t_16_gen has never been vacuum analyzed. Once it has been analyzed (and so knows that there is a strong correlation between the table order and the primary key order), it strongly prefers the plan you want. – jjanes Nov 30 '19 at 19:35
  • Your added demo table does not completely fit the rest of the info. 10 min intervals would capture 600,000 rows, while you stated that each duration is guaranteed to have less than 1500 rows. – Erwin Brandstetter Dec 15 '19 at 03:31
  • You are right. I've added the test table in an update and forgot to change the related queries. Now I've updated all the queries and explain plans in the question to be consistent. – TmTron Dec 15 '19 at 09:25