11

I have a table which is storing reservation data using the columns starts_at & ends_at Whenever I'm querying the table to find overlapping reservations, I have an option of using one of the following queries:

SELECT * FROM reservations
WHERE starts_at < '2014-01-03 00:00:00'
AND   ends_at   >='2014-01-01 00:00:00';

Or

SELECT * FROM reservations
WHERE tsrange(starts_at, ends_at) && ('2014-01-01 00:00:00', '2014-01-03 00:00:00')

I have regular B-Tree indices on the starts_at and ends_at columns, therefore the first query is always using them. However, unless I define a functional GiST index on the tsrange, the second query does a complete scan.

create index tsrange_idx on reservations using gist(tsrange(starts_at, ends_at)); 

My question is, as the table grows, which index is going to be faster? Probably the answer is obvious looking at the query execution plan, but I'm not well-versed with reading EXPLAIN ANALYZE output.

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
Saurabh Nanda
  • 323
  • 1
  • 3
  • 15
  • http://dba.stackexchange.com/questions/39589/optimizing-queries-on-a-range-of-timestamps-two-columns To summarise both should work relatively well. It is hard to give a definitive answer without testing with real world data. – Zack Feb 18 '14 at 04:01
  • i am getting this error which creating index help is appreciated SQL Error [42883]: ERROR: function tsrange(timestamp with time zone, timestamp with time zone) does not exist Hint: No function matches the given name and argument types. You might need to add explicit type casts. Position: 52 – Rizwan Patel Jul 21 '22 at 16:52

1 Answers1

20

Timestamps with B-tree index

I suggest a third option: as long as your table holds two timestamp columns (which seem to be defined NOT NULL) I would use a single multicolumn B-tree index with opposed sort order (if no other considerations apply):

CREATE INDEX reservations_range_idx ON reservations (starts_at, ends_at DESC);

More in these related answer:

As for the query, have a look at the SQL-standard operator OVERLAPS:

SELECT * FROM reservations
WHERE (starts_at, ends_at) OVERLAPS ('2014-01-01 00:00:00', '2014-01-03 00:00:00');

More in this related question on SO:

Should be faster than two B-tree indices. Less disk space and cheaper to maintain. The burden on write operations is small.

Range type with GiST index

With big tables, a GiST index on a range type is most probably faster, because it scales better. Storage on disk is considerably bigger and index maintenance a bit more expensive, though.

If you go that route, it would be more efficient to store your timestamps as range (tsrange or tstzrange) to begin with. A plain GiST index without the functional aspect is a bit faster.

CREATE TABLE reservation (
  reservation_id serial PRIMARY KEY
, span tsrange
, ...
);

CREATE INDEX reservation_span_gist_idx on reservations USING GiST (span);

With the && "overlap" operator you already displayed in the question:

SELECT *
FROM   reservation 
WHERE  span && ('2014-01-01 00:00:00', '2014-01-03 00:00:00');

You might be interested in an exclusion constraint to rule out overlaps by design, which implements a GiST index like the one above automatically. There is a code example in the manual. This related answer on SO has more details:

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600