I've got an events table with over 40 million rows. It has a btree index on type: "index_events_on_type" btree (type).
Any equality look up takes ages! I tried tuning up the work_mem to get rid of the lossy value but it had little to no effect.
dev=# EXPLAIN (ANALYZE, BUFFERS) SELECT "events".* FROM "events" WHERE "events"."type" = 'contact.card.added';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on events (cost=56176.20..1633783.13 rows=1614405 width=296) (actual time=252.846..13832.331 rows=1550851 loops=1)
Recheck Cond: ((type)::text = 'contact.card.added'::text)
Rows Removed by Index Recheck: 10414231
Heap Blocks: exact=39335 lossy=758338
Buffers: shared read=807020
-> Bitmap Index Scan on index_events_on_type (cost=0.00..55772.60 rows=1614405 width=0) (actual time=243.096..243.096 rows=1550851 loops=1)
Index Cond: ((type)::text = 'contact.card.added'::text)
Buffers: shared read=9347
Planning time: 0.100 ms
Execution time: 13924.612 ms
(10 rows)
...and the table
dev=# \d events
Table "public.events"
Column | Type | Collation | Nullable | Default
-------------------+-----------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('events_id_seq'::regclass)
type | character varying | | not null |
aggregate_root_id | uuid | | not null |
entity_id | uuid | | |
parent_id | bigint | | |
created_at | timestamp without time zone | | not null |
body | jsonb | | | '{}'::jsonb
Indexes:
"events_pkey" PRIMARY KEY, btree (id)
"index_events_on_aggregate_root_id" btree (aggregate_root_id)
"index_events_on_created_at" btree (created_at)
"index_events_on_entity_id" btree (entity_id)
"index_events_on_parent_id" btree (parent_id)
"index_events_on_type" btree (type)
Foreign-key constraints:
"fk_rails_68f023eb25" FOREIGN KEY (parent_id) REFERENCES events(id)
Referenced by:
TABLE "events" CONSTRAINT "fk_rails_68f023eb25" FOREIGN KEY (parent_id) REFERENCES events(id)
Any guidance would be appreciated.
Update
Looks like the most time spent is in I/O as the query returns 1.5M records:
dev=# EXPLAIN (ANALYZE, BUFFERS) SELECT "events"."type" FROM "events" WHERE "events"."type" = 'bank_account.payout.added';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on events (cost=6324.92..520827.82 rows=182239 width=32) (actual time=283.119..15719.744 rows=1550851 loops=1)
Recheck Cond: ((type)::text = 'contact.card.added'::text)
Rows Removed by Index Recheck: 10414212
Heap Blocks: exact=39336 lossy=758337
Buffers: shared hit=75 read=806945
I/O Timings: read=13783.176
-> Bitmap Index Scan on index_events_on_type (cost=0.00..6279.36 rows=182239 width=0) (actual time=272.302..272.302 rows=1550851 loops=1)
Index Cond: ((type)::text = 'contact.card.added'::text)
Buffers: shared hit=75 read=9272
I/O Timings: read=67.302
Planning time: 0.069 ms
Execution time: 15807.701 ms
(12 rows)
Rows returned:
dev=# SELECT COUNT(*) FROM "events" WHERE "events"."type" = 'contact.card.added';
count
---------
1550851
(1 row)
Total rows:
dev=# SELECT COUNT(*) FROM "events"
dev-# ;
count
----------
36447779
(1 row)
work_mem? – Apr 16 '20 at 07:021GB– EasyCo Apr 16 '20 at 07:04show work_mem;value? – EasyCo Apr 16 '20 at 07:04Recheck Condbut without stepsRows Removed by Index RecheckandHeap Blocks: exact=xxx lossy=yyy. – pifor Apr 16 '20 at 07:13Index Cond: ((type)::text = 'contact.card.added'::text): I just don't understand this because the btree can easily return key is equal or key is different. – pifor Apr 16 '20 at 07:20set random_page_cost = 1.0or maybe turning off the bitmap index scan usingset enable_bitmapscan = false– Apr 16 '20 at 07:25create index on events(id) where type = 'contact.card.added';– Apr 16 '20 at 07:27set track_io_timing = onfirst. – jjanes Apr 16 '20 at 14:22