We have a large query that is extra slow when customers "run it the first time, early in the morning..."
So, I found pg_prewarm that I would like to use load into PG's buffer cache a certain amount or most recently accessed rows (inserted, updated, or deleted) from several tables that are used in the said query.
Furthermore, I need to make sure that 'warm up' does not exceed PG's cache (shared_buffers setting I believe, or am I wrong?) To prewarm last 1000 pages of a single table I can do:
SELECT pg_prewarm(
'mytable',
-- "pre warm" last 1000 pages
first_block := (
SELECT pg_relation_size('mytable') / current_setting('block_size')::int4 - 1000
)
);
Question #1: Does this approach makes sense?
The trick is that pg_prewarm can only load a certain number of pages, so I need to calculate "how many live rows are in page for certain table"
-- show some settings
SELECT current_setting('block_size')::int4 AS page_size_bytes; -- 8192
SHOW shared_buffers; -- 512 MB
-- https://www.postgresql.org/docs/current/static/pgstattuple.html
--CREATE EXTENSION pgstattuple;
-- find out live row size and live rows per page
SELECT 'mytable'AS table_name, pg_size_pretty(tuple_len / tuple_count) AS live_row_size, 8192.00 / (tuple_len / tuple_count) AS live_rows_per_page, * FROM pgstattuple('public.mytable')
--"table_name","live_row_size","live_rows_per_page","table_len","tuple_count","tuple_len","tuple_percent","dead_tuple_count","dead_tuple_len","dead_tuple_percent","free_space","free_percent"
--"studies","1286 bytes",6.3701399688958009,652697600,462123,594431269,91.07,0,0,0,52329672,8.02
Question #2: Is my query above correct? Is this the proper way to get "live rows per page"? Question #3: the live_row_size I get from above query is not the same as the result mentioned in this answer (by Erwin). Am I doing something wrong?
Based on live_rows_per_page I can then modify pg_prewarm to load enough last XXXX pages that contain 10,000 rows (6.37 x 10,000)
SELECT pg_prewarm(
'mytable',
-- "pre warm" pages of the last 10,000 rows for 'mytable'
first_block := (
SELECT pg_relation_size('mytable') / current_setting('block_size')::int4 - 63700
)
);
Update #1
Regarding the Question #3, I get following when I run the query for mytable...
The output of pgstatuple is is different, probably because it is not listing the same items, but I am not sure...
"what","bytes/ct","bytes_pretty","bytes_per_row"
"core_relation_size",652697600,"622 MB",1412
"visibility_map",16384,"16 kB",0
"free_space_map",180224,"176 kB",0
"table_size_incl_toast",1101955072,"1051 MB",2384
"indexes_size",508289024,"485 MB",1099
"total_size_incl_toast_and_indexes",1610244096,"1536 MB",3484
"live_rows_in_text_representation",1138946462,"1086 MB",2464
"------------------------------",<NULL>,"<NULL>",<NULL>
"row_count",462123,"<NULL>",<NULL>
"live_tuples",0,"<NULL>",<NULL>
"dead_tuples",3,"<NULL>",<NULL>