
We're using Postgres 9.2 on Windows to store low-frequency timeseries data: we're inserting around 2,000 rows per second, 24 hours a day, 7 days a week, with no downtime. A DELETE runs on the table every 10 minutes or so to keep its length to a fixed number of days, which works out to a fairly stable 900 million rows. (For those interested: SELECT, INSERT, and DELETE are all performant.)
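For concreteness, the retention DELETE is roughly of this shape. This is only a sketch: the table name matches the stats output further down, but the ts column name and the 5-day cutoff are assumptions, not our real schema.

    -- Illustrative retention job, run every ~10 minutes.
    -- 'ts' and the 5-day window are assumed names/values.
    DELETE FROM stats.statistic_values_by_sec
    WHERE ts < now() - interval '5 days';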

As such, the DELETE, whilst it removes rows, is not freeing up disc space. For that we need VACUUM to run.

I've queried pg_stat_user_tables, and VACUUM appears never to have run.
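The check was along these lines (a minimal sketch; all of these columns are standard in pg_stat_user_tables):

    SELECT relname, last_vacuum, last_autovacuum,
           vacuum_count, autovacuum_count, n_dead_tup
    FROM pg_stat_user_tables
    WHERE relname = 'statistic_values_by_sec';

Both timestamps come back NULL and both counters are zero (full output below).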

What I understand from various docs (http://www.postgresql.org/docs/9.2/static/routine-vacuuming.html):

  • we appear to have autovacuum on, and it is running on other tables (one way to confirm the global switch is sketched just below this list).
  • autovacuum doesn't run FULL, and shouldn't require an exclusive lock on the table.
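For the first point, a minimal settings check of the kind I mean (pg_settings is the standard system view; autovacuum also requires track_counts to be on):

    SELECT name, setting
    FROM pg_settings
    WHERE name IN ('autovacuum', 'track_counts');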

Does anyone have any thoughts on why autovacuum is not running? Is this purely because the table is continuously busy?

And is it worthwhile running VACUUM after every DELETE in this case (i.e. every 10 minutes)?

Edit:

Query using the SQL from the SO link in the comments below:

-[ RECORD 2 ]---+---------------------------
schemaname      | stats
relname         | statistic_values_by_sec
last_vacuum     |
last_autovacuum |
n_tup           |    932,315,264
dead_tup        |    940,727,818
av_threshold    |    186,463,103
expect_av       | *

and raw output:

-[ RECORD 3 ]-----+---------------------------
relid             | 501908
schemaname        | stats
relname           | statistic_values_by_sec
seq_scan          | 12
seq_tup_read      | 4526762064
idx_scan          | 29643
idx_tup_fetch     | 2544206912
n_tup_ins         | 1573896877
n_tup_upd         | 0
n_tup_del         | 941176496
n_tup_hot_upd     | 0
n_live_tup        | 688858417
n_dead_tup        | 940727818
last_vacuum       |
last_autovacuum   |
last_analyze      |
last_autoanalyze  | 2014-08-09 01:36:21.703+01
vacuum_count      | 0
autovacuum_count  | 0
analyze_count     | 0
autoanalyze_count | 69
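For reference, the av_threshold and expect_av columns in the summary above follow the standard autovacuum trigger rule: a table is queued for vacuuming once n_dead_tup exceeds autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. The SO query itself isn't reproduced here, but a minimal sketch of the same idea, with the default settings (50 and 0.2) hard-coded, looks like this:

    SELECT psut.schemaname,
           psut.relname,
           psut.last_vacuum,
           psut.last_autovacuum,
           c.reltuples::bigint                 AS n_tup,
           psut.n_dead_tup                     AS dead_tup,
           (50 + 0.2 * c.reltuples)::bigint    AS av_threshold,
           CASE WHEN psut.n_dead_tup > 50 + 0.2 * c.reltuples
                THEN '*' ELSE '' END           AS expect_av
    FROM pg_stat_user_tables psut
    JOIN pg_class c ON c.oid = psut.relid
    ORDER BY psut.n_dead_tup DESC;

At ~940 million dead tuples against a threshold of ~186 million, the table is about five times past the trigger point.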
Barry
  • See Aggressive Autovacuum on PostgreSQL. Also it would be interesting to have select * from pg_stat_user_tables for this table (use \x in psql for a nicely formatted output). – Daniel Vérité Aug 14 '14 at 09:00
  • That link is helpful, and perhaps answers the question: the table is too busy for autovacuum to work. @DanielVérité I've updated the question with the output you asked for. – Barry Aug 14 '14 at 12:04
  • That's a lot of dead tuples! If possible, consider partitioning the table by timestamp and dropping the old partitions instead of deleting. The major caveat is that unique indexes across partitions are not supported. – Daniel Vérité Aug 14 '14 at 12:16
  • Does the log file have messages about autovacuum on this table being cancelled? – jjanes Aug 19 '14 at 15:29
  • @jjanes No - there was no indication in the logs that autovacuum ever started. – Barry Aug 20 '14 at 07:18
  • Alas, there is never a log indication when autovacuum starts. You get a log entry only when it finishes (either successfully or unsuccessfully, assuming the failure happens in a way such that the system is still alive to record the lack of success). This can be very frustrating for very long vacuum processes. You can look in pg_stat_activity to see if any vacuum is currently going on, and if so, on which tables (a query of that shape is sketched after these comments). – jjanes Aug 20 '14 at 07:29
  • @jjanes Just re-read your comment - no log of it being cancelled or finished either. I think the table was just too busy for the server to schedule it. – Barry Aug 20 '14 at 12:13
  • Simply being busy should not prevent the server from scheduling the table to be vacuumed, particularly if other tables are being vacuumed. Are you sure someone didn't turn off autovacuum specifically for that table? You can try \d+ statistic_values_by_sec in psql to see if it shows Options: autovacuum_enabled=off. – jjanes Aug 20 '14 at 20:51
  • @jjanes \d+ doesn't show any Options on that table. We have now replaced the one large table with lots of partitioned tables which has improved the performance, as well as eliminated the dead tuple problem (we just truncate the oldest table). – Barry Aug 21 '14 at 07:31
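For anyone repeating jjanes's pg_stat_activity check above, a minimal sketch using the 9.2 column names (autovacuum workers report their activity with a query text beginning autovacuum: VACUUM):

    SELECT pid, query_start, query
    FROM pg_stat_activity
    WHERE query LIKE 'autovacuum:%';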

1 Answer


I would look into partitioning. If you partition by day, you can just drop an entire partition once it gets too old; you may not even need to VACUUM any more.

Also, overall performance might improve, since you're no longer inserting into the same heap you're deleting from. You would just need to write the code to create new partitions and drop old ones.

This is exactly what partitioning is for.
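On 9.2 this means inheritance-based partitioning (declarative partitioning only arrived in Postgres 10). A minimal sketch, assuming a hypothetical timestamp column ts and one child table per day:

    -- The parent stays empty; each day gets its own child table.
    -- 'ts' is an assumed column name.
    CREATE TABLE stats.statistic_values_by_sec_2014_08_14 (
        CHECK (ts >= DATE '2014-08-14' AND ts < DATE '2014-08-15')
    ) INHERITS (stats.statistic_values_by_sec);

    -- Retiring old data becomes a near-instant metadata operation
    -- that leaves no dead tuples behind:
    DROP TABLE stats.statistic_values_by_sec_2014_08_07;

Queries against the parent still see all children, and with constraint_exclusion = partition (the default) the planner skips children whose CHECK constraints rule them out.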

SQB