
In some circumstances, I've been told not to run a VACUUM FULL (or CLUSTER) on a table that is in production, because it holds an exclusive lock on the table for longer than is acceptable. The same may apply to some ALTER TABLE operations (such as changing the type of several columns).

The alternative presented is always to do something like:

 CREATE TABLE new_table AS SELECT * FROM old_table ;
 -- recreate all indices and constraints
 ALTER TABLE old_table RENAME TO going_to_drop_table ;
 ALTER TABLE new_table RENAME TO old_table ;
 DROP TABLE going_to_drop_table ;

This would work in a scenario where there are no dependencies on old_table (meaning no views depend on it, no foreign key constraints reference it, no functions use it, etc.), and where old_table does not receive any inserts or updates during the swap. But that would be the exception, not the rule, in most databases.
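For reference, here is a minimal sketch of that swap done atomically, assuming a brief exclusive lock is acceptable (the index and column names are illustrative only):

```sql
BEGIN;
-- Block concurrent writes so no rows are missed between the copy and the swap
LOCK TABLE old_table IN ACCESS EXCLUSIVE MODE;

CREATE TABLE new_table AS SELECT * FROM old_table;
-- CREATE TABLE ... AS copies only the data: indexes, constraints,
-- defaults and triggers must all be recreated by hand
CREATE INDEX new_table_some_col_idx ON new_table (some_col);

ALTER TABLE old_table RENAME TO going_to_drop_table;
ALTER TABLE new_table RENAME TO old_table;
DROP TABLE going_to_drop_table;
COMMIT;
```

Even wrapped in a transaction like this, views and foreign keys that referenced old_table still point at the renamed (and then dropped) table, which is exactly the problem described above.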

Is there any way to make such a 'table swap' without losing dependencies?

[For the sake of completeness: I'm especially interested in how to do it for PostgreSQL 9.5 or 9.6]


Research so far (on the underlying problem):

  1. Is it possible to run VACUUM FULL in Postgres asynchronously? => pg_repack. Caveats: might not be easy to set up on Windows, and not tested with PostgreSQL 9.6. Looks like the most promising option.
  2. pg_reorg: similar to pg_repack (it was the basis for it) => seems not to have been updated since PostgreSQL 9.4, and it's mostly superseded by pg_repack.
  3. Old bash-based tool from OmniTI, referenced by Reduce Bloat of table without long/exclusive Locks => looks outdated.
  4. Cleaning Up PostgreSQL Bloat => (for what I'm interested in) basically recommends running vacuumdb with --jobs=njobs. Looks like a way to speed up vacuums, and an actual possibility. I don't like the sentence from the documentation that states "Note that using this mode together with the -f (FULL) option might cause deadlock failures if certain system catalogs are processed in parallel."
  5. pgcompact from pgtoolkit => don't know whether this is up to date. It's based on Reducing bloat without locking, which looks safe.
  6. Safe Operations For High Volume PostgreSQL: a good collection of workarounds to minimize table locking. Thanks to @hruske for the comment.
  6. Safe Operations For High Volume PostgreSQL: A good collection of workarounds to minimize table locking. Thanks to @hruske for the comment.
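For options 1 and 4 above, the invocations would look roughly like this (the database and table names are placeholders, and pg_repack must first be installed as an extension in the target database):

```shell
# Option 4: parallel plain VACUUM -- speeds up regular vacuuming,
# but does not return space to the OS the way VACUUM FULL does
vacuumdb --jobs=4 --analyze mydb

# Option 1: pg_repack rebuilds the table online, holding an
# exclusive lock only briefly at the start and at the end
pg_repack --table=old_table mydb
```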

What is the best way to avoid the exclusive locks associated with ALTER TABLE ALTER COLUMN ..., CLUSTER or VACUUM FULL?

joanolo
  • I have used pg_repack in the past successfully, but not on pg 9.6 yet. I expect it to work fine, though. Some bad news for the "alternative": 1. You need a write lock anyway, or you might miss writes to the old table while in transition. 2. While the time frame is tiny, there is still a chance for queries to get lost in the transition. As soon as queries are parsed and planned, table names are resolved to OIDs, so concurrent transactions might wait for old_table to be unlocked, only to find it deleted and raise an exception. – Erwin Brandstetter Jan 07 '17 at 16:14
  • pg_repack is the tool giving me "the best feeling" so far. But it cannot do anything when I need to ALTER TABLE... – joanolo Jan 07 '17 at 16:19
  • The title is somewhat misleading, as you didn't say what you are trying to achieve by replacing tables - just reducing table bloat, or is there another use case? – hruske Jan 07 '17 at 21:36
  • @hruske: Reducing table bloat basically means trying to return to the OS all unused space in a table (see What is table bloating in databases). This happens during normal operation, especially after massive updates. But you also get exclusive table locks if you have a column c INTEGER and try to modify it with ALTER TABLE t ALTER COLUMN c TYPE BIGINT;. If you need to make such a modification on a (big) production database... you might face difficulties. – joanolo Jan 07 '17 at 21:40
  • I'm aware of what table bloat is; it's just that you are asking for a tool (how to replace a table), not for a goal (how to reduce table bloat). Here's a list of hints on how to do operations on bigger tables -- https://www.braintreepayments.com/blog/safe-operations-for-high-volume-postgresql/ – hruske Jan 08 '17 at 10:08
  • @hruske: very illustrative, thanks. I'm actually asking: "how to avoid or minimize (long) access exclusive locks on a table", and those mostly happen when you try to reduce table bloat, or when you alter your table. In retrospect, I might have to change the title of the question... – joanolo Jan 08 '17 at 11:56
  • I think it would be better to ask: why do you have that table bloat? Do you have long-running transactions that prevent (auto)vacuum from cleaning up? Or even many sessions that are "idle in transaction"? Very often the solution is to make autovacuum more aggressive so that it can keep up with the update rate. – Jan 08 '17 at 12:09
  • It doesn't happen often, and autovacuum normally does a very good job. But sometimes you need to alter a table and add a column, or set a default value for the NULLs in one column, or update one column based on a certain computation, or anything that massively updates your table... and you easily end up with a table twice its original size. If it is a table that will have lots of inserts later on, you might just decide that autovacuum is enough... but if that's not the case, it's a pity to waste that much disk space. – joanolo Jan 08 '17 at 12:13
  • @joanolo I think you should add the second part of your question as the answer, pointing to pg_repack as a proven method. With ALTER TABLE, you don't have much to improve other than keeping heavy locks for the shortest time possible. – András Váczi Apr 10 '19 at 13:49

0 Answers