6

My database is suddenly hanging on some queries that join many tables and I see:

temporary file size exceeds temp_file_limit (1025563kB)

A sample query and query plan can be seen here:

https://pastebin.com/wug86Wrr

http://www.sharecsv.com/s/a6333a7f610e049dd81ebcfc19a4c02f/temp_file_limit_exceeded.csv

This query normally takes less than 100 ms, but it hangs when it hits the temporary file size limit.

When I run:

SELECT temp_files AS "Temporary files", temp_bytes AS "Size of temporary files"
FROM pg_stat_database db;

I see:

Temporary files   Size of temporary files
---------------   -----------------------
22550             10651900248
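
(For reference: pg_stat_database has one row per database, so the query below is just the one above with a WHERE clause added to narrow the counters down to the current database.)

SELECT temp_files AS "Temporary files", temp_bytes AS "Size of temporary files"
FROM pg_stat_database
WHERE datname = current_database();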

How do I resolve this?


I was able to narrow down the problem further. Say a user lists that they can speak 2 languages - the query runs fine. Then they edit their profile to say they know 20 languages. The query ends up exceeding the temp_file_limit and hanging.

atkayla

2 Answers

4

You are joining around 40 tables to retrieve around 200 columns. I did not dig in but, typically, such a query is a misunderstanding. Either way, no wonder you amass a lot of temporary files.

Also note that all temp files of the same session count against the limit. Bound cursors can become a problem that way. Or multiple large queries in the same transaction.

By default the parameter temp_file_limit is set to -1, which means "no limit". So either sanitize your query or remove or extend the limit. (Only superusers can do that.)
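
For example, a rough sketch of how to inspect and raise the limit on a self-managed instance (the 20GB value is only a placeholder; on managed platforms such as Google Cloud SQL you typically have to change it through the provider's flag settings instead of ALTER SYSTEM):

-- current limit; shown in kB if no unit is configured
SHOW temp_file_limit;

-- as a superuser: raise the limit cluster-wide, or use -1 to disable it
ALTER SYSTEM SET temp_file_limit = '20GB';
SELECT pg_reload_conf();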

If you cannot do either, setting a higher work_mem may help keep more operations in RAM, so less temp space is needed.
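
A quick way to test that is per session, before touching the global setting - a sketch, where 64MB is just an illustrative value and the final SELECT stands in for your big join:

-- larger per-sort / per-hash memory budget, for this session only
SET work_mem = '64MB';

-- re-run the problem query and check whether sort nodes still report
-- "external merge  Disk: ..." instead of staying in memory
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;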

Don't confuse this setting with temp_buffers, which is mostly unrelated: it sets the amount of RAM that can be used for temporary objects (like in CREATE TEMP TABLE ...).
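
A minimal illustration of what temp_buffers does cover (note it can only be changed before the first use of temporary tables in a session; the table names are placeholders):

-- per-session RAM for temporary tables only
SET temp_buffers = '64MB';
CREATE TEMP TABLE profile_tmp AS SELECT * FROM some_source_table;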

And you did not bother to mention your version of Postgres. Maybe outdated?

Erwin Brandstetter
  • You are joining around 40 tables to retrieve around 200 columns. I did not dig in but, typically, such a query is a misunderstanding. Yes, I am! The use case is a detailed user profile, where most fields are many-to-many relations to lookup tables. I am not sure how to optimize that query, so I will create a separate question for that if that's the issue. But increasing my temp_file_limit helped (on Google Cloud SQL it was capped). work_mem was at 4GB and I increased to 64GB, but didn't notice much difference here. Google Cloud SQL is on Postgres 9.6. – atkayla Mar 24 '19 at 14:00
  • https://dba.stackexchange.com/questions/232952/joining-around-40-normalized-tables-for-200-columns-to-display-detailed-user-pro – atkayla Mar 24 '19 at 14:42
  • Setting work_mem too high can have adverse effects. It very much depends on available RAM, number of concurrent sessions and the complete workload, but 4 GB is already an exceptionally high setting and 64 GB seems way out of proportion except for specialized warehouse systems. – Erwin Brandstetter Mar 24 '19 at 23:04
  • Apologies! Meant 4MB -> 16MB -> 64MB work_mem on a 3.75GB RAM database. :P – atkayla Mar 24 '19 at 23:23
  • OK; that makes more sense. 4MB is almost always too low. But OTOH, 3.75GB RAM is a very smallish setup. My humble smartphone has more RAM ... – Erwin Brandstetter Mar 24 '19 at 23:56
  • This temp_file_limit showed me that the real problem might just be that my schema is poor, specifically all of those surrogate-key joins to lookup tables, when I've always heard that surrogate keys are recommended and that relational databases are good at joins. I'm in the process of refactoring to remove those surrogate keys and instead use natural keys as per https://dba.stackexchange.com/a/232955/133732, therefore eliminating those 40 joins. I really hope that will help this situation. – atkayla Mar 25 '19 at 00:19
  • I doubt that surrogate keys are the problem. RDBMS are good at joins. Even a hundred joins to small lookup tables wouldn't cause the problem you reported. There must be more to it. You may be barking up the wrong tree. The decision should be based on reason. Do some testing before you start a big migration. Both natural and surrogate keys have some advantages and some disadvantages. There are enum types, too - sometimes a reasonable middle ground. And consider upgrading to Postgres 11 if possible. – Erwin Brandstetter Mar 25 '19 at 01:44
  • I was afraid to hear that. My schema is pretty simple like you stated, just joins to a bunch of small lookup tables to build a full profile. Yes, I would imagine PostgreSQL should be able to handle this. Google Cloud SQL is stuck at 9.6, so I just spun up PostgreSQL 11 on aiven.io to test that out. Do you know of any particular changes that might affect this and performance from 9.6 to 11, or are you saying it's just good in general to upgrade? Anyway, I will test out PostgreSQL 11 and the natural keys to see if there's any significant difference. I appreciate it. – atkayla Mar 25 '19 at 02:06
  • Nothing in particular, and I have only superficially looked at your big query. There have been many improvements, especially for big data. – Erwin Brandstetter Mar 25 '19 at 02:32
0

Maybe the statistics are out of date, causing the same query to now run with a much worse query plan?

https://stackoverflow.com/questions/6903938/how-do-i-know-if-the-statistics-of-a-postgres-table-are-up-to-date
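
For instance, a sketch of how to check when the joined tables were last analyzed and how to refresh the statistics by hand (the table name is a placeholder):

-- when did (auto)analyze last touch each table?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY GREATEST(last_analyze, last_autoanalyze) NULLS FIRST;

-- refresh statistics for a single table (or run plain ANALYZE for the whole database)
ANALYZE some_lookup_table;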

Rekonn