I have some tables in my Postgres database, and I have set the tables to cluster about an index. But I forget when was the last time I ran cluster maintenance on these tables, and I do not want to waste my time to run cluster on them. How do I know if the tables are clustered?
3 Answers
You can query all the clustered tables from the data dictionary:
SELECT relname AS table_name
FROM pg_class c
JOIN pg_index i ON i.indrelid = c.oid
WHERE relkind = 'r' AND relhasindex AND i.indisclustered
- 4,024
-
1This doesn't seem to include primary keys that are clustered? – jontejj Feb 24 '15 at 09:24
-
1And it doesn't say which index was used for the clustering, is there a way to add that? – chrismarx Mar 29 '17 at 15:01
-
1As best I can tell this answer is very wrong in the context of the question. It only queries what indexes are used for clustering - it gives no information on when or even if such clustering was performed. The better answer is to look at stats as given in: https://stackoverflow.com/questions/53299733/how-to-tell-when-a-postgres-table-was-clustered-and-what-indexes-were-used – Alex Stoddard May 07 '20 at 15:16
-
Based on the comments, which I have not verified, it seems this answer does not apply to all situations. So I removed the answer mark. – David S. May 07 '20 at 23:25
The accepted answer appears to be incorrect.
It seems that when (or even if) clustering was performed is not available from the query but stats can be queried to see how good clustering is: https://stackoverflow.com/questions/53299733/how-to-tell-when-a-postgres-table-was-clustered-and-what-indexes-were-used
- 121
Building off @alex-stoddard's answer and the StackOverflow question he linked, I was able to construct this query, which seems to show all index-table pairs used for clustering:
SELECT
indisclustered,
index_name,
cls.relname AS table_name
FROM
pg_class cls
INNER JOIN (
SELECT
indexrelid,
indrelid,
indisclustered,
relname AS index_name
FROM pg_index ndx
INNER JOIN pg_class cls
ON cls.oid=indexrelid
WHERE indisclustered='t'
) tmp
ON cls.oid=tmp.indrelid
;
Note that clustering is per-index, not per-column (although indices are often based on columns), but that performance stats are collected by column. If you also want to get some per-column stats, you can try:
SELECT
tablename,
attname,
correlation,
index_name,
indexdef
FROM (
SELECT
indisclustered,
index_name,
table_name,
indexdef
FROM (
SELECT
indisclustered,
index_name,
cls.relname AS table_name
FROM
pg_class cls
INNER JOIN (
SELECT
indexrelid,
indrelid,
indisclustered,
relname AS index_name
FROM pg_index ndx
INNER JOIN pg_class cls
ON cls.oid=indexrelid
WHERE indisclustered='t'
) tmp
ON cls.oid=tmp.indrelid
) clustering
INNER JOIN pg_indexes indxs
ON indxs.tablename=clustering.table_name AND
indxs.indexname=clustering.index_name
) indexes
INNER JOIN
pg_stats
ON indexes.table_name=pg_stats.tablename AND indexes.indexdef LIKE CONCAT('%', pg_stats.attname, '%')
;
- 111