2

I received a list of column names from another team. To ensure all field column names are valid inside a specific schema, I put them into a temporary table. I'm now wanting to see if they exist. When I try to join with information_schema.columns, I received errors below:

ERROR: 0A000: Specified types or functions (one per INFO message) not supported on Redshift tables.
Column "c.column_name" has unsupported type     "information_schema.sql_identifier".
Column "a.*" has unsupported type "pg_attribute".
Column "t.*" has unsupported type "pg_type".
Function "format_type(oid,integer)" not supported.
Function "format_type(oid,integer)" not supported.
Function "has_table_privilege(oid,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.

I tried to cast column_name into different types, but had no luck. Would anyone please advise what's wrong and how I can achieve this goal?

Evan Carroll
  • 63,051
  • 46
  • 242
  • 479
Lee
  • 205
  • 1
  • 3
  • 12
  • You're defining column name validity by not being anywhere in your schema? – Evan Carroll Jan 15 '18 at 05:10
  • Yes, to avoid typo. – Lee Jan 15 '18 at 05:12
  • I'm fairly certain I understand what's going on here so I cut out the information that I thought was unnecessary since I believe this is otherwise a really good question and likely to be pretty hot. – Evan Carroll Jan 15 '18 at 06:12
  • @lee you should update your question title to "How do I query INFORMATION_SCHEMA.COLUMNS on Amazon Redshift?" - Note that RDS <> Redshift. – Jon Scott Jan 23 '18 at 12:01

1 Answers1

5

PG_TABLE_DEF

Amazon considers the internal functions that INFORMATION_SCHEMA.COLUMNS is using Leader-Node Only functions. Rather than being sensible and redefining the standardized INFORMATION_SCHEMA.COLUMNS, Amazon sought to define their own proprietary version. For that they made available another function PG_TABLE_DEF which seems to address the same need. Pay attention to the note in the center about adding the schema to search_path.

Stores information about table columns.

PG_TABLE_DEF only returns information about tables that are visible to the user. If PG_TABLE_DEF does not return the expected results, verify that the search_path parameter is set correctly to include the relevant schemas.

You can use SVV_TABLE_INFO to view more comprehensive information about a table, including data distribution skew, key distribution skew, table size, and statistics.

So using your example code (rewritten to use NOT EXISTS for clarity),

SET SEARCH_PATH to '$user', 'public', 'target_schema';

SELECT "column" 
FROM dev.fields f
WHERE NOT EXISTS (
  SELECT 1
  FROM PG_TABLE_DEF pgtd
  WHERE pgtd.column = f.field
  AND schemaname = 'target_schema'
);

See also,

Evan Carroll
  • 63,051
  • 46
  • 242
  • 479
  • Evan, thanks a lot for the detailed explanation and that really works for me! On a side note, why do you think NOT EXISTS is better than LEFT JOIN? I worked most with mssql and oracle where "NOT" is generally hated, so am trying to appreciate this change. – Lee Jan 15 '18 at 06:17
  • @Lee I think you're confusing NOT IN with NOT EXISTS. People tend to avoid NOT IN because it behaves in unexpected ways when NULLs are present. – Erik Darling Jan 15 '18 at 13:13
  • @Lee NOT EXISTS is more explicit semantically and the binding is closer. That WHERE col IS NULL will work its way quite a bit when you add in a lot of JOINS. That means you have to maintain two discrete parts of your query to get the same functionality. – Evan Carroll Jan 15 '18 at 16:18
  • @sp_BlitzErik, you are so right. I typed NOT EXISTS but was actually thinking of NOT IN. – Lee Jan 16 '18 at 22:34
  • 1
    @EvanCarroll, thanks again for your great explanation. – Lee Jan 16 '18 at 22:34
  • @Lee see this for more information https://dba.stackexchange.com/a/195447/2639 – Evan Carroll Jan 16 '18 at 22:36