
I'm building an application that needs to be deployed several times, with a relatively well-organized data model. One table needs to be configured for each deployment with a small number of additional fields, different each time. Performance is not an issue, as we're talking about a few hundred rows and a dozen additional fields.

I'm inclined to keep a single, generic data model, and so I'm trying (maybe naively) to use something like an EAV model for that table, sketched below:

  • user: table with the common fields
  • user_data: user_id, field_id, field_value
  • field: id, field_name, field_type
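
For reference, a minimal sketch of that schema (the column types and constraints are my assumptions, nothing is fixed yet):

CREATE TABLE "user" (
  id   serial PRIMARY KEY,
  name text                      -- ... plus the other common fields
);

CREATE TABLE field (
  id         serial PRIMARY KEY,
  field_name text NOT NULL UNIQUE,
  field_type text NOT NULL       -- e.g. 'int', 'text', 'date'
);

CREATE TABLE user_data (
  user_id     int NOT NULL REFERENCES "user"(id),
  field_id    int NOT NULL REFERENCES field(id),
  field_value text,
  PRIMARY KEY (user_id, field_id)
);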

My intention was then to crosstab the user_data table to have a table such as user_id, field1, field2, field3, ... that I could then join to my user table.

However, crosstab requires a column definition list (or a custom composite type) for the output of the pivot, with all the field names and types that will end up as columns. I have all that information stored in the field table.
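
For illustration, this is what a hard-coded call would look like, assuming (just for the example) two configured fields "age" (int) and "city" (text); crosstab() comes from the tablefunc extension:

CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT *
FROM crosstab(
       'SELECT user_id, field_name, field_value
          FROM user_data a JOIN field b ON a.field_id = b.id
         ORDER BY 1',
       'SELECT field_name FROM field ORDER BY 1'
     ) AS ct(user_id int, age int, city text);  -- this column list is what I'd like to generate dynamically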

My question is then: can I create a custom type dynamically in PostgreSQL, from the information I store in my field table?

  • What operations do you need with these values, besides storing and SELECTing them? JOIN ON? ORDER BY? WHERE? GROUP BY? SUM/CONCAT? – Akina Jul 24 '19 at 17:25
  • Mostly WHERE and GROUP BY. – phme Jul 24 '19 at 18:05
  • Do you need crosstab exactly? Maybe combining the fields into a JSON object with a field:value list for each user would work? It can easily be done with a JSON object creation aggregate function, and this variant does not need the list of properties. – Akina Jul 24 '19 at 18:23
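
For what it's worth, a sketch of that JSON variant, using the table names from the question and jsonb_object_agg() (available since PostgreSQL 9.5); no column list is needed at all:

SELECT u.*, d.extra
FROM "user" u
LEFT JOIN (
   SELECT ud.user_id, jsonb_object_agg(f.field_name, ud.field_value) AS extra
   FROM   user_data ud
   JOIN   field f ON f.id = ud.field_id
   GROUP  BY ud.user_id
) d ON d.user_id = u.id;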

2 Answers


Use identical table definitions with a single document-type column of type json, jsonb or hstore to store additional key/value pairs for the few columns that differ between installations. That's often the most efficient solution when it needs to be dynamic / flexible.
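
A sketch of that approach with jsonb (the extra column and the keys "age" and "city" are made up for illustration); the dynamic fields can still be used in WHERE and GROUP BY:

CREATE TABLE "user" (
  id    serial PRIMARY KEY,
  name  text,                 -- common fields ...
  extra jsonb                 -- deployment-specific key/value pairs
);

SELECT extra->>'city' AS city, count(*)
FROM   "user"
WHERE  (extra->>'age')::int >= 18
GROUP  BY 1;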

If the total number of dynamic columns is not more than a few dozen, you might even just implement an identical superset of columns for every installation and only fill the ones in use with actual values. The rest can stay NULL. NULL storage is cheap in Postgres, basically 1 bit per field.


Erwin Brandstetter
  • Interesting idea, thanks, but for now, I managed with dynamic SQL. Not sure it's the most practical way in the long run, but it works for now. – phme Jul 25 '19 at 04:29

Certainly not the most elegant way, but it works with dynamic SQL:

DO $$
DECLARE
  -- source query for crosstab(): one row per (row_name, category, value)
  sql1 text := '''SELECT user_id AS row_name, field_name AS category, field_value AS value 
    FROM user_data a JOIN field b ON a.field_id = b.id ORDER BY 1''';
  -- category query: one row per output column, in the same order as the column list below
  sql2 text := '''SELECT field_name FROM field ORDER BY 1''';
  sql3 text;
  sql4 text;
BEGIN
  -- build the column definition list ("field1" type1, "field2" type2, ...) from the field table
  SELECT string_agg(CONCAT('"', field_name, '" ', field_type), ', ' ORDER BY field_name)
  INTO sql3
  FROM field;
  -- assemble and run the CREATE TABLE ... AS SELECT * FROM crosstab(...) statement
  -- (crosstab() requires the tablefunc extension)
  sql4 := CONCAT('CREATE TEMP TABLE tmp_data AS SELECT * FROM crosstab(', 
                 sql1, ', ', sql2, ') AS ct(row_name int, ', sql3, ');');
  EXECUTE sql4;
END $$;
SELECT * FROM tmp_data;

All fields come out properly typed, and it's reasonably fast with the small amount of data I have (fewer than 10k rows to pivot, not expected to grow fast).

phme