
I am drawn to this design like a moth to a flame. I've read rumblings that cyclic foreign keys are a nightmare, and generally I can see why they should be avoided. In this particular case, however, I don't see why it would be so awful. I got some inspiration from this article, but combined the reverse-belongs-to and exclusive-belongs-to designs into one... monstrous creation?

Can you tell me why (or if) this is a bad idea? What headaches am I going to run into with this, and are they worth it?

Anyway, I want to create a polymorphic relationship between many entities and one table. I want to make something like this:

-- poly ids are uuids and we assume they will not collide, period
create domain poly_id uuid;
-- poly_id must be unique so poly_obj can reference it below
create table foo ([...table columns], poly_id poly_id unique not null default gen_random_uuid());
create table bar ([...table columns], poly_id poly_id unique not null default gen_random_uuid());
create table baz ([...table columns], poly_id poly_id unique not null default gen_random_uuid());

create type poly_t as enum ('foo', 'bar', 'baz');

create table poly_obj (
  -- poly_id is always the poly_id of the one set reference column;
  -- unique so the child tables can point back at it below
  poly_id poly_id not null unique
    generated always as (coalesce("foo", "bar", "baz")) stored,
  poly_t poly_t not null,
  "foo" poly_id null references foo (poly_id) check ("foo" is null or poly_t = 'foo'),
  "bar" poly_id null references bar (poly_id) check ("bar" is null or poly_t = 'bar'),
  "baz" poly_id null references baz (poly_id) check ("baz" is null or poly_t = 'baz'),
  -- exactly one fk to a child table can be set
  check (
    ("foo" is not null)::integer
    + ("bar" is not null)::integer
    + ("baz" is not null)::integer = 1
  )
);

create unique index on poly_obj ("foo") where "foo" is not null;
create unique index on poly_obj ("bar") where "bar" is not null;
create unique index on poly_obj ("baz") where "baz" is not null;

alter table foo add foreign key (poly_id) references poly_obj (poly_id);
alter table bar add foreign key (poly_id) references poly_obj (poly_id);
alter table baz add foreign key (poly_id) references poly_obj (poly_id);

-- more pseudocodey than the rest; in Postgres the trigger body has to live in
-- a trigger function, so one function and trigger per child table
-- (function names here are placeholders)
create function poly_obj_ins_foo() returns trigger language plpgsql as $$
begin
  insert into poly_obj (poly_t, "foo") values ('foo', new.poly_id);
  return new;
end $$;

create trigger ___ after insert on foo
  for each row execute function poly_obj_ins_foo();

-- ...and the analogous functions/triggers for bar and baz, each inserting
-- into its own reference column
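One headache I can already anticipate from the cycle: depending on trigger firing order, a plain insert into foo may fail the foreign key from foo to poly_obj before the after-insert trigger has created the matching poly_obj row. As far as I can tell, the usual way out is to make the back-references deferrable; a sketch (my assumption, not part of the design above):

alter table foo add foreign key (poly_id)
  references poly_obj (poly_id) deferrable initially deferred;
-- the check now runs at commit, by which point the trigger has
-- inserted the matching poly_obj row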

untitled90

1 Answer


For only a handful of types with a handful of specific columns each, here is my minimalist (and very efficient) approach:

CREATE TABLE poly (
  poly_id    serial PRIMARY KEY  -- or IDENTITY
, poly_type  "char" NOT NULL REFERENCES poly_type  -- alternative: CHECK constraint
, common_col text               --  optional common column
, a_col1     text
, a_col2     text
, b_col3     text
-- more ?
, CONSTRAINT poly_allowed_cols_type_a CHECK (poly_type = 'a' OR (a_col1, a_col2) IS NULL)
, CONSTRAINT poly_allowed_cols_type_b CHECK (poly_type = 'b' OR (b_col3) IS NULL)
-- more ?
);

No triggers, enums, or domains. Just the one table. Even the one FK constraint is optional, as we do not strictly need a separate table listing the allowed types.
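If you do drop the lookup table, the CHECK alternative mentioned in the code comment might look like this (a sketch; the set of allowed type codes is assumed):

-- sketch: inline CHECK instead of the FK to a poly_type lookup table
ALTER TABLE poly
  ADD CONSTRAINT poly_type_allowed CHECK (poly_type IN ('a', 'b'));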

You can add views to get a separate "table" for each type, for convenience.
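For instance, a sketch for type 'a', using the columns from the table above:

-- convenience "table" for rows of type 'a'; a simple view like this is
-- auto-updatable, and CHECK OPTION keeps writes through it to the right type
CREATE VIEW poly_a AS
SELECT poly_id, common_col, a_col1, a_col2
FROM   poly
WHERE  poly_type = 'a'
WITH CHECK OPTION;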

fiddle -- with a bit more flesh to play around with

The core point: storing null values is very cheap, just one bit in the null bitmap.

5 types with 5 specific columns each add 20 null values per row, which occupy 20 bits. That's less than a single integer column (4 bytes = 32 bits). More specifically, at this magnitude no storage is added at all: past the first 8 columns, the null bitmap grows in chunks of 8 bytes, and each chunk covers null values for the next 64 columns.
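You can verify this yourself by comparing tuple sizes across types (assuming the poly table above holds a few rows of each type):

-- pg_column_size() on the whole row roughly reflects on-disk tuple size;
-- null columns add nothing beyond the null bitmap
SELECT poly_type, pg_column_size(p.*) AS row_bytes
FROM   poly p;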

Obviously, the minimalist design doesn't scale well past a couple hundred columns. (And it becomes outright impossible at the theoretical maximum of 1600 columns for a single table.)

Erwin Brandstetter
  • Hey! This is cool and what a prolific contributor you are. I ended up finding and using a solution of yours just earlier today (https://dba.stackexchange.com/a/203986).

    My use case is wanting (ideally) 2-way referential integrity to refer to rows in N tables across my schemas. I've gotten an implementation working that's not too dissimilar to what I posted. My app's scale is decidedly medium for any anticipatable term, so I'm going for maximum semantic content in my schema rather than maximum performance. Does your design support 2-way references between multiple tables?

    – untitled90 Aug 10 '23 at 16:56