157

I have a table like the following:

create table my_table (
    id   int8 not null,
    id_A int8 not null,
    id_B int8 not null,
    id_C int8 null,
    constraint pk_my_table primary key (id),
    constraint u_constrainte unique (id_A, id_B, id_C)
);

And I want (id_A, id_B, id_C) to be distinct in any situation. So the following two inserts must result in an error:

INSERT INTO my_table VALUES (1, 1, 2, NULL);
INSERT INTO my_table VALUES (2, 1, 2, NULL);

But it doesn't behave as expected because according to the documentation, two NULL values are not compared to each other, so both inserts pass without error.

How can I guarantee my unique constraint even if id_C can be NULL in this case? Actually, the real question is: can I guarantee this kind of uniqueness in "pure sql" or do I have to implement it on a higher level (java in my case)?

Manuel Leduc
  • 1,681
  • 2
  • 11
  • 5
  • So, say you have values (1,2,1) and (1,2,2) in the (A,B,C) columns. Should a (1,2,NULL) be allowed to be added or not? – ypercubeᵀᴹ Dec 27 '11 at 09:27
  • A and B can't be null but C can be null or any positive integer value. So (1,2,3) and (2,4,null) are valid but (null,2,3) or (1,null,4) are invalid. And [(1,2,null), (1,2,3)] does not break unique constraint but [(1,2, null), (1,2,null)] must break it. – Manuel Leduc Dec 27 '11 at 09:39
  • 2
    Are there any values that will never appear in those columns (like negative values?) –  Dec 27 '11 at 10:43
  • 1
    You don't have to label your constraints in pg. It'll automagically generate a name. Just FYI. – Evan Carroll Dec 02 '16 at 20:13
  • Replace the null values with a default non-null value fix the unique issue, e.g for varchar use empty string ''. – Eric Apr 27 '21 at 02:54
  • Thanks for this question. Helped a lot. – Rajesh Paul Jan 05 '24 at 08:13

3 Answers3

174

Postgres 15

This works out of the box with NULLS NOT DISTINCT:

ALTER TABLE my_table
  DROP CONSTRAINT IF EXISTS u_constrainte
, ADD CONSTRAINT u_constrainte UNIQUE NULLS NOT DISTINCT (id_A, id_B, id_C);

See:

Postgres 14 or older (original answer)

You can do that in pure SQL. Create a partial unique index in addition to the one you have:

CREATE UNIQUE INDEX ab_c_null_idx ON my_table (id_A, id_B) WHERE id_C IS NULL;

This way you can enter for (id_A, id_B, id_C) in your table:

(1, 2, 1)
(1, 2, 2)
(1, 2, NULL)

But none of these a second time.

Or use two partial UNIQUE indexes and no complete index (or constraint). The best solution depends on the details of your requirements. Compare:

While this is elegant and efficient for a single nullable column in the UNIQUE index, it gets out of hand quickly for more than one. Discussing this - and how to use UPSERT with partial indexes:

Asides

No use for mixed case identifiers without double quotes in PostgreSQL.

You might consider a serial column as primary key or an IDENTITY column in Postgres 10 or later. Related:

So:

CREATE TABLE my_table (
   my_table_id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY  -- for pg 10+
-- my_table_id bigserial PRIMARY KEY  -- for pg 9.6 or older
 , id_a int8 NOT NULL
 , id_b int8 NOT NULL
 , id_c int8
 , CONSTRAINT u_constraint UNIQUE (id_a, id_b, id_c)
);

If you don't expect more than 2 billion rows (> 2147483647) over the lifetime of your table (including waste and deleted rows), consider integer (4 bytes) instead of bigint (8 bytes).

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
  • 1
    The docs advocate this method, Adding a unique constraint will automatically create a unique B-tree index on the column or group of columns listed in the constraint. A uniqueness restriction covering only some rows cannot be written as a unique constraint, but it is possible to enforce such a restriction by creating a unique partial index. – Evan Carroll Dec 02 '16 at 20:21
19

I had the same problem and I found another way to have unique NULL into the table.

CREATE UNIQUE INDEX index_name ON table_name( COALESCE( foreign_key_field, -1) )

In my case, the field foreign_key_field is a positive integer and will never be -1.

So, to answer Manual Leduc, another solution could be

CREATE UNIQUE INDEX  u_constrainte (COALESCE(id_a, -1), COALESCE(id_b,-1),COALESCE(id_c, -1) )

I assume that ids won't be -1.

What is the advantage on creating a partial index ?
In case where you don't have the NOT NULL clause, id_a, id_b and id_c can be NULL together only once.
With a partial index, the 3 fields could be NULL more than once.

Luc M
  • 589
  • 1
  • 7
  • 13
  • 6

    What is the advantage on creating a partial index ?

    The way you've done it with COALESCE can be effective in restricting the duplicates, but the index wouldn't be very useful in querying as its an expression index that probably won't match query expressions. That is, unless you SELECT COALESCE(col, -1) ... you wouldn't be hitting the index.

    – Bo Jeanes Aug 05 '16 at 04:58
  • @BoJeanes The index has not been created for a performance issue. It has been created to fullfill the business requirement. – Luc M Aug 05 '16 at 19:51
9

A Null can mean that value is not known for that row at the moment but will be added, when known, in the future (example FinishDate for a running Project) or that no value can be applied for that row (example EscapeVelocity for a black hole Star).

In my opinion, it's usually better to normalize the tables by eliminating all Nulls.

In your case, you want to allow NULLs in your column, yet you want only one NULL to be allowed. Why? What kind of relationship is this between the two tables?

Perhaps you can simply change the column to NOT NULL and store, instead of NULL, a special value (like -1) that is known never to appear. This will solve the uniqueness constraint problem (but may have other possibly unwanted side effects. For example, using -1 to mean "not known / does not apply" will skew any sum or average calculations on the column. Or all such calculations will have to take into account the special value and ignore it.)

ypercubeᵀᴹ
  • 97,895
  • 13
  • 214
  • 305
  • This is is a good point. Often when you are trying to limit a column to one and only one null, you're abusing null, and using it like a value. – Scott Marlowe Dec 28 '11 at 06:13
  • 2
    In my case NULL is really NULL (id_C is a foreign key to table_c for exemple so it can't have -1 value), it means their is no relationship between "my_table" and "table_c". So it has a functional signification. By the way [(1, 1,1,null), (2, 1,2,null), (3,2,4,null)] is a valid list of inserted data. – Manuel Leduc Dec 28 '11 at 09:40
  • 1
    It's not really a Null as used in SQL because you want only one in all rows. You could change your database schema either by adding the -1 to table_c or by adding another table (which would be supertype to subtype table_c). – ypercubeᵀᴹ Dec 28 '11 at 09:48
  • 3
    I'd just like to point out to @Manuel that the opinion on nulls in this answer is not universally held, and is much debated. Many, like me, think that null can be used for any purpose you wish (but should only mean one thing for each field and be documented, possibly in the field name or a column comment) – Jack Douglas Dec 29 '11 at 07:03
  • 1
    You can't use a dummy value when your column is a FOREIGN KEY. – Luc M May 17 '12 at 18:42
  • @LucM: I certainly agree on that. OP states in a comment that "... but [(1,2, null), (1,2,null)] must break unique constraint." My point is that he is trying to use NULL as a special magic value (with some special meaning that he never clarified). My suggestion was to actually add such a magic value in the referenced table. Also note that I suggest this as a 2nd approach. My first suggestion was to eliminate the Nulls. – ypercubeᵀᴹ May 17 '12 at 18:52
  • @ypercube I'm facing the same problem as OP. In my case, NULL means no value but I must have only one 'no value'. I thought using a magic value as you suggested. But I can't add a magic value into the referenced table and the column must be a value that exists in referenced table if there is value. I was deseperated... I thought using a trigger... And I found this thread! Thanks to stackexchange!! Really, how people were working before Internet ? :-) – Luc M May 17 '12 at 19:02
  • @LucM: I see the solution by Erwin as a workaround - an excellent workaround - if you don't want to remove the nullable columns by normalizing. – ypercubeᵀᴹ May 17 '12 at 19:08
  • 1
    +1 I am with you: if we want some combination of columns to be unique, then you need to consider an entity in which this combination of columns is a PK. The OPs' database schema should probably change to a parent table and a child one. – A-K Nov 11 '13 at 22:23