25

I had a query (for Postgres and Informix) with a NOT IN clause containing a subquery that in some cases returned NULL values, causing that clause (and the entire query) to fail to return anything.

What's the best way to understand this? I thought of NULL as something without a value, and therefore wasn't expecting the query to fail, but obviously that's not the correct way to think of NULL.

newenglander
  • 1,055
  • 5
  • 13
  • 23

1 Answers1

36

Boolean logic - or Three valued logic

  • IN is shorthand for a series of OR conditions
  • x NOT IN (1, 2, NULL) is the same as NOT (x = 1 OR x = 2 OR x = NULL)
  • ... is the same as x <> 1 AND x <> 2 AND x <> NULL
  • ... is the same as true AND true AND unknown **
  • ... = unknown **
  • ... which is almost the same as false in this case as it will not pass the WHERE condition **

Now, this is why folk use EXISTS + NOT EXISTS rather than IN + NOT IN. Also see The use of NOT logic in relation to indexes for more

** Note: unknown is the same as false at the end of an expression in a WHERE condition.
While the expression is being evaluated, then it is unknown
See @kgrittn's comment below for why

gbn
  • 69,809
  • 8
  • 163
  • 243
  • 10
    Even with the clarification it is technically wrong, and in a way that could burn someone. For example, if you view x <> NULL as resolving to FALSE, you would expect NOT (x <> NULL) to evaluate to TRUE, and it doesn't. Both evaluate to UNKNOWN. The trick is that a row is selected only if the WHERE clause (if present) evaluates to TRUE -- a row is omitted if the clause evaluates to either FALSE or UNKNOWN. This behavior (in general, and for the NOT IN predicate in particular) is mandated by the SQL standard. – kgrittn May 03 '12 at 14:35
  • Also NULL NOT IN (some_subquery) should not return the outer row except if some_subquery doesn't return any rows. Which is why the execution plan when both columns are Null-able can be considerably more expensive. SQL Server Example – Martin Smith Jun 18 '12 at 07:08