SQL: dealing with NULLs and Coalesce

SQL uses a three valued logic: True, False and Unknown

id > NULL => unknown
NULL = NULL => unknown
id IS NULL => true/false

And now some trivia to keep in mind.

All the fishes that might be yellow

SELECT * FROM fish WHERE color='yellow' OR color IS NULL

or equivalently

SELECT * FROM fish WHERE (color='yellow') IS NOT FALSE

All the users that might have written a post

SELECT * FROM user JOIN post 
ON (user.id = post.user) IS NOT FALSE

So if one of the fields IS NULL, it joins (what a mess).

Pay attention when using NOT IN that even if a single NULL is in there, the result is always the empty set.

SELECT id FROM post WHERE id NOT IN (1,2,3,NULL) // = { }

(you can never be sure that post.id is not in the set). Thefore if the column targeted by the NOT IN is nullable, you probably want to add some extra constraints. This leads us to the next example:

All emails which might not be spam

SELECT id FROM email WHERE email.id NOT IN 
    (SELECT id FROM spam WHERE spam.id IS NOT NULL)

i.e. get rid of the emails which are spam for sure. Note that to perform a meaningful NOT IN you have to filter out the NULLs first. This is a meaningful thing to do regardless of the mighty-aspect of the query.

COALESCE is a nice function that returns the first non-null expression among its arguments. It comes in handy when you want to avoid at all costs to have some NULLs returned.

SELECT COALESCE (rate, rate2, 'zero') FROM table;

Remember that GROUP BY groups all the NULLs together!