My colleague John recently blogged about the unexpected results of using the T-SQL IN operator on a set that contains NULL. Here is another solution to the problem.There are three main ways to test for existence in another table, listed here in order of increasing performance
- LEFT JOIN and look for NULLs in the right table
- Use the IN operator
- Use the EXISTS operator
In general, the EXISTS operator will generate the most efficient query plan, and it also isn't subject to the very NULL pitfall that plagues the IN operator. The query that John posted:
SELECT COUNT(*)
FROM TableTarget
WHERE PrimaryKeyField NOT IN (
SELECT ForeignKeyField
FROM TableSource
WHERE ForeignKeyField IS NOT NULL
)
Could be writtten like this:
SELECT COUNT(*)
FROM TableTarget t
WHERE NOT EXISTS (
SELECT *
FROM TableSource
WHERE ForeignKeyField = t.PrimaryKeyField
)