IN and EXISTS Subqueries

Scalar subqueries return a single value. But often you need to check whether a value matches any value in a set or whether related rows exist in another table. That is where IN and EXISTS come in.

The IN Operator with Subqueries

The IN operator checks whether a value matches any value in a list. You have probably used it with a hardcoded list:

SELECT * FROM employees WHERE department IN ('Engineering', 'Sales');

But the real power comes from using a subquery to generate that list dynamically:

SELECT name, department
FROM employees
WHERE department IN (
  SELECT name FROM departments WHERE budget > 250000
);

This finds employees who belong to departments with budgets over 250,000. The subquery returns a list of department names, and the outer query checks each employee's department against that list.

Step-by-Step Execution

Step 1: Run the subquery
  SELECT name FROM departments WHERE budget > 250000
  Result: 'Engineering', 'Sales'

Step 2: Substitute the result into the outer query
  SELECT name, department FROM employees
  WHERE department IN ('Engineering', 'Sales')

Step 3: Return matching rows

NOT IN

NOT IN returns rows where the value does not match any value in the list:

SELECT name, department
FROM employees
WHERE department NOT IN (
  SELECT name FROM departments WHERE budget > 250000
);

This finds employees in departments with budgets of 250,000 or less.

The NULL Trap with NOT IN

Be careful: if the subquery returns any NULL values, NOT IN will return no rows at all. This is because comparing anything to NULL yields UNKNOWN, and the entire NOT IN expression becomes UNKNOWN:

-- If this subquery returns: ('Engineering', NULL, 'Sales')
-- Then NOT IN returns NO ROWS -- probably not what you wanted!

-- Safer version: filter out NULLs
SELECT name FROM employees
WHERE department NOT IN (
  SELECT name FROM departments WHERE name IS NOT NULL
);

For this reason, many experienced developers prefer NOT EXISTS over NOT IN.

The EXISTS Operator

EXISTS is different from IN. It does not compare values -- it simply checks whether the subquery returns at least one row:

SELECT name, department
FROM employees e
WHERE EXISTS (
  SELECT 1
  FROM departments d
  WHERE d.name = e.department
    AND d.budget > 250000
);

Notice that:

  • The subquery references e.department from the outer query -- this makes it a correlated subquery (the subquery depends on the current row of the outer query)
  • SELECT 1 is used because EXISTS does not care what columns are returned, only whether any rows exist
  • EXISTS stops searching as soon as it finds the first match, which can be more efficient than IN for large datasets

How EXISTS Works

For each employee row in the outer query:
  1. Run the subquery with that employee's department
  2. If the subquery returns at least one row --> include this employee
  3. If the subquery returns zero rows --> exclude this employee

NOT EXISTS

NOT EXISTS is the opposite -- it returns TRUE when the subquery finds no matching rows:

-- Find customers who have never placed an order
SELECT c.name, c.email
FROM customers c
WHERE NOT EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_name = c.name
);

This is the recommended pattern for "find rows without related rows" queries. Unlike NOT IN, it handles NULLs correctly.

Comparing IN vs EXISTS

When to Use IN

  • When the subquery returns a small, fixed set of values
  • When comparing against a simple list
  • When the subquery does not need to reference the outer query
-- Good use of IN: simple list lookup
SELECT * FROM employees
WHERE department IN (SELECT name FROM departments);

When to Use EXISTS

  • When you need to check for the existence of related rows
  • When the subquery needs to reference the outer query (correlated)
  • When working with large datasets (EXISTS can short-circuit)
  • When NULLs might be present (EXISTS handles them safely)
-- Good use of EXISTS: checking for related rows
SELECT * FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_name = c.name
  AND o.status = 'completed'
);

Practical Examples

Find Customers Who Have Placed Orders

Using IN:

SELECT name, email, city
FROM customers
WHERE name IN (SELECT customer_name FROM orders);

Using EXISTS:

SELECT c.name, c.email, c.city
FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders o WHERE o.customer_name = c.name
);

Both return the same result, but EXISTS is generally preferred for this pattern.

Find Departments That Have At Least One High-Earning Employee

SELECT name, budget
FROM departments d
WHERE EXISTS (
  SELECT 1
  FROM employees e
  WHERE e.department = d.name
    AND e.salary > 80000
);

Find Customers Who Only Ordered Completed Orders

SELECT c.name
FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_name = c.name
)
AND NOT EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_name = c.name
    AND o.status != 'completed'
);

This combines EXISTS and NOT EXISTS: the customer must have at least one order AND must have no non-completed orders.

IN with Multiple Values (Tuple Comparison)

You can also use IN with multiple columns using row constructors:

SELECT *
FROM employees
WHERE (department, is_active) IN (
  SELECT name, TRUE
  FROM departments
  WHERE budget > 200000
);

This checks both columns simultaneously.

Try It Yourself

  1. Use an IN subquery to find all employees who work in departments located in 'Building A' (hint: look up locations in the departments table).

  2. Use EXISTS to find all customers who have placed at least one order with a status of 'pending'.

  3. Use NOT EXISTS to find all customers who have never placed an order. Compare this to writing the same query with NOT IN -- what differences do you notice?

  4. Find departments that have no employees earning below 60,000 using NOT EXISTS.

Take Quiz Next Lesson
SQL Editor Ctrl+Enter to run
Results
Run a query to see results here
Tables: