We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
IN and EXISTS Subqueries
Scalar subqueries return a single value. But often you need to check whether a value matches any value in a set or whether related rows exist in another table. That is where IN and EXISTS come in.
The IN Operator with Subqueries
The IN operator checks whether a value matches any value in a list. You have probably used it with a hardcoded list:
SELECT * FROM employees WHERE department IN ('Engineering', 'Sales');
But the real power comes from using a subquery to generate that list dynamically:
SELECT name, department
FROM employees
WHERE department IN (
SELECT name FROM departments WHERE budget > 250000
);
This finds employees who belong to departments with budgets over 250,000. The subquery returns a list of department names, and the outer query checks each employee's department against that list.
Step-by-Step Execution
Step 1: Run the subquery
SELECT name FROM departments WHERE budget > 250000
Result: 'Engineering', 'Sales'
Step 2: Substitute the result into the outer query
SELECT name, department FROM employees
WHERE department IN ('Engineering', 'Sales')
Step 3: Return matching rows
NOT IN
NOT IN returns rows where the value does not match any value in the list:
SELECT name, department
FROM employees
WHERE department NOT IN (
SELECT name FROM departments WHERE budget > 250000
);
This finds employees in departments with budgets of 250,000 or less.
The NULL Trap with NOT IN
Be careful: if the subquery returns any NULL values, NOT IN will return no rows at all. This is because comparing anything to NULL yields UNKNOWN, and the entire NOT IN expression becomes UNKNOWN:
-- If this subquery returns: ('Engineering', NULL, 'Sales')
-- Then NOT IN returns NO ROWS -- probably not what you wanted!
-- Safer version: filter out NULLs
SELECT name FROM employees
WHERE department NOT IN (
SELECT name FROM departments WHERE name IS NOT NULL
);
For this reason, many experienced developers prefer NOT EXISTS over NOT IN.
The EXISTS Operator
EXISTS is different from IN. It does not compare values -- it simply checks whether the subquery returns at least one row:
SELECT name, department
FROM employees e
WHERE EXISTS (
SELECT 1
FROM departments d
WHERE d.name = e.department
AND d.budget > 250000
);
Notice that:
- The subquery references
e.departmentfrom the outer query -- this makes it a correlated subquery (the subquery depends on the current row of the outer query) SELECT 1is used because EXISTS does not care what columns are returned, only whether any rows exist- EXISTS stops searching as soon as it finds the first match, which can be more efficient than IN for large datasets
How EXISTS Works
For each employee row in the outer query:
1. Run the subquery with that employee's department
2. If the subquery returns at least one row --> include this employee
3. If the subquery returns zero rows --> exclude this employee
NOT EXISTS
NOT EXISTS is the opposite -- it returns TRUE when the subquery finds no matching rows:
-- Find customers who have never placed an order
SELECT c.name, c.email
FROM customers c
WHERE NOT EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_name = c.name
);
This is the recommended pattern for "find rows without related rows" queries. Unlike NOT IN, it handles NULLs correctly.
Comparing IN vs EXISTS
When to Use IN
- When the subquery returns a small, fixed set of values
- When comparing against a simple list
- When the subquery does not need to reference the outer query
-- Good use of IN: simple list lookup
SELECT * FROM employees
WHERE department IN (SELECT name FROM departments);
When to Use EXISTS
- When you need to check for the existence of related rows
- When the subquery needs to reference the outer query (correlated)
- When working with large datasets (EXISTS can short-circuit)
- When NULLs might be present (EXISTS handles them safely)
-- Good use of EXISTS: checking for related rows
SELECT * FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.customer_name = c.name
AND o.status = 'completed'
);
Practical Examples
Find Customers Who Have Placed Orders
Using IN:
SELECT name, email, city
FROM customers
WHERE name IN (SELECT customer_name FROM orders);
Using EXISTS:
SELECT c.name, c.email, c.city
FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.customer_name = c.name
);
Both return the same result, but EXISTS is generally preferred for this pattern.
Find Departments That Have At Least One High-Earning Employee
SELECT name, budget
FROM departments d
WHERE EXISTS (
SELECT 1
FROM employees e
WHERE e.department = d.name
AND e.salary > 80000
);
Find Customers Who Only Ordered Completed Orders
SELECT c.name
FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.customer_name = c.name
)
AND NOT EXISTS (
SELECT 1 FROM orders o
WHERE o.customer_name = c.name
AND o.status != 'completed'
);
This combines EXISTS and NOT EXISTS: the customer must have at least one order AND must have no non-completed orders.
IN with Multiple Values (Tuple Comparison)
You can also use IN with multiple columns using row constructors:
SELECT *
FROM employees
WHERE (department, is_active) IN (
SELECT name, TRUE
FROM departments
WHERE budget > 200000
);
This checks both columns simultaneously.
Try It Yourself
-
Use an IN subquery to find all employees who work in departments located in 'Building A' (hint: look up locations in the departments table).
-
Use EXISTS to find all customers who have placed at least one order with a status of 'pending'.
-
Use NOT EXISTS to find all customers who have never placed an order. Compare this to writing the same query with NOT IN -- what differences do you notice?
-
Find departments that have no employees earning below 60,000 using NOT EXISTS.