EXISTS vs IN vs JOIN — when do you use each? | SQL & Databases Interview Question

Detailed Answer

All three can express "give me rows in A related to rows in B," but they answer different underlying questions.

JOIN — when you need columns from both tables

SELECT c.name, o.total
FROM customers c
JOIN orders o ON o.customer_id = c.id;

If the goal is to display or aggregate data from both tables, JOIN is the natural tool — IN/EXISTS can't return columns from the subquery side.

IN — membership against a list

SELECT * FROM customers
WHERE id IN (SELECT customer_id FROM orders WHERE total > 1000);

Simple and readable for "customers who have at least one qualifying order," when you don't need any order columns in the output. Danger: if the subquery can return a NULL (e.g., customer_id is nullable and some order rows have it unset) and you're using NOT IN, the entire result silently becomes empty due to three-valued logic (see the NULL question) — this is one of the most common real-world SQL correctness bugs.

EXISTS — pure existence check, NULL-safe

SELECT * FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.id AND o.total > 1000
);

EXISTS only cares whether the subquery returns any row — the SELECT 1 is idiomatic because the actual selected value is irrelevant. It's NULL-safe by construction and, critically, the safe choice for negation:

-- Safe even if orders.customer_id can be NULL
SELECT * FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

-- Unsafe: silently returns zero rows if any orders.customer_id is NULL
SELECT * FROM customers c
WHERE c.id NOT IN (SELECT customer_id FROM orders);

Duplication behavior

A JOIN can multiply output rows if a customer has multiple matching orders (one output row per match), which then requires DISTINCT or aggregation to get back to one row per customer. IN/EXISTS never duplicate the outer row regardless of how many subquery rows match, since they're boolean filters, not row-multiplying joins — this is often the deciding factor when you just need a yes/no filter, not order details.

Need columns from the related table → JOIN.
Need a plain existence/membership filter and the subquery result can't contain NULL, or you're doing a positive check → IN is fine and often reads well for short static lists.
Need a negative existence check (NOT ...), or the subquery column might contain NULL → always use NOT EXISTS, never NOT IN.

EXISTS vs IN vs JOIN — when do you use each?

Quick Answer

Detailed Answer

JOIN — when you need columns from both tables

IN — membership against a list

EXISTS — pure existence check, NULL-safe

Duplication behavior

Related Resources

Related Questions