What's the difference between UNION and UNION ALL?

Quick Answer

Both combine the result sets of two or more `SELECT` statements that return the same number of columns with compatible types. `UNION` removes duplicate rows from the combined set, which requires an implicit sort or hash-based deduplication step. `UNION ALL` keeps every row, including duplicates, and is usually cheaper because it skips that step. Prefer `UNION ALL` whenever you know the inputs cannot produce duplicate rows or duplicates are acceptable.

Detailed Answer

-- UNION: deduplicates the combined result set
SELECT city FROM current_customers
UNION
SELECT city FROM past_customers;

-- UNION ALL: keeps every row, including duplicates
SELECT city FROM current_customers
UNION ALL
SELECT city FROM past_customers;
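
To make the difference concrete, here is a small sketch with made-up sample data. The table definitions and city values are assumptions for illustration only, and the multi-row `INSERT ... VALUES` syntax assumes an engine that supports it (PostgreSQL, MySQL, SQLite, and SQL Server do).

```sql
-- Hypothetical sample data for the two tables queried above.
CREATE TABLE current_customers (city VARCHAR(50));
CREATE TABLE past_customers (city VARCHAR(50));

INSERT INTO current_customers (city) VALUES ('Austin'), ('Boston'), ('Boston');
INSERT INTO past_customers (city) VALUES ('Boston'), ('Chicago');

-- UNION returns 3 rows: Austin, Boston, Chicago (row order is not guaranteed).
SELECT city FROM current_customers
UNION
SELECT city FROM past_customers;

-- UNION ALL returns all 5 rows: Austin, Boston, Boston, Boston, Chicago
-- (row order is not guaranteed).
SELECT city FROM current_customers
UNION ALL
SELECT city FROM past_customers;
```

Note that UNION also collapses the two 'Boston' rows that come from current_customers alone: deduplication applies to the whole combined result, not only to rows shared between the inputs.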

Requirements

Every SELECT in the union must produce the same number of columns, in the same order, with compatible data types. Column names in the output come from the first SELECT.
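
A minimal sketch of these rules, using hypothetical `employees` and `contractors` tables that are not part of the original example. Columns are matched by position, not by name, so the aliases in the second SELECT have no effect on the output.

```sql
-- Hypothetical tables for illustration.
CREATE TABLE employees (employee_id INTEGER, full_name VARCHAR(100));
CREATE TABLE contractors (contractor_id INTEGER, display_name VARCHAR(100));

INSERT INTO employees (employee_id, full_name) VALUES (1, 'Ada');
INSERT INTO contractors (contractor_id, display_name) VALUES (7, 'Grace');

-- The output columns are named person_id and person_name, taken from the
-- first SELECT. The aliases other_id and other_name are ignored.
SELECT employee_id AS person_id, full_name AS person_name FROM employees
UNION ALL
SELECT contractor_id AS other_id, display_name AS other_name FROM contractors;

-- This would be rejected because the column counts differ (2 vs. 1):
-- SELECT employee_id, full_name FROM employees
-- UNION ALL
-- SELECT contractor_id FROM contractors;
```

How strictly "compatible data types" is enforced varies by engine: some reject a mix such as an integer column paired with a text column, while others coerce the values silently, so an explicit CAST is the safer choice when the types differ.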

Performance

UNION has to identify and remove duplicates, which typically means sorting the combined result or building a hash set: an extra step whose memory and CPU cost grows with the size of the combined result. UNION ALL simply appends one result set to the other, with no deduplication work.

Rule of thumb: default to UNION ALL unless you specifically need deduplication. A common mistake is reflexively using UNION "to be safe" on queries that can never produce duplicate rows (e.g., querying two mutually exclusive partitions that each return unique rows), paying for a deduplication step that can never actually remove anything.
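
The mutually exclusive case might look like the sketch below. The `orders` table and its values are hypothetical, and the `EXPLAIN` statements assume a PostgreSQL- or MySQL-style engine; plan syntax and output differ between databases.

```sql
-- Hypothetical table: order_id is unique and every row has exactly one status.
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    status   VARCHAR(20),
    total    DECIMAL(10, 2)
);

INSERT INTO orders (order_id, status, total) VALUES
    (1, 'open',   25.00),
    (2, 'open',   40.00),
    (3, 'closed', 25.00);

-- The two branches filter on different status values and order_id is unique,
-- so no row can appear twice. UNION ALL returns the same 3 rows that UNION
-- would, without the deduplication step.
SELECT order_id, total FROM orders WHERE status = 'open'
UNION ALL
SELECT order_id, total FROM orders WHERE status = 'closed';

-- Comparing the two plans is one way to confirm the extra step on your engine.
EXPLAIN
SELECT order_id, total FROM orders WHERE status = 'open'
UNION
SELECT order_id, total FROM orders WHERE status = 'closed';

EXPLAIN
SELECT order_id, total FROM orders WHERE status = 'open'
UNION ALL
SELECT order_id, total FROM orders WHERE status = 'closed';
```

If the selected columns did not include a unique key (for example, selecting only `total`), the branches could return equal rows and UNION would change the result, so the swap is only safe when uniqueness is actually guaranteed.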

Ordering the final result

A top-level ORDER BY appears once, after the last SELECT, and applies to the combined result. It refers to the output column names, which come from the first SELECT:

SELECT city, 'current' AS source FROM current_customers
UNION ALL
SELECT city, 'past' AS source FROM past_customers
ORDER BY city;
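
As a rough illustration with made-up data (the table definitions and city values below are assumptions, recreated here so the snippet stands alone), sorting on both output columns gives a deterministic order even when the same city appears in both tables:

```sql
-- Hypothetical sample data.
CREATE TABLE current_customers (city VARCHAR(50));
CREATE TABLE past_customers (city VARCHAR(50));

INSERT INTO current_customers (city) VALUES ('Boston'), ('Austin');
INSERT INTO past_customers (city) VALUES ('Chicago'), ('Boston');

-- The single ORDER BY sorts the combined rows, not each branch separately.
-- Result, in order:
--   Austin  | current
--   Boston  | current
--   Boston  | past
--   Chicago | past
SELECT city, 'current' AS source FROM current_customers
UNION ALL
SELECT city, 'past' AS source FROM past_customers
ORDER BY city, source;
```

Without the ORDER BY, the rows may come back in any order, including with UNION ALL, so do not rely on the first SELECT's rows appearing first.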