What's the difference between a subquery, a derived table, and a CTE?
Quick Answer
A **subquery** is any `SELECT` nested inside another statement (in `WHERE`, `SELECT`, or `FROM`). A **derived table** is specifically a subquery used in the `FROM` clause, given an alias, and treated as a temporary named result set. A **CTE** (`WITH name AS (...)`) is a named, top-level query that can be referenced one or more times in the main query, and can optionally be recursive — mostly a readability/reusability improvement over a derived table, though optimizer behavior around materialization varies by engine.
Detailed Answer
These three terms describe overlapping ways of nesting one query inside another, distinguished mainly by where they appear and how reusable they are.
Subquery — the general term
Any SELECT nested inside another SQL statement:
-- In WHERE (scalar/list subquery)
SELECT * FROM employees WHERE department_id IN (SELECT id FROM departments WHERE region = 'EU');
-- In SELECT (scalar subquery, must return exactly one value)
SELECT name, (SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id) AS order_count
FROM customers c;
Derived table — a subquery used as a table
When a subquery appears in FROM and is given an alias, it's specifically called a derived table (or "inline view"):
SELECT dept_avg.department, dept_avg.avg_salary
FROM (
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
) AS dept_avg -- <-- the alias is mandatory in most engines
WHERE dept_avg.avg_salary > 80000;
Derived tables must be aliased, are scoped only to that one query, and can't be referenced more than once without repeating the whole subquery.
CTE — a named, top-level, (optionally) reusable query
WITH dept_avg AS (
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
)
SELECT e.name, e.salary, d.avg_salary
FROM employees e
JOIN dept_avg d ON e.department = d.department
WHERE e.salary > d.avg_salary;
Advantages over a derived table:
- Readability — complex queries read top-to-bottom instead of nesting inward.
- Reusability within one statement — the same CTE can be joined against multiple times without repeating its definition.
- Recursion —
WITH RECURSIVElets a CTE reference itself, which is impossible with a plain derived table (used for hierarchies/graphs — see the recursive CTE question).
Materialization caveat
Historically, PostgreSQL always materialized (fully computed and stored) CTE results before using them, which could hurt performance versus an equivalent derived table the optimizer could inline and push predicates into. As of PostgreSQL 12, CTEs are inlined by default (like a derived table) unless marked MATERIALIZED, referenced multiple times, or recursive. Always check your engine's current CTE optimization behavior rather than assuming — it's one of the more version-sensitive areas of SQL.