Why does column order matter in a composite (multi-column) index?

6 minadvancedindexingcomposite-indexcolumn-order

Quick Answer

A composite index is sorted by its first column, then its second within each value of the first, and so on — like a phone book sorted by last name, then first name. A query can only use the index efficiently for a **left-to-right prefix** of the indexed columns: an index on `(a, b, c)` helps queries filtering on `a`, or on `a AND b`, or on `a AND b AND c`, but does not efficiently help a query filtering on `b` alone or `c` alone.

Detailed Answer

CREATE INDEX ix_orders_customer_status_date
ON orders(customer_id, status, order_date);

The phone book analogy

Think of this index like a phone book sorted by (last name, first name). You can efficiently find "everyone named Smith," or "everyone named Smith, John" — but you cannot efficiently find "everyone whose first name is John" without scanning the whole book, because first names aren't sorted independent of last name.

Which queries this index helps

-- Efficient: uses the full 3-column prefix
WHERE customer_id = 42 AND status = 'shipped' AND order_date > '2024-01-01'

-- Efficient: uses the leading 2-column prefix (order_date unconstrained is fine)
WHERE customer_id = 42 AND status = 'shipped'

-- Efficient: uses just the leading column
WHERE customer_id = 42

-- NOT efficiently helped: doesn't start with customer_id
WHERE status = 'shipped'
WHERE order_date > '2024-01-01'

-- Partially helped: customer_id can use the index, but status is skipped
-- (this is a "range then unordered" scenario -- order_date range breaks the
-- ability to also use status as a further seek predicate in most engines)
WHERE customer_id = 42 AND order_date > '2024-01-01'

Equality columns before range columns

A useful rule of thumb when deciding column order: put columns used with equality (=) before columns used with a range (>, <, BETWEEN, LIKE 'prefix%'). Once the index encounters a range condition, it can still narrow down using that range, but everything after it in the index can no longer be used to further narrow the search within that range efficiently — the equality columns should exhaust their filtering power first.

-- Good: status (equality) before order_date (range)
CREATE INDEX ix_orders_status_date ON orders(status, order_date);
WHERE status = 'shipped' AND order_date > '2024-01-01'   -- both columns pull weight

-- Worse: order_date (range) before status (equality)
CREATE INDEX ix_orders_date_status ON orders(order_date, status);
WHERE status = 'shipped' AND order_date > '2024-01-01'   -- status can't narrow further after the range scan begins

Design composite indexes around your actual query patterns, leading with the column(s) most consistently filtered by equality across your hottest queries. It's common (and fine) to need multiple composite indexes with different column orderings if your application has several distinct hot query shapes against the same table — but each additional index has a write-cost tradeoff, so don't create one per query without checking for meaningful overlap first.