What is a covering index, and how does it avoid a key lookup?

6 minadvancedindexingcovering-indexperformance

Quick Answer

A covering index is one that includes every column a query needs — both the filter/join columns and the selected output columns — so the engine can satisfy the entire query from the index alone, without a second trip to the base table (a "key lookup"). This is done by including extra columns in the index, either as regular indexed key columns or via an `INCLUDE`/`STORING` clause that stores them at the leaf level without making them part of the sort key.

Detailed Answer

The problem a covering index solves

CREATE INDEX ix_orders_customer_id ON orders(customer_id);

SELECT id, customer_id, total FROM orders WHERE customer_id = 42;

The index on customer_id quickly finds matching rows, but total isn't in that index — so for every matching row, the engine must do an extra key/bookmark lookup back to the full table (or clustered index) just to fetch total. On a query returning many rows, that's many extra random I/Os.

Making the index cover the query

-- Option A: add 'total' as a regular index column
CREATE INDEX ix_orders_customer_covering ON orders(customer_id, total);

-- Option B (SQL Server): INCLUDE non-key columns at the leaf level only
CREATE INDEX ix_orders_customer_covering ON orders(customer_id) INCLUDE (total);

-- Option B (PostgreSQL): the equivalent is INCLUDE in CREATE INDEX (v11+)
CREATE INDEX ix_orders_customer_covering ON orders(customer_id) INCLUDE (total);

Now the same query can be answered entirely from the index's leaf pages — an index-only scan — with no trip to the base table at all, because every column the query needs (customer_id to filter, total to return; id is implicitly available via the clustering key) is present in the index.

Key columns vs INCLUDE(d)/STORING columns

Putting total directly in the index key (Option A) makes it part of the sort order too, which enlarges the key, affects ORDER BY/range-scan usefulness, and duplicates it into every level of the B-tree. An INCLUDE/STORING clause (Option B) stores the extra column only at the leaf level, not in internal nodes, and doesn't affect sort order — generally the more efficient choice when the extra column is only needed for the SELECT list, not for filtering or sorting.

Caveats

  • Covering indexes trade write cost and storage for read speed — every additional column stored means more data duplicated and updated on every write. Don't cover every possible query; reserve this for genuinely hot, high-value queries.
  • SELECT * defeats covering indexes almost by definition — the engine can't predict which columns you'll need in the future, and a covering index for "all columns" is just... the whole table. Covering indexes work best paired with narrow, deliberate SELECT lists.
  • PostgreSQL's index-only scans additionally require the visibility map to confirm a page's rows are all visible to the current transaction — under heavy write/vacuum churn, PostgreSQL can silently fall back to a regular index scan (with key lookups) even when the index technically covers the query.