What's the difference between a hash index and a B-tree index?

Detailed Answer

-- PostgreSQL: explicit hash index
CREATE INDEX ix_sessions_token_hash ON sessions USING HASH (token);

-- Default (B-tree) index
CREATE INDEX ix_sessions_token_btree ON sessions (token);

Hash index

Applies a hash function to the key and stores the entry in the corresponding bucket. Looking up WHERE token = 'abc123' computes the hash once and jumps directly to the bucket — average O(1), independent of table size.

Limitations:

Equality only. WHERE token = 'x' works; WHERE token > 'x', BETWEEN, ORDER BY token, or LIKE 'x%' cannot use a hash index at all, since hashing destroys any relationship between similar keys' storage locations.
No multi-column ordering benefit — a composite hash index doesn't support the same leading-prefix behavior a composite B-tree index does.
Historically (pre-PostgreSQL 10), hash indexes weren't even crash-safe/WAL-logged, which discouraged their use; this has since been fixed, but B-tree remains the overwhelmingly more common default across the industry regardless.

B-tree index

Maintains keys in sorted order (see the B-tree internals question), supporting equality, range, prefix, and sorted-retrieval queries — a strict superset of what a hash index can do, at the cost of O(log n) instead of O(1) average-case lookup.

Why B-tree wins by default almost everywhere

The performance difference between O(1) and O(log n) is negligible in practice — even a billion-row table has a B-tree height of only about 5-6 levels — while the functionality difference is large: almost every real query workload eventually needs a range scan, a sort, or a prefix match somewhere, which a hash index simply cannot provide. This is why MySQL's default index type (B-tree) and PostgreSQL's default (btree) both make B-tree the assumed choice unless you explicitly ask for something else, and why hash indexes are a niche optimization reserved for confirmed pure-equality workloads (e.g., a session-token lookup table where you genuinely never range-query or sort by the token).

When a hash index can still be worth it

If a specific column is queried only via exact-match equality, is large (long strings), and is under heavy lookup load, a hash index can offer a modest, measurable win — but this should be validated with benchmarking on your actual workload rather than assumed, since the B-tree's O(log n) is already extremely fast in absolute terms.

What's the difference between a hash index and a B-tree index?

Quick Answer

Detailed Answer

Hash index

B-tree index

Why B-tree wins by default almost everywhere

When a hash index can still be worth it

Related Resources

Related Questions