What's the difference between a hash index and a B-tree index?
Quick Answer
A **hash index** maps a hashed key directly to a bucket location, giving O(1) average-case equality lookups, but has no concept of order — it can't support range queries, sorting, or prefix matching. A **B-tree index** maintains sorted order, giving O(log n) lookups but also supporting ranges, sorting, and prefix matches. Most databases default to B-tree indexes because they cover a much broader set of query patterns; hash indexes are a narrower optimization for pure equality-lookup workloads.
Detailed Answer
-- PostgreSQL: explicit hash index
CREATE INDEX ix_sessions_token_hash ON sessions USING HASH (token);
-- Default (B-tree) index
CREATE INDEX ix_sessions_token_btree ON sessions (token);
Hash index
Applies a hash function to the key and stores the entry in the corresponding bucket. Looking up WHERE token = 'abc123' computes the hash once and jumps directly to the bucket — average O(1), independent of table size.
Limitations:
- Equality only.
WHERE token = 'x'works;WHERE token > 'x',BETWEEN,ORDER BY token, orLIKE 'x%'cannot use a hash index at all, since hashing destroys any relationship between similar keys' storage locations. - No multi-column ordering benefit — a composite hash index doesn't support the same leading-prefix behavior a composite B-tree index does.
- Historically (pre-PostgreSQL 10), hash indexes weren't even crash-safe/WAL-logged, which discouraged their use; this has since been fixed, but B-tree remains the overwhelmingly more common default across the industry regardless.
B-tree index
Maintains keys in sorted order (see the B-tree internals question), supporting equality, range, prefix, and sorted-retrieval queries — a strict superset of what a hash index can do, at the cost of O(log n) instead of O(1) average-case lookup.
Why B-tree wins by default almost everywhere
The performance difference between O(1) and O(log n) is negligible in practice — even a billion-row table has a B-tree height of only about 5-6 levels — while the functionality difference is large: almost every real query workload eventually needs a range scan, a sort, or a prefix match somewhere, which a hash index simply cannot provide. This is why MySQL's default index type (B-tree) and PostgreSQL's default (btree) both make B-tree the assumed choice unless you explicitly ask for something else, and why hash indexes are a niche optimization reserved for confirmed pure-equality workloads (e.g., a session-token lookup table where you genuinely never range-query or sort by the token).
When a hash index can still be worth it
If a specific column is queried only via exact-match equality, is large (long strings), and is under heavy lookup load, a hash index can offer a modest, measurable win — but this should be validated with benchmarking on your actual workload rather than assumed, since the B-tree's O(log n) is already extremely fast in absolute terms.