How does HashMap work internally?

11 minadvancedhashmapinternalshashing

Quick Answer

HashMap stores entries in an array of buckets. To put/get a key, it computes hashCode(), applies an internal hash-spreading function, and uses (hash & (capacity-1)) to pick a bucket. Each bucket holds a linked list of entries with that bucket index (or, since Java 8, a balanced tree if the bucket gets large), and equals() disambiguates entries within a bucket. The map resizes (doubles capacity, rehashes) once size exceeds capacity * loadFactor.

Detailed Answer

HashMap<K,V> is backed by an array of buckets (Node<K,V>[] table), where each bucket can hold multiple entries that hashed to the same slot.

Put/get, step by step:

  1. Compute key.hashCode(), then apply HashMap's internal spreading function (h ^ (h >>> 16)) to mix high bits into the low bits — this reduces collisions for keys whose hash codes differ mainly in upper bits.
  2. Compute the bucket index as hash & (table.length - 1) (equivalent to hash % capacity since capacity is always a power of two, but much faster).
  3. Within that bucket, walk the entries and use equals() to find a matching key (a get returns its value; a put with an existing key replaces it, otherwise a new entry is appended).

Collision handling: each bucket starts as a linked list of entries. Since Java 8, if a single bucket grows beyond a threshold (default 8) and the table is large enough, that bucket is converted into a balanced red-black tree, bounding worst-case lookup within a bucket from O(n) to O(log n) — this mainly guards against pathological hash collisions (e.g., maliciously crafted keys).

Resizing: the map tracks size vs. capacity * loadFactor (default load factor 0.75). Once size exceeds that threshold, capacity doubles and every entry is rehashed into the new, larger table — an O(n) operation, which is why specifying an expected capacity up front (new HashMap<>(expectedSize)) avoids repeated resizes in hot paths.

Requirements on keys: must have consistent hashCode()/equals() (per the contract) — mutating a key's hash-relevant fields after insertion "loses" the entry, since it will look in the wrong bucket on lookup.