How does Python manage memory (reference counting and the garbage collector)?
Quick Answer
CPython's primary memory management mechanism is **reference counting**: every object tracks how many references point to it and is freed immediately when that count hits zero. Reference counting alone cannot reclaim **reference cycles** (objects that reference each other in a loop), so CPython adds a secondary **generational cyclic garbage collector** that periodically scans for and collects unreachable cycles the refcounter would miss.
Detailed Answer
Reference counting: the primary mechanism
import sys
a = [1, 2, 3]
sys.getrefcount(a) # typically 2 -- one for `a`, one for the getrefcount() argument itself (exact counts can vary by CPython version)
b = a # refcount incremented
sys.getrefcount(a) # 3
del b # refcount decremented
sys.getrefcount(a) # back to 2
del a # refcount hits 0 (ignoring the getrefcount call itself) -- freed IMMEDIATELY
Every object has a counter tracking how many references point to it.
Every new reference (assignment, passing as an argument, appending to a
container) increments it; every reference that goes away (del,
reassignment, a name falling out of scope) decrements it. The moment the
count hits zero, CPython deallocates the object immediately. The timing
is deterministic, unlike tracing garbage collectors in some other
languages, which reclaim memory at some later, unspecified point.
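To make the "freed immediately" claim observable, here is a minimal sketch that attaches a `weakref.finalize` callback to an object and records when it fires. The `Resource` class and the `events` list are hypothetical names introduced only for this illustration, and the timing shown is CPython behavior, not a guarantee of the Python language.

```python
import weakref


class Resource:
    """Hypothetical placeholder object; it exists only for this demo."""


events = []

r = Resource()
# Register a callback that runs when the object is reclaimed.
weakref.finalize(r, events.append, "freed")

alias = r   # second reference to the same object
del r       # one reference remains, so nothing is freed yet
events_after_first_del = list(events)

del alias   # last reference gone: CPython deallocates right here
events_after_last_del = list(events)

print(events_after_first_del)
print(events_after_last_del)
```

The callback fires during the final `del`, not at some later collection pass, which is what deterministic cleanup means in practice.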
The gap: reference cycles
class Node:
    def __init__(self):
        self.parent = None
        self.child = None
a = Node()
b = Node()
a.child = b
b.parent = a # a references b, b references a -- a cycle
del a
del b
# Neither object's refcount reaches 0: each still holds one reference
# from the other, so refcounting ALONE can never free this cycle.
Even after the names a and b are deleted and the program can no longer
reach either object, the two objects still reference each other, so
neither refcount can reach zero through reference counting alone. This
is exactly the gap the cyclic garbage collector exists to close.
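The gap can be observed directly. The sketch below rebuilds the Node cycle, watches one object through a weak reference (which does not raise its refcount), and checks whether it is still alive after the names are deleted and again after a forced collection. The `probe` variable and the two `alive_*` flags are names chosen for this illustration, and automatic collection is paused only so the timing is predictable.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.parent = None
        self.child = None


was_enabled = gc.isenabled()
gc.disable()  # pause automatic collection so it cannot run mid-demo
try:
    a = Node()
    b = Node()
    a.child = b
    b.parent = a              # a -> b -> a: a reference cycle

    probe = weakref.ref(a)    # observe the object without owning it

    del a
    del b
    # Unreachable from the program, yet kept alive by the cycle.
    alive_after_del = probe() is not None

    gc.collect()              # the cycle detector reclaims both objects
    alive_after_collect = probe() is not None
finally:
    if was_enabled:
        gc.enable()

print(alive_after_del, alive_after_collect)
```

Under CPython, the object survives both `del` statements and disappears only once the cyclic collector runs.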
The generational cyclic GC: the secondary mechanism
import gc
gc.collect() # force a collection cycle
gc.get_stats() # per-generation collection stats
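CPython periodically runs a generational cycle detector (three generations, numbered 0, 1, and 2, in the classic design). It tracks only container objects, such as lists and class instances, because only objects that can hold references can take part in a cycle. Within the generation being scanned, it looks for groups of objects that reference each other but are unreachable from anything outside the group, and frees them together. It is often loosely described as mark-and-sweep, but rather than tracing from roots it discounts the references that originate inside the scanned set and treats any object left with no outside reference as garbage. New objects start in generation 0. Objects that survive a collection are promoted to an older generation, which is scanned less often, since long-lived objects are statistically less likely to become garbage soon. This generational strategy keeps the overhead of cycle detection low for typical workloads.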
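A short sketch of inspecting the collector from Python, using only functions from the standard `gc` module. The `Node` class here is a hypothetical stand-in, and the threshold values printed depend on the CPython version, so no specific numbers are assumed.

```python
import gc


class Node:
    def __init__(self):
        self.other = None


# Only objects that can hold references are tracked by the cycle detector.
tracked_int = gc.is_tracked(42)    # an int cannot form a cycle
tracked_list = gc.is_tracked([])   # a list can

# Collection thresholds; the values are version-dependent.
print(gc.get_threshold())

was_enabled = gc.isenabled()
gc.disable()  # keep an automatic pass from collecting the cycle first
try:
    gc.collect()                   # clear any garbage that already exists
    x, y = Node(), Node()
    x.other, y.other = y, x        # build a two-object cycle
    del x, y
    found = gc.collect()           # returns the number of unreachable objects found
finally:
    if was_enabled:
        gc.enable()

print(tracked_int, tracked_list, found)
```

The count returned by `gc.collect()` is at least 2 here (the two instances); it can be higher on versions where each instance's attribute dictionary is counted as a separate object.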
Why this two-tier design
Reference counting alone is simple and gives instant, deterministic cleanup for the overwhelming majority of objects (those not involved in cycles), but it cannot handle cycles. A purely tracing collector, as used by many other managed-memory runtimes, handles cycles but gives up deterministic, immediate cleanup. CPython's design combines the two: immediate cleanup for the common case, with periodic cycle detection as a backstop for the rest.
Interview-ready summary: CPython frees most objects immediately via reference counting the moment their refcount hits zero; a supplementary generational garbage collector runs periodically to detect and free reference cycles that counting alone can never resolve, since a cycle's member objects each keep the others' refcounts above zero indefinitely.