How does Python manage memory (reference counting and the garbage collector)?

CPython's primary memory management is **reference counting**: every object tracks how many references point to it, and is freed immediately when that count hits zero. This can't reclaim **reference cycles** (objects referencing each other in a loop), so CPython adds a secondary **generational cyclic garbage collector** that periodically scans for and collects unreachable cycles the refcounter alone would miss.

What are reference cycles, and how does the garbage collector detect and break them?

A reference cycle is a group of objects that reference each other, so every member keeps a nonzero refcount even though the group as a whole is unreachable from the rest of the program (e.g., a doubly-linked list, or a parent-child relationship with back-references). The cyclic GC detects such groups among the container objects it tracks: it works out which of them are still referenced from outside the group (ultimately from globals, stack frames, and other live objects) and collects any group that is reachable only from within itself, running `__del__` where present before freeing the objects.

What is `weakref`, and when would you use it?

`weakref.ref(obj)` creates a reference to `obj` that **doesn't increment its refcount** — so it doesn't keep the object alive, and calling the weak reference returns `None` once the object is actually garbage collected. It's used to avoid reference cycles (e.g., parent/child back-pointers) and to build caches/registries that shouldn't themselves prevent an object from being freed when nothing else needs it.

How do you profile a Python program's performance?

Use `cProfile` (built-in, deterministic profiler) to find which functions consume the most cumulative/total time across a whole program run; use `timeit` for precise micro-benchmarks of a small code snippet; use a memory profiler (`memory_profiler`, `tracemalloc`) to find where memory is actually being allocated. Always profile before optimizing — intuition about where time goes is frequently wrong.

What are common causes of memory leaks in long-running Python processes?

Despite having garbage collection, Python processes can still leak memory: unbounded caches/global collections that grow forever, reference cycles involving objects with problematic `__del__` methods (rare on modern Python, but still possible), circular references combined with C-extension objects that don't fully participate in Python's GC, event listeners/callbacks that are registered but never unregistered, and simply holding onto large objects (e.g., in a closure or a module-level list) longer than needed.

What's the difference between `sys.getsizeof()` and an object's actual total memory usage?

`sys.getsizeof(obj)` returns only the memory of the object **itself** — for a container like a list, that's the list structure and its internal pointer array, but **not** the size of the objects it points to. To get the true total memory footprint of a nested structure, you need to recursively sum the sizes of every object reachable from it (accounting for shared references so you don't double-count).

What are some practical techniques to optimize slow Python code?

In order of typical impact: fix algorithmic complexity first (replacing an O(n²) loop with an O(n) one beats any micro-optimization of the quadratic version); replace `list`-based membership checks with `set`/`dict`; push hot numeric loops into vectorized libraries (NumPy) or compiled extensions (Cython, Rust via PyO3); use built-in functions (whose loops run in C) and comprehensions over manual Python loops; cache expensive pure computations with `functools.lru_cache`; and only reach for micro-optimizations after profiling confirms where time is actually spent.

How does small-integer and string caching (interning) affect object identity?

CPython pre-allocates and caches all integers from **-5 to 256** at startup, so every occurrence of a value in that range, say `100`, refers to the *same* cached object, which makes `is` comparisons on small integers appear to work by coincidence. Outside that range, equal integers are generally distinct objects (although the compiler may merge identical literals within one code block). This is a CPython implementation detail (an optimization to avoid constantly allocating tiny, extremely common integer objects), not a language guarantee, so code must never rely on it.

Memory Management & Performance

Reference counting, the cyclic garbage collector, weak references, profiling, and optimization techniques.

Reference counting: the primary mechanism

import sys

a = [1, 2, 3]
sys.getrefcount(a)   # 2 -- one for `a`, one for the getrefcount() argument itself

b = a                 # refcount incremented
sys.getrefcount(a)    # 3

del b                  # refcount decremented
sys.getrefcount(a)     # back to 2

del a                   # refcount hits 0 (ignoring the getrefcount call itself) -- freed IMMEDIATELY

Every object has a counter tracking how many references point to it. Every new reference (assignment, passing as an argument, appending to a container) increments it; every reference going away (del, reassignment, falling out of scope) decrements it. The moment it hits zero, CPython frees the object's memory immediately and deterministically, unlike the tracing-only garbage collectors used by some other language runtimes.
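
The "freed the moment the count hits zero" behaviour can be observed directly. The sketch below attaches a `weakref.finalize` callback to an object and records when it fires; the `Payload` class and the `events` list are illustrative names, and the immediate timing is specific to CPython.

```python
import weakref

class Payload:
    """Hypothetical stand-in for any ordinary object."""

events = []

obj = Payload()
# finalize() registers a callback that runs when obj is reclaimed;
# it holds only a weak reference, so it does not keep obj alive.
weakref.finalize(obj, events.append, "payload freed")

alias = obj          # second strong reference
del obj              # one reference remains, nothing happens yet
events_after_first_del = list(events)

del alias            # last reference gone: CPython reclaims the object here
events_after_second_del = list(events)

print(events_after_first_del)    # []
print(events_after_second_del)   # ['payload freed']
```

No call to the garbage collector is involved: the callback runs as part of the `del alias` statement itself.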

The gap: reference cycles

class Node:
    def __init__(self):
        self.parent = None
        self.child = None

a = Node()
b = Node()
a.child = b
b.parent = a          # a references b, b references a -- a cycle

del a
del b
# a and b's refcounts never reach 0! each still holds one reference
# from the other -- refcounting ALONE can never free this cycle.

Even after both a and b go out of scope from the program's perspective, they still reference each other, so neither refcount ever reaches zero through refcounting alone — this is exactly the gap the cyclic garbage collector exists to close.
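
A minimal way to watch this gap open and close, assuming CPython: disable automatic collection, build the cycle, drop both names, and use a weak reference (here called `probe`, an illustrative name) to check whether the object is still alive before and after an explicit `gc.collect()`.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.parent = None
        self.child = None

gc.disable()                 # keep automatic collections out of the demo
a = Node()
b = Node()
a.child = b
b.parent = a                 # cycle: a -> b -> a

probe = weakref.ref(a)       # lets us observe a without keeping it alive
del a
del b

alive_before = probe() is not None   # the cycle keeps both nodes alive
gc.collect()                         # run the cycle detector explicitly
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)     # True False
```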

The generational cyclic GC: the secondary mechanism

import gc

gc.collect()          # force a collection cycle
gc.get_stats()         # per-generation collection stats

CPython periodically runs a generational cycle collector (traditionally three generations: 0, 1, 2; the exact scheme is an implementation detail that can change between versions) that specifically looks for groups of objects that reference each other but are unreachable from anything else in the program, and frees them together. New objects start in the youngest generation; objects that survive a collection are promoted to older generations, which are scanned less frequently, since long-lived objects are statistically less likely to become garbage soon. This generational strategy keeps the overhead of cycle detection low for typical workloads.
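
The collector's tuning knobs can be inspected from Python. This is a small sketch using only documented `gc` functions; the actual numbers printed depend on the CPython version and on what the process has allocated so far.

```python
import gc

thresholds = gc.get_threshold()   # commonly (700, 10, 10), but version-dependent
counts = gc.get_count()           # current per-generation counters

print("thresholds:", thresholds)
print("counts:", counts)

# Collect only the youngest generation; the return value is the
# number of unreachable objects found.
found = gc.collect(0)
print("unreachable objects found:", found)
```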

Why this two-tier design

Reference counting alone is simple and gives instant, deterministic cleanup for the overwhelming majority of objects (no cycles involved), but can't handle cycles. A purely generational/tracing collector (like many other managed-memory languages use) can handle cycles but gives up deterministic, immediate cleanup. CPython's design gets the best of both: immediate cleanup for the common case, periodic cycle detection as a backstop for the rest.

Interview-ready summary: CPython frees most objects immediately via reference counting the moment their refcount hits zero; a supplementary generational garbage collector runs periodically to detect and free reference cycles that counting alone can never resolve, since a cycle's member objects each keep the others' refcounts above zero indefinitely.

Related Resources

gc — Garbage Collector interface

A classic real-world cycle: parent/child back-references

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []
        self.parent = None

    def add_child(self, child):
        child.parent = self       # child references parent
        self.children.append(child)  # parent references child -- cycle!

root = TreeNode("root")
child = TreeNode("child")
root.add_child(child)

del root
del child
# neither node's refcount reaches 0 -- each still holds a reference to the other
# via .parent / .children, even though nothing external references either anymore

This pattern — a container holding items that reference back to the container — appears constantly in real code (trees, doubly-linked lists, observer patterns, cached objects referencing a registry that references them back) and is the main source of reference cycles in practice.

How the collector finds cycles: reachability from roots

The GC only tracks container objects (lists, dicts, class instances, and other objects that can hold references), since simple values such as int or str cannot participate in cycles. Rather than walking the whole object graph from an explicit root set, CPython's collector works within the set of tracked objects being collected: it takes each object's refcount and subtracts the references that come from other objects in that set. Any object left with a positive count is referenced from outside (for example from a global variable or an active stack frame), so it and everything reachable from it is kept. Whatever remains is unreachable, even though its real refcount is nonzero because of cycle members referencing each other, and gets collected. The effect matches a mark-and-sweep from roots, but the work is limited to objects that could actually form cycles.
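
One way to picture how CPython decides what is unreachable is a toy model over a hand-written graph: subtract the references that tracked objects hold on each other, treat anything with a count left over as externally referenced, and keep everything reachable from those. The function `find_unreachable` and its dictionary inputs are hypothetical illustrations, not part of the `gc` module.

```python
def find_unreachable(refcounts, edges):
    """Toy model of cycle detection among tracked container objects.

    refcounts: {name: total reference count}
    edges:     {name: [names it references]} between tracked objects
    """
    # Step 1: copy each refcount, then subtract references that come
    # from other tracked objects. What remains counts external references.
    external = dict(refcounts)
    for src, targets in edges.items():
        for dst in targets:
            external[dst] -= 1

    # Step 2: anything with external references is reachable, and so is
    # everything that can be reached from it.
    reachable = set()
    stack = [name for name, count in external.items() if count > 0]
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        stack.extend(edges.get(name, []))

    # Step 3: whatever is left is only kept alive by itself.
    return set(refcounts) - reachable


# "a" and "b" reference each other and nothing else references them.
# "c" is held by a variable (count 1) and references "d".
refcounts = {"a": 1, "b": 1, "c": 1, "d": 1}
edges = {"a": ["b"], "b": ["a"], "c": ["d"], "d": []}

unreachable = sorted(find_unreachable(refcounts, edges))
print(unreachable)   # ['a', 'b']
```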

`__del__` and cycles: a historical gotcha

Before Python 3.4, objects with a __del__ method involved in a cycle could not be collected at all (the collector didn't know a safe order to call __del__ on cyclic objects, so it left them as permanent "uncollectable garbage" in gc.garbage). Since Python 3.4 (PEP 442), the collector can safely collect cycles containing __del__ methods too, finalizing them in a safe order — this was a real, previously-documented memory leak source in older Python code that's no longer a concern on modern versions.
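
A short check of the post-PEP 442 behaviour, assuming Python 3.4 or later: two objects with `__del__` form a cycle, and an explicit collection finalizes both. The `Tracked` class and `finalized` list are illustrative.

```python
import gc

finalized = []

class Tracked:
    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)

x = Tracked("x")
y = Tracked("y")
x.other = y
y.other = x          # cycle whose members both define __del__

del x, y
gc.collect()

print(sorted(finalized))   # ['x', 'y']
print(len(gc.garbage))     # 0 in a fresh interpreter
```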

Practical mitigations

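import weakref

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []
        self.parent = None                 # will hold a weak reference, or None

    def add_child(self, child):
        child.parent = weakref.ref(self)   # doesn't keep self alive; breaks the cycle
        self.children.append(child)        # call child.parent() to get the parent (or None)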

Using weakref for back-references (parent pointers, cache entries) avoids creating a cycle in the first place, letting reference counting alone reclaim the objects immediately rather than waiting for a periodic GC pass — useful for large numbers of short-lived cyclic structures where GC pause overhead matters.

Interview-ready summary: Reference cycles are groups of objects that keep each other's refcounts alive even when unreachable from the rest of the program — common in parent/child and cache-style back-references. The generational GC detects them by tracing reachability from program roots, and (since Python 3.4) can safely collect cycles even when __del__ methods are involved; using weakref for back-references avoids creating the cycle at all.

Related Resources

gc — Garbage Collector interface

Basic usage: a reference that doesn't keep the object alive

import weakref

class Resource:
    def __init__(self, name):
        self.name = name

r = Resource("db-connection")
weak = weakref.ref(r)

weak()          # <Resource object> -- still alive, call it to get the real object (or None)

del r            # the only STRONG reference is gone
weak()            # None -- the object was actually collected; weakref doesn't keep it alive

Unlike a normal reference (weak = r, which increments the refcount), weakref.ref(r) doesn't — so it has no say in whether r stays alive. Calling weak() returns the live object while it exists, or None once it's actually been collected.

Use case 1: caches that don't prevent eviction

import weakref

_cache = weakref.WeakValueDictionary()

def get_resource(key):
    resource = _cache.get(key)
    if resource is None:
        resource = load_expensive_resource(key)
        _cache[key] = resource
    return resource

WeakValueDictionary holds weak references to its values — entries are automatically removed once nothing else references the value. This gives you caching ("reuse the object if something else is still using it") without the cache itself becoming the reason large objects never get freed (a plain dict-based cache would keep every entry alive forever, a common source of memory leaks in long-running processes).
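
A runnable version of the same idea, with a stand-in loader so the behaviour is visible. `Resource`, `load_count`, and the body of `load_expensive_resource` are assumptions made for the demo, and the immediate eviction shown relies on CPython's reference counting.

```python
import weakref

class Resource:
    def __init__(self, key):
        self.key = key

_cache = weakref.WeakValueDictionary()
load_count = 0

def load_expensive_resource(key):
    # Hypothetical loader; counts calls so reuse is visible.
    global load_count
    load_count += 1
    return Resource(key)

def get_resource(key):
    resource = _cache.get(key)
    if resource is None:
        resource = load_expensive_resource(key)
        _cache[key] = resource
    return resource

first = get_resource("db")
second = get_resource("db")       # served from the cache
same_object = first is second
loads_while_alive = load_count

del first, second                 # no strong references remain
entries_after_release = len(_cache)

print(same_object, loads_while_alive, entries_after_release)   # True 1 0
```

Values stored this way must themselves support weak references, which is why the demo wraps the data in a small class rather than caching a plain `str` or `int`.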

Use case 2: breaking reference cycles (parent/child back-pointers)

import weakref

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []
        self._parent_ref = None

    @property
    def parent(self):
        return self._parent_ref() if self._parent_ref else None

    def add_child(self, child):
        child._parent_ref = weakref.ref(self)
        self.children.append(child)

The child's reference to its parent doesn't keep the parent alive — only the parent's children list (a strong reference, appropriately, since a parent should keep its children alive) does. This avoids creating a cycle at all, so refcounting alone can free the tree immediately once the root goes out of scope, rather than waiting for a periodic GC cycle pass.
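
To confirm that no cyclic collection is needed, the sketch below repeats the class, disables the GC, and checks that the root disappears as soon as its last strong reference is deleted. `probe`, `root`, and `leaf` are illustrative names; the timing shown assumes CPython.

```python
import gc
import weakref

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []
        self._parent_ref = None

    @property
    def parent(self):
        return self._parent_ref() if self._parent_ref else None

    def add_child(self, child):
        child._parent_ref = weakref.ref(self)
        self.children.append(child)

gc.disable()                      # show the cyclic collector is not involved
root = TreeNode("root")
leaf = TreeNode("leaf")
root.add_child(leaf)

parent_value = leaf.parent.value  # "root" while the parent is alive
probe = weakref.ref(root)
del root                          # the only strong reference to the root
root_freed = probe() is None
orphan_parent = leaf.parent       # the weak back-pointer now resolves to None
gc.enable()

print(parent_value, root_freed, orphan_parent)   # root True None
```

The trade-off is visible in the last line: code that follows a weak back-pointer has to handle `None`, because the parent may already be gone.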

Use case 3: observer patterns

import weakref

class EventBus:
    def __init__(self):
        self._listeners = weakref.WeakSet()

    def subscribe(self, listener):
        self._listeners.add(listener)

Listeners registered with the bus don't have their lifetime extended just because they subscribed — if a listener object is otherwise no longer needed, it can still be garbage collected, and the WeakSet automatically drops the now-dead reference rather than leaking it indefinitely.
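
A fuller sketch of this bus, with a hypothetical `publish` method and a `Logger` listener class added for illustration. It assumes listeners are objects exposing an `on_event` method, and the immediate drop relies on CPython's reference counting.

```python
import weakref

class EventBus:
    def __init__(self):
        self._listeners = weakref.WeakSet()

    def subscribe(self, listener):
        self._listeners.add(listener)

    def publish(self, event):
        # Hypothetical method: copy to a list so the set cannot
        # change size while we iterate.
        for listener in list(self._listeners):
            listener.on_event(event)

class Logger:
    def __init__(self):
        self.seen = []

    def on_event(self, event):
        self.seen.append(event)

bus = EventBus()
keep = Logger()
drop = Logger()
bus.subscribe(keep)
bus.subscribe(drop)

del drop                       # no strong references left for this listener
bus.publish("started")

print(keep.seen)               # ['started']
print(len(bus._listeners))     # 1
```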

The limitation: not every object supports weak references

weakref.ref(42)   # TypeError: cannot create weak reference to 'int' object

Many built-in types don't support weak references: instances of int, str, and tuple cannot be weakly referenced, and neither can plain list and dict instances (although subclasses of list and dict can). Instances of custom classes support weak references by default, unless the class defines __slots__ without including __weakref__. The weak containers (WeakValueDictionary, WeakKeyDictionary, WeakSet) have the same requirement for the objects they hold weakly.
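
The `__slots__` case is easy to check. In this sketch, `supports_weakref` is a hypothetical helper that simply tries to create a weak reference and reports whether it worked.

```python
import weakref

class Plain:
    pass

class Slotted:
    __slots__ = ("value",)                 # no __weakref__ slot

class SlottedWeak:
    __slots__ = ("value", "__weakref__")   # opts back in

def supports_weakref(obj):
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True

print(supports_weakref(Plain()))        # True
print(supports_weakref(Slotted()))      # False
print(supports_weakref(SlottedWeak()))  # True
print(supports_weakref(42))             # False
```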

Interview-ready summary: weakref creates references that don't extend an object's lifetime, used for caches that shouldn't prevent eviction (WeakValueDictionary) and for breaking reference cycles in back-pointer relationships (parent/child, observer patterns) — letting reference counting reclaim memory immediately instead of relying on the periodic cyclic garbage collector.

Related Resources

weakref — Python docs

`cProfile`: whole-program, function-level profiling

import cProfile
import pstats

def slow_function():
    return sum(i * i for i in range(10**6))

cProfile.run("slow_function()", "output.prof")

stats = pstats.Stats("output.prof")
stats.sort_stats("cumulative").print_stats(10)   # top 10 by cumulative time

python -m cProfile -s cumulative my_script.py   # profile a whole script from the CLI

cProfile instruments every function call, reporting ncalls (call count), tottime (time in the function itself, excluding sub-calls), and cumtime (time including everything it called) — the standard first step for "where is my program actually spending time," which is often surprising compared to where you assumed the bottleneck was.
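
When only part of a program is of interest, the profiler can be switched on and off around it rather than wrapping a whole run. A minimal sketch, writing the report to an in-memory buffer (`buffer` and `report` are illustrative names):

```python
import cProfile
import io
import pstats

def slow_function():
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.strip_dirs().sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The printed table has the `ncalls`, `tottime`, and `cumtime` columns described above, restricted to the code that ran between `enable()` and `disable()`.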

`timeit`: precise micro-benchmarks

import timeit

timeit.timeit("[x**2 for x in range(1000)]", number=10000)
timeit.timeit("list(map(lambda x: x**2, range(1000)))", number=10000)

python -m timeit -s "data = list(range(1000))" "sorted(data)"

timeit runs a snippet many times in a controlled environment (disabling the garbage collector during timing by default, to avoid GC pauses skewing results), giving a reliable comparison between two small alternative implementations — the right tool for "is approach A or B faster," as opposed to cProfile's "where does the whole program's time go."
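
For an A/B comparison it usually helps to repeat the measurement and take the minimum, since the slower runs mostly reflect interference from the rest of the system. A sketch using `timeit.repeat` with callables; the function names are illustrative and the absolute numbers will differ by machine.

```python
import timeit

def squares_comprehension():
    return [x ** 2 for x in range(1000)]

def squares_map():
    return list(map(lambda x: x ** 2, range(1000)))

# repeat() returns one total per run; the minimum is the least noisy figure.
comp_times = timeit.repeat(squares_comprehension, number=200, repeat=5)
map_times = timeit.repeat(squares_map, number=200, repeat=5)

print(f"comprehension: {min(comp_times):.4f}s per 200 calls")
print(f"map + lambda:  {min(map_times):.4f}s per 200 calls")
```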

Memory profiling: `tracemalloc` (built-in)

import tracemalloc

tracemalloc.start()
run_program()
snapshot = tracemalloc.take_snapshot()

for stat in snapshot.statistics("lineno")[:10]:
    print(stat)   # top 10 lines by memory allocated, with file/line info

tracemalloc (built into the standard library since 3.4) tracks memory allocations by source location, letting you find exactly which lines are responsible for the most memory use — invaluable for tracking down unexpected memory growth or leaks in long-running processes.
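
A self-contained variant that also reports current and peak traced memory. `build_table` is a hypothetical workload standing in for real code; the byte counts depend on the platform and Python version.

```python
import tracemalloc

def build_table():
    # Hypothetical workload: allocates a noticeable amount of memory.
    return [str(n) * 10 for n in range(50_000)]

tracemalloc.start()
table = build_table()
current, peak = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

top = snapshot.statistics("lineno")[:3]
for stat in top:
    print(stat)
print(f"current={current} bytes, peak={peak} bytes")
```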

Line-level profiling: `line_profiler` (third-party)

# pip install line_profiler
@profile   # applied via `kernprof -l script.py`, not a normal decorator
def slow_function():
    ...

When cProfile identifies a hot function but you need to know which specific line inside it is slow, line_profiler gives per-line timing — more granular than cProfile's per-function view, at the cost of needing a separate tool and higher overhead while running.

The discipline: measure before optimizing

The universal rule across all of this: profile first, then optimize the actual bottleneck — intuitions about "the slow part" are frequently wrong (a startup-time import, an accidentally-quadratic loop, or excessive small allocations often dominate over the code a developer assumed was slow), and optimizing the wrong part wastes effort while adding complexity for no measured benefit.

Interview-ready summary: cProfile finds which functions dominate a whole program's runtime; timeit precisely compares small snippets; tracemalloc/memory_profiler locate where memory is actually allocated. The overarching principle is to profile first — intuition about bottlenecks is unreliable, and optimization effort should follow measured data, not guesses.

Related Resources

cProfile — Python docs

timeit — Python docs

Cause 1: unbounded caches and module-level collections

_cache = {}

def get_data(key):
    if key not in _cache:
        _cache[key] = expensive_computation(key)
    return _cache[key]     # _cache grows forever -- every unique key leaks memory permanently

The single most common "leak" in Python is entirely mundane: a global dict/list that accumulates entries with no eviction policy. This isn't a GC bug at all — the objects are genuinely still reachable (via _cache), so it's working exactly as designed; the fix is a bounded cache (functools.lru_cache(maxsize=...), a WeakValueDictionary, or explicit TTL/eviction logic).
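
As a sketch of the bounded alternative, `functools.lru_cache` with a small `maxsize` evicts the least recently used entry instead of growing. The body of `expensive_computation` is a placeholder.

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def expensive_computation(key):
    # Hypothetical stand-in for real work.
    return key * 2

for key in (1, 2, 3, 1):
    expensive_computation(key)

info = expensive_computation.cache_info()
print(info)   # CacheInfo(hits=0, misses=4, maxsize=2, currsize=2)
```

The final call with `1` is a miss because that entry was evicted when `3` arrived, which is the price of keeping the cache at two entries.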

Cause 2: registered callbacks/listeners never unregistered

class EventBus:
    def __init__(self):
        self.listeners = []

    def subscribe(self, callback):
        self.listeners.append(callback)   # strong reference -- keeps callback's owner alive!

bus = EventBus()

class Widget:
    def __init__(self, bus):
        bus.subscribe(self.on_event)   # bus now holds a reference to this Widget forever

    def on_event(self, event):
        ...

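Every Widget that subscribes is kept alive by the bus indefinitely, even after the code that created it no longer needs it, because a bound method (self.on_event) carries a reference to self. Fixes: explicit unsubscribe() calls at the end of an object's lifecycle, or holding listeners weakly (weakref.WeakMethod for bound methods, or a WeakSet of listener objects) so subscribing doesn't extend the subscriber's lifetime. A bound method stored directly in a WeakSet would vanish immediately, because the bound-method object itself has no other strong reference.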
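
One possible shape for a bus that stores bound methods without owning their objects uses `weakref.WeakMethod`. The `publish` method and its pruning of dead entries are assumptions added for the demo, and the immediate cleanup relies on CPython's reference counting.

```python
import weakref

class EventBus:
    def __init__(self):
        self._listeners = []

    def subscribe(self, bound_method):
        # WeakMethod references the method's owner weakly.
        self._listeners.append(weakref.WeakMethod(bound_method))

    def publish(self, event):
        live = []
        for ref in self._listeners:
            callback = ref()          # None once the owner is gone
            if callback is not None:
                callback(event)
                live.append(ref)
        self._listeners = live        # prune dead entries

class Widget:
    def __init__(self, bus):
        self.events = []
        bus.subscribe(self.on_event)

    def on_event(self, event):
        self.events.append(event)

bus = EventBus()
widget = Widget(bus)
bus.publish("first")
received = list(widget.events)

del widget                    # the bus does not keep the widget alive
bus.publish("second")
remaining = len(bus._listeners)

print(received, remaining)    # ['first'] 0
```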

Cause 3: reference cycles with C extension objects

Pure-Python reference cycles are eventually collected by the cyclic GC (see the reference-cycles question), but cycles that include objects managed partly by a C extension that doesn't fully cooperate with Python's GC protocol (rare with well-behaved extensions, but a known historical source of leaks in some older/poorly-written bindings) can sometimes never be collected — worth knowing as a "when all else fails, suspect the C extension" debugging lead.

Cause 4: closures/threads unintentionally keeping large objects alive

def process_large_dataset():
    data = load_huge_dataset()          # large object
    def callback():
        return len(data)                  # closure captures `data`, keeping it alive
    register_callback(callback)             # callback (and therefore `data`) outlives this function
    return "done"                           # caller assumes `data` is now freeable -- it isn't!

A closure captures whatever variables it references from its enclosing scope — if that closure is stored somewhere long-lived (a callback registry, a class attribute), it keeps everything it captured alive too, even data the closure barely uses.
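
A common fix is to pull out the small value the callback actually needs before defining it, so the closure never touches the large object. In this sketch `Dataset` and the `callbacks` list are hypothetical stand-ins, and the weak reference exists only so the demo can observe when the dataset is freed (immediately, on CPython).

```python
import weakref

class Dataset:
    """Hypothetical large object."""
    def __init__(self, rows):
        self.rows = rows

callbacks = []                       # stand-in for a long-lived registry

def process_large_dataset():
    data = Dataset(list(range(1000)))
    row_count = len(data.rows)       # extract the small value up front

    def callback():
        return row_count             # captures an int, not the dataset

    callbacks.append(callback)
    return weakref.ref(data)         # only so the demo can observe data

probe = process_large_dataset()
callback_result = callbacks[0]()
dataset_freed = probe() is None

print(callback_result)               # 1000
print(dataset_freed)                 # True: the dataset was freed on return
```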

Diagnosing a suspected leak

import gc, tracemalloc

tracemalloc.start()
# ... run workload, take snapshots at intervals ...
snapshot1 = tracemalloc.take_snapshot()
# ... more workload ...
snapshot2 = tracemalloc.take_snapshot()
for stat in snapshot2.compare_to(snapshot1, "lineno")[:10]:
    print(stat)   # which lines allocated the most NEW memory between snapshots

gc.collect()
len(gc.garbage)   # nonzero means the GC found uncollectable objects (rare on modern Python)

Comparing tracemalloc snapshots over time pinpoints exactly which allocation sites are growing unboundedly, which is almost always more productive than guessing.

Interview-ready summary: Most "memory leaks" in Python aren't GC failures at all — they're objects that are still genuinely reachable: unbounded caches, forgotten event-listener registrations, and closures capturing large objects longer than intended. Diagnose with tracemalloc snapshot comparisons rather than assuming the garbage collector itself is at fault.

Related Resources

gc — Garbage Collector interface

The trap: `getsizeof` doesn't recurse into contents

import sys

small_list = [1, 2, 3]
big_list = [10**100] * 3   # each element is a HUGE integer

sys.getsizeof(small_list)   # ~88 bytes -- just the list structure + 3 pointers
sys.getsizeof(big_list)      # ~88 bytes -- SAME! getsizeof doesn't look at what's IN the list

Both lists report roughly the same size from getsizeof, because it only measures the list object's own overhead (a header plus an array of pointers) — not the memory used by the objects those pointers point to. The actual memory difference between small_list and big_list (the huge integers) is completely invisible to a naive getsizeof call.

Getting the real total: recursive sizing

import sys

def total_size(obj, seen=None):
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:      # avoid double-counting shared references / infinite recursion on cycles
        return 0
    seen.add(obj_id)

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size

total_size(big_list)   # now correctly reflects the huge integers' actual size

This is roughly what third-party tools like pympler.asizeof do more robustly — a naive recursive walk must track already-visited object ids (seen) both to avoid infinite loops on cyclic structures and to avoid double-counting objects referenced from multiple places (e.g., the same string appearing as a value under several dict keys).

Why this distinction matters in practice

data = {"key": some_shared_large_object}
data2 = {"other_key": some_shared_large_object}   # same object, referenced twice

sys.getsizeof(data) + sys.getsizeof(data2)   # shallow: never counts some_shared_large_object at all
total_size(data) + total_size(data2)         # deep, but counts some_shared_large_object twice

Summing shallow getsizeof results across containers misses the shared object entirely, while summing deep sizes computed separately for each container counts it once per container. The same underlying string/list/object referenced from two places isn't actually taking up memory twice, so an accurate total needs one visited set shared across every container being measured.
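
One way to avoid counting a shared object twice when sizing several containers is to pass a single visited set through every deep-size call. The sketch below uses a reduced `total_size` that requires the `seen` set explicitly; `shared`, `separate`, and `combined` are illustrative names.

```python
import sys

def total_size(obj, seen):
    """Deep size of obj, skipping anything already recorded in seen."""
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size

shared = list(range(10_000))         # illustrative shared payload
data = {"key": shared}
data2 = {"other_key": shared}

# Separate visited sets: the payload is counted once per container.
separate = total_size(data, set()) + total_size(data2, set())

# One visited set for both calls: the payload is counted once overall.
seen = set()
combined = total_size(data, seen) + total_size(data2, seen)

print(separate > combined)                             # True
print(separate - combined >= sys.getsizeof(shared))    # True
```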

Practical guidance

For a real memory audit, prefer tracemalloc (tracks actual allocations by source location, correctly reflecting real memory pressure) or a purpose-built tool (pympler, objgraph) over hand-rolled sys.getsizeof recursion — the latter is useful for quick, one-off "how big is this specific object's own overhead" checks, not accurate whole-structure memory accounting.

Interview-ready summary: sys.getsizeof measures only an object's own shallow overhead, not what it references — a container full of huge objects can report the same size as one full of tiny objects. Getting a true total requires recursively walking references while tracking already-visited objects to avoid double-counting shared data, or better, using a dedicated memory-profiling tool.

Related Resources

sys.getsizeof — Python docs

1. Fix algorithmic complexity before anything else

# O(n^2) -- checking membership in a list, inside a loop over another list
duplicates = [x for x in list_a if x in list_b]        # list_b scanned linearly, every time

# O(n) -- convert to a set once, then O(1) average membership checks
set_b = set(list_b)
duplicates = [x for x in list_a if x in set_b]

No amount of micro-optimizing the O(n²) version beats simply removing the quadratic behavior — this single change (converting a repeatedly-scanned list into a set) is often the highest-leverage optimization available in real code.

2. Prefer built-ins and comprehensions over manual Python loops

# Slower -- Python-level loop overhead on every iteration
total = 0
for x in numbers:
    total += x * x

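# Often faster -- sum() does the accumulation in C
total = sum(x * x for x in numbers)

Built-in functions (sum, map, sorted, any, all) run their loops in C. Comprehensions and generator expressions still execute Python bytecode for each element, but they avoid repeated name lookups and method calls such as list.append, so they usually beat an equivalent hand-written loop. The gain varies by workload and Python version, so confirm it with timeit.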

3. Cache expensive, pure computations

from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_pure_function(n):
    ...

Memoization turns repeated calls with the same arguments from full recomputation into O(1) lookups — a large win whenever a pure function is called repeatedly with overlapping arguments (see the lru_cache question for caveats).
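
The effect is easiest to see on a recursive function with heavily overlapping subproblems. In this sketch the `calls` counter is added only to make the saving visible; without the cache, the same call tree would involve over a million invocations.

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=128)
def fib(n):
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(30)
print(result)   # 832040
print(calls)    # 31: each n from 0 to 30 is computed once
```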

4. Push numeric/hot loops into vectorized or compiled code

# Pure Python loop -- slow for large arrays
result = [x * 2 + 1 for x in large_list]

# NumPy -- the actual loop runs in C, operating on contiguous memory
import numpy as np
arr = np.array(large_list)
result = arr * 2 + 1

For numeric workloads, NumPy's vectorized operations (and, for more custom logic, Cython or a Rust extension via PyO3) can be one to two orders of magnitude faster than an equivalent pure-Python loop, since they avoid both the per-element interpreter overhead and (for NumPy) operate on tightly-packed memory instead of a list of individually boxed Python objects.

5. Avoid unnecessary object creation in hot paths

# Builds an intermediate list just to throw it away
if [x for x in items if x.active]:   # wasteful -- builds the whole list just to check truthiness
    ...

# any() short-circuits on the first match -- no full list built
if any(x.active for x in items):
    ...

Small allocation-avoidance changes (generator expressions over list comprehensions when the full list isn't needed, str.join() over repeated += string concatenation in a loop) add up in hot code paths.
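
The string-building case mentioned above looks like this in practice. Both helper functions are illustrative; they produce the same result, and the difference is in how many intermediate strings may be created along the way.

```python
def build_with_concat(parts):
    text = ""
    for part in parts:
        text += part          # may copy the accumulated string each time
    return text

def build_with_join(parts):
    return "".join(parts)     # sizes the result once, then copies each part

parts = [str(n) for n in range(1000)]
same_result = build_with_concat(parts) == build_with_join(parts)
print(same_result)            # True
```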

6. Always profile before and after

Every technique above should be validated against a cProfile/timeit measurement of the actual bottleneck — optimizing code that isn't on the hot path adds complexity and risk for no measurable benefit.

Interview-ready summary: Algorithmic complexity dominates all other optimizations, so fix O(n²) patterns first. After that, prefer built-ins and vectorized operations over manual Python loops, cache pure expensive computations, and avoid unnecessary intermediate allocations, but always let profiling data, not intuition, decide where to spend optimization effort.

Related Resources

Performance Tips — Python wiki

The small-integer cache in action

a = 100
b = 100
a is b   # True -- both point to the SAME cached int object (100 is within -5..256)

x = 1000
y = 1000
x is y    # often False at the REPL, but can be True in a script where the compiler merges equal constants; no guarantee either way

CPython pre-creates integer objects for -5 through 256 once, at interpreter startup, since these small values are used constantly throughout any program (loop counters, small offsets, boolean-like flags) — reusing one cached object instead of allocating a fresh one for every occurrence is a straightforward, safe optimization because integers are immutable, so sharing the same object across unrelated code has no observable effect other than saving allocations.

Why this is invisible and safe for `==` but a trap for `is`

def is_positive(n):
    return n > 0

is_positive(100) == is_positive(100)   # True -- correct, always
100 is 100                               # True, but ONLY because of the small-int cache

n = 1000
n is 1000    # unreliable! don't write code that depends on this

Since == compares values (correct regardless of caching), the cache is completely invisible to correct code. It only becomes a trap when someone mistakenly uses is for value comparison and it happens to "work" during testing (because test values were small) but silently breaks in production once real values exceed 256 — this is exactly why modern CPython emits a SyntaxWarning for is used with integer/string literals.
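
A compact contrast between the two operators: values built at runtime compare equal with `==` regardless of identity, while `is` is the right tool for a sentinel. `lookup` and `_MISSING` are hypothetical names used for illustration.

```python
# int("...") builds the values at runtime, so the compiler cannot
# merge them into one shared constant.
a = int("1000")
b = int("1000")

print(a == b)    # True: same value
print(a is b)    # typically False on CPython: two separate objects

_MISSING = object()          # sentinel: identity is the whole point

def lookup(mapping, key, default=_MISSING):
    value = mapping.get(key, _MISSING)
    if value is _MISSING:    # legitimate use of `is`
        if default is _MISSING:
            raise KeyError(key)
        return default
    return value

print(lookup({"x": None}, "x"))       # None (a stored None is not "missing")
print(lookup({}, "y", default=0))     # 0
```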

String interning follows a similar but separate policy

s1 = "hello"
s2 = "hello"
s1 is s2   # True -- identifier-like string literals are commonly interned

s3 = "".join(["hel", "lo"])
s3 is s1     # often False -- built at runtime, not necessarily interned

String interning (covered in more depth in the Collections topic) is a related but distinct CPython optimization with its own rules about which strings get cached. Both mechanisms exist purely to reduce memory and allocation overhead for extremely common immutable values, and neither is part of the language specification.
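
When identical string objects are genuinely wanted (for example, to deduplicate many repeated keys), interning can be requested explicitly with `sys.intern` rather than relied on implicitly. A minimal sketch:

```python
import sys

s1 = "hello"
s3 = "".join(["hel", "lo"])          # built at runtime

values_equal = s1 == s3
print(values_equal)                  # True: equal values

# sys.intern() returns the canonical interned copy of a string, so two
# equal strings map to the same object once both are interned.
i1 = sys.intern(s1)
i3 = sys.intern(s3)
same_object = i1 is i3
print(same_object)                   # True
```

Even then, `==` remains the correct way to compare string values; explicit interning is a memory optimization, not a substitute for equality.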

The takeaway for interview answers

The important part isn't memorizing the exact cache range — it's recognizing that this is a CPython implementation detail that other implementations (PyPy, etc.) and even future CPython versions are free to change, so relying on is for numeric or string value comparison is a latent bug, not a valid optimization. Always use == for value comparisons; reserve is for identity checks that are genuinely about identity (None, sentinels, singleton checks).

Interview-ready summary: CPython caches small integers (-5 to 256) and many identifier-like string literals as an allocation-saving optimization, which makes is comparisons on them appear to work — but this is an implementation detail with no language guarantee, and relying on it for value equality (instead of ==) is a bug that will eventually surface once values fall outside the cached range.

Related Resources

Data model — Python docs
