What's the difference between a generator expression and a list comprehension memory-wise?

5 min · intermediate · generators, comprehensions, memory

Quick Answer

A list comprehension (`[x for x in y]`) builds the **entire list in memory immediately**. A generator expression (`(x for x in y)`) builds nothing upfront: it is a lazy iterator that computes each value on demand, so the generator itself needs only O(1) memory no matter how many elements it will eventually produce (provided the iterable it reads from, such as `range`, is lazy too). Use a generator expression when you'll iterate only once and don't need random access or the full sequence in memory at once.

Detailed Answer

Syntax difference: brackets vs parens

import sys

squares_list = [x * x for x in range(10**8)]    # builds a 10^8-element list NOW
squares_gen = (x * x for x in range(10**8))      # builds nothing yet -- lazy

sys.getsizeof(squares_list)   # ~800,000,000+ bytes for the list's pointer array alone;
                              # the int objects it points to take a few GB more
sys.getsizeof(squares_gen)    # ~200 bytes (varies by Python version) -- constant,
                              # regardless of range size

The list comprehension allocates memory for every element immediately; the generator expression is a small object that knows how to produce values, computing each one only when asked.
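To see the gap without allocating a 10^8-element list, a smaller measurement with `tracemalloc` works as a rough sketch. The size `n = 100_000` and the helper name `peak_bytes` are illustrative assumptions, and the exact byte counts vary by Python version and platform.

```python
import sys
import tracemalloc


def peak_bytes(build):
    """Return the peak number of bytes allocated while calling build()."""
    tracemalloc.start()
    try:
        result = build()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del result
    return peak


n = 100_000  # illustrative size, small enough to run anywhere

list_peak = peak_bytes(lambda: [x * x for x in range(n)])
gen_peak = peak_bytes(lambda: (x * x for x in range(n)))  # created, never consumed

print(f"list comprehension peak:   {list_peak:,} bytes")
print(f"generator expression peak: {gen_peak:,} bytes")

# The generator object itself is the same size whatever the range length.
small_gen = (x * x for x in range(10))
large_gen = (x * x for x in range(10**8))
print(sys.getsizeof(small_gen) == sys.getsizeof(large_gen))
```

Unlike `sys.getsizeof`, which reports only the container, `tracemalloc` also counts the integer objects the list holds, so the list figure reflects the full cost of materializing the sequence.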

When you can (and can't) use a generator expression

total = sum(x * x for x in range(10**8))    # fine -- sum() pulls one value at a time

# Each line below is meant to be tried on its own
gen = (x * x for x in range(10**8))
gen[5]        # TypeError -- generators don't support indexing
len(gen)      # TypeError -- no len() either, since size isn't known upfront
list(gen)     # works, but now you've paid the full memory cost anyway
for x in gen: ...  # body never runs -- list(gen) above already exhausted the generator

Generators trade away random access, len(), and re-iterability for constant memory — appropriate when you consume the sequence exactly once, in order, and don't need to know its length in advance.
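The single-pass behavior is easy to verify at a small scale. The following sketch uses an illustrative five-element range, and the variable names are arbitrary.

```python
gen = (x * x for x in range(5))

first_pass = list(gen)   # consumes every value
second_pass = list(gen)  # the generator is already exhausted

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# Indexing and len() both raise TypeError on a generator.
fresh = (x * x for x in range(5))
errors = []
for operation in (lambda g: g[2], lambda g: len(g)):
    try:
        operation(fresh)
    except TypeError as exc:
        errors.append(type(exc).__name__)
print(errors)  # ['TypeError', 'TypeError']

# If the values are needed again, recreate the generator or store a list.
squares = [x * x for x in range(5)]
print(squares[2], len(squares))  # 4 5
```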

The practical rule of thumb

  • Feeding directly into something that consumes one item at a time (sum(), max(), any(), a for loop): use a generator expression, since there is no reason to materialize the whole list first. One exception is "".join(): CPython converts its argument to a list internally, so a generator expression saves no memory there.
  • Need to iterate multiple times, index into it, call len(), or keep the whole thing around: use a list comprehension.

# Generator expression -- no unnecessary intermediate list
total = sum(price * qty for price, qty in cart)

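# List comprehension -- the result is measured, summed, and sliced, so keep it
squared = [score * score for score in scores]
mean_square = sum(squared) / len(squared)
top_3 = sorted(squared)[-3:]   # sorted() accepts any iterable and returns a new list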
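Laziness can also save work, not only memory, when the consumer stops early. A small sketch with `any()` counts how many elements each form actually evaluates; the predicate `is_negative` and the sample `readings` list are made up for illustration.

```python
evaluated = []  # records every value the predicate is asked about


def is_negative(value):
    """Hypothetical predicate that logs each value it checks."""
    evaluated.append(value)
    return value < 0


readings = [3, -1, 4, 1, 5, 9, 2, 6]  # made-up sample data

found_eager = any([is_negative(r) for r in readings])  # whole list built first
eager_calls = len(evaluated)
evaluated.clear()

found_lazy = any(is_negative(r) for r in readings)  # stops at the first True
lazy_calls = len(evaluated)

print(found_eager, eager_calls)  # True 8
print(found_lazy, lazy_calls)    # True 2
```

Both forms return the same answer, but the list comprehension runs the predicate on every element before `any()` sees anything, while the generator expression stops as soon as a match is found.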

A sole-argument generator expression needs no extra parentheses

sum(x * x for x in range(10))   # no extra parens needed -- single-argument call

When a generator expression is the sole argument to a function call, the call's own parentheses double as the generator expression's parentheses — you don't need sum((x * x for x in range(10))).
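The shortcut applies only when the generator expression is the only argument; once a second argument appears, the expression needs its own parentheses. A quick check of both cases follows, using `compile()` so the invalid form can be shown without stopping the script. The start value `100` is chosen arbitrarily.

```python
# Sole argument: the call's parentheses are enough.
print(sum(x * x for x in range(10)))  # 285

# With a second argument, the generator expression needs its own parentheses.
print(sum((x * x for x in range(10)), 100))  # 385

# Omitting them is rejected at compile time.
source = "sum(x * x for x in range(10), 100)"
try:
    compile(source, "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)  # True
```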

Interview-ready summary: List comprehensions build the full result eagerly, in memory; generator expressions produce values lazily, in O(1) memory, at the cost of single-pass, no-random-access consumption. Default to a generator expression whenever the result is consumed once, in sequence, by something else — reach for the list only when you need to keep, index, or re-iterate the full result.