What's the difference between a generator expression and a list comprehension memory-wise?
Quick Answer
A list comprehension (`[x for x in y]`) builds the **entire list in memory immediately**. A generator expression (`(x for x in y)`) builds nothing upfront — it's a lazy iterator that computes each value on demand, using O(1) memory regardless of how many elements it will eventually produce. Use a generator expression when you'll only iterate once and don't need random access or the full sequence in memory at once.
Detailed Answer
Syntax difference: brackets vs parens
squares_list = [x * x for x in range(10**8)] # builds a 10^8-element list NOW
squares_gen = (x * x for x in range(10**8)) # builds nothing yet -- lazy
import sys
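sys.getsizeof(squares_list) # ~800,000,000+ bytes for the list's pointer array alone -- the int objects it points to cost extra
sys.getsizeof(squares_gen) # roughly 100-200 bytes depending on Python version -- constant, regardless of range size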
The list comprehension allocates memory for every element immediately; the generator expression is a small object that knows how to produce values, computing each one only when asked.
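Because `sys.getsizeof` reports only the container itself, peak allocation is a fairer comparison. The following is a minimal sketch using the standard-library `tracemalloc` module; the helper `peak_bytes` and the size `N` are illustrative names chosen here, and exact byte counts will vary by Python version and platform.

```python
import tracemalloc

N = 100_000  # hypothetical size, kept small so the demo runs quickly


def peak_bytes(build):
    """Return the peak traced allocation (in bytes) while running build()."""
    tracemalloc.start()
    build()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


# Both callables compute the same sum; only the intermediate storage differs.
peak_list = peak_bytes(lambda: sum([x * x for x in range(N)]))
peak_gen = peak_bytes(lambda: sum(x * x for x in range(N)))

print(f"list comprehension peak:   {peak_list:,} bytes")
print(f"generator expression peak: {peak_gen:,} bytes")
```

The list version's peak should grow with `N`, while the generator version's peak should stay roughly flat as `N` increases.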
When you can (and can't) use a generator expression
total = sum(x * x for x in range(10**8)) # fine -- sum() consumes lazily, one at a time
gen = (x * x for x in range(10**8))
gen[5] # TypeError -- generators don't support indexing
len(gen) # TypeError -- no len() either, since size isn't known upfront
list(gen) # works, but now you've paid the full memory cost anyway
for x in gen: ... # yields NOTHING -- list(gen) above already exhausted the generator
Generators trade away random access, len(), and re-iterability for
constant memory — appropriate when you consume the sequence exactly once,
in order, and don't need to know its length in advance.
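If you occasionally need "the n-th item" from a generator, `itertools.islice` can skip ahead without building a list, though it still consumes everything it passes over. A small sketch of that behaviour and of exhaustion, with variable names chosen here for illustration:

```python
from itertools import islice

gen = (x * x for x in range(10))

# islice skips ahead without building a list; it consumes items 0..5 on the way.
sixth = next(islice(gen, 5, None))

# The generator resumes where it left off, so only the later items remain.
remaining = list(gen)

# A second pass finds nothing: the generator is now exhausted.
second_pass = list(gen)

print(sixth)        # 25
print(remaining)    # [36, 49, 64, 81]
print(second_pass)  # []
```

This is sequential access dressed up as indexing: the skipped values are gone, which is why a list is still the right choice when you need to revisit elements.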
The practical rule of thumb
- Feeding directly into another function that consumes one item at a time (`sum()`, `max()`, a `for` loop): use a generator expression. There is no reason to materialize the whole list first. `"".join()` accepts one too, although CPython builds a list internally there, so the gain is brevity rather than memory.
- Need to iterate multiple times, index into it, call `len()`, or keep the whole thing around: use a list comprehension.
# Generator expression -- no unnecessary intermediate list
total = sum(price * qty for price, qty in cart)
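# List comprehension -- the result is measured and traversed more than once
passing = [score for score in scores if score >= 50]
pass_rate = len(passing) / len(scores) # needs len()
best = max(passing) # second pass over the same data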
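To make the two cases above concrete, here is a runnable sketch with made-up data. The contents of `cart` and `scores`, the prices in cents, and the pass mark of 50 are all assumptions for illustration.

```python
# Hypothetical sample data: (price_in_cents, quantity) pairs and exam scores.
cart = [(250, 4), (1000, 1), (99, 10)]
scores = [72, 45, 88, 91, 60]

# Consumed once, in order, by sum(): a generator expression is enough.
total_cents = sum(price * qty for price, qty in cart)

# Measured, indexed, and traversed again: materialize a list.
passing = [s for s in scores if s >= 50]
count = len(passing)   # len() needs a real sequence
first = passing[0]     # so does indexing
best = max(passing)    # and this is a second pass over the same data

print(total_cents, count, first, best)  # 2990 4 72 91
```

Swapping the list comprehension for a generator expression here would make `len(passing)` and `passing[0]` raise `TypeError`.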
No extra parentheses needed when a generator expression is the sole argument
sum(x * x for x in range(10)) # no extra parens needed -- single-argument call
When a generator expression is the sole argument to a function call, the
call's own parentheses double as the generator expression's parentheses —
you don't need sum((x * x for x in range(10))).
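The shortcut applies only when the generator expression is the sole argument. Once the call takes a second argument, the generator expression needs its own parentheses. A short sketch (the start value `100` is arbitrary):

```python
# Sole argument: the call's parentheses are enough.
squares_total = sum(x * x for x in range(10))

# Second argument present (a start value): the inner parentheses are required.
offset_total = sum((x * x for x in range(10)), 100)

# Without the inner parentheses, Python rejects the source at compile time.
try:
    compile("sum(x * x for x in range(10), 100)", "<demo>", "eval")
    rejected = False
except SyntaxError:
    rejected = True

print(squares_total, offset_total, rejected)  # 285 385 True
```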
Interview-ready summary: List comprehensions build the full result eagerly, in memory; generator expressions produce values lazily, in O(1) memory, at the cost of single-pass, no-random-access consumption. Default to a generator expression whenever the result is consumed once, in sequence, by something else — reach for the list only when you need to keep, index, or re-iterate the full result.