What are the most useful `itertools` functions, and when would you use them?
Quick Answer
`itertools` provides fast, memory-efficient building blocks for iterator composition: `chain` (concatenate iterables), `groupby` (group consecutive equal keys), `islice` (slice an iterator lazily), `product`/`permutations`/`combinations` (combinatorics), `count`/`cycle`/`repeat` (infinite iterators), and `tee` (split one iterator into several independent ones). They compose well with generators to build data pipelines without materializing intermediate lists.
Detailed Answer
The everyday workhorses
from itertools import chain, islice, groupby, product, count
# chain -- concatenate multiple iterables lazily, no copying
list(chain([1, 2], [3, 4], [5])) # [1, 2, 3, 4, 5]
# islice -- lazily slice an iterator (regular slicing doesn't work on iterators!)
list(islice(range(100), 5, 10)) # [5, 6, 7, 8, 9]
with open("huge.log") as f: # the context manager closes the file when done
    first_3 = list(islice(f, 3)) # first 3 lines, without reading the whole file
# groupby -- group CONSECUTIVE elements sharing a key (input must be pre-sorted/grouped!)
data = [("a", 1), ("a", 2), ("b", 3), ("b", 4), ("a", 5)]
for key, group in groupby(data, key=lambda x: x[0]):
    print(key, list(group))
# a [('a', 1), ('a', 2)]
# b [('b', 3), ('b', 4)]
# a [('a', 5)] <- note: a separate group, since input wasn't fully sorted by key
The groupby gotcha is one of the most common itertools mistakes:
it only groups consecutive matching elements. If you want all occurrences
of each key collected into a single group, sort the input first with
sorted(data, key=...), using the same key function you pass to groupby.
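As a minimal sketch of the sort-then-group pattern, the snippet below reuses the `data` list from the example above. The names `by_letter` and `grouped` are illustrative only and do not come from the original answer.

```python
from itertools import groupby
from operator import itemgetter

data = [("a", 1), ("a", 2), ("b", 3), ("b", 4), ("a", 5)]

# Use the SAME key function for sorting and grouping so equal keys end up adjacent.
by_letter = itemgetter(0)  # illustrative name

grouped = {
    key: [value for _, value in group]
    for key, group in groupby(sorted(data, key=by_letter), key=by_letter)
}
print(grouped)  # {'a': [1, 2, 5], 'b': [3, 4]}
```

Because `sorted` is stable, items keep their original relative order within each group. Sorting does materialize the whole input, so for very large streams a plain dictionary of lists may be the better fit.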
Infinite iterators (paired with islice/takewhile to bound them)
from itertools import count, cycle, repeat, islice
list(islice(count(10, 2), 5)) # [10, 12, 14, 16, 18] -- count from 10, step 2
list(islice(cycle("AB"), 5)) # ['A', 'B', 'A', 'B', 'A'] -- repeats forever
list(repeat("x", 3)) # ['x', 'x', 'x']
count and cycle never terminate on their own, and neither does repeat when
called without a times argument. Always bound them with islice, takewhile,
zip against a finite iterable, or an explicit break.
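The heading mentions `takewhile`, so here is a small sketch of two other ways to bound an infinite iterator besides `islice`. The variable names and sample values are made up for illustration.

```python
from itertools import count, cycle, takewhile

# takewhile stops at the first element that fails the predicate.
squares_under_50 = list(takewhile(lambda n: n < 50, (i * i for i in count(1))))
print(squares_under_50)  # [1, 4, 9, 16, 25, 36, 49]

# zip stops at its shortest input, so the finite list bounds the cycle.
labeled = list(zip(["x", "y", "z"], cycle([0, 1])))
print(labeled)  # [('x', 0), ('y', 1), ('z', 0)]
```

Note that `takewhile` only helps when the predicate eventually fails; on an unordered infinite stream it may never stop.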
Combinatorics
from itertools import product, permutations, combinations
list(product([1, 2], ["a", "b"])) # [(1,'a'), (1,'b'), (2,'a'), (2,'b')] -- cartesian product
list(permutations([1, 2, 3], 2)) # [(1,2),(1,3),(2,1),(2,3),(3,1),(3,2)]
list(combinations([1, 2, 3], 2)) # [(1,2),(1,3),(2,3)] -- order doesn't matter
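These replace hand-written nested loops for generating all pairings,
orderings, or fixed-size subsets. They are more concise and, because they
are implemented in C in CPython, usually faster than the equivalent nested
for loops. All three are lazy, but the number of results grows very quickly
with input size, so avoid wrapping large inputs in list().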
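To make the equivalence with nested loops concrete, the sketch below checks `product` and `combinations` against hand-written loops. The sample lists (`sizes`, `colors`, `items`) are invented for the illustration.

```python
from itertools import combinations, product

sizes = ["S", "M"]
colors = ["red", "blue"]

# Hand-written nested loop: every (size, color) pairing.
nested = []
for size in sizes:
    for color in colors:
        nested.append((size, color))

# product yields the same tuples in the same order.
assert list(product(sizes, colors)) == nested

# combinations(items, 2) matches the classic "j starts after i" double loop.
items = [1, 2, 3]
pairs_loop = [
    (items[i], items[j])
    for i in range(len(items))
    for j in range(i + 1, len(items))
]
assert list(combinations(items, 2)) == pairs_loop
```

The rightmost iterable passed to `product` varies fastest, which is why its output order matches the inner loop.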
tee: splitting one iterator into several
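from itertools import tee
some_generator = (n * n for n in range(4)) # stand-in for any one-shot iterator
a, b = tee(some_generator, 2) # don't use some_generator directly after this
list(a) # [0, 1, 4, 9] -- consumes the shared underlying iterator
list(b) # [0, 1, 4, 9] -- still works: tee buffered what a already consumed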
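Useful when two different consumers each need to walk the same iterator
independently. Note that tee buffers every item one branch has consumed and
the other has not, so it trades memory for that independence. If one branch
runs far ahead of the other (as in the snippet above, where a is exhausted
before b starts), the whole stream ends up buffered and list() is the simpler
and faster choice. Also avoid advancing the original iterator after calling
tee, or the branches will silently miss items.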
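A case where `tee` fits well is when the branches advance in lockstep, so the internal buffer stays tiny. The sketch below pairs each item with its successor. The helper name `adjacent_pairs` and the sample readings are hypothetical, and on Python 3.10+ the built-in `itertools.pairwise` does the same job.

```python
from itertools import tee

def adjacent_pairs(iterable):
    """Yield (previous, current) pairs; hypothetical helper for illustration."""
    a, b = tee(iterable, 2)
    next(b, None)  # advance the second branch by one item
    return zip(a, b)  # the branches stay one item apart, so the buffer holds one item

readings = (x for x in [3, 5, 4, 8])  # a one-shot generator
deltas = [curr - prev for prev, curr in adjacent_pairs(readings)]
print(deltas)  # [2, -1, 4]
```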
Interview-ready summary: itertools provides composable, lazy
building blocks — chain/islice for combining/slicing, groupby for
grouping consecutive keys (remember to sort first), product/
permutations/combinations for combinatorics, and count/cycle for
infinite sequences bounded by islice. They let you build multi-step
data pipelines without materializing intermediate lists at each stage.
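As a rough sketch of such a pipeline, the snippet below chains two sources, parses them with a generator, filters, and takes only the first two results. The `parse_ints` function and the sample batches are assumptions made up for this example.

```python
from itertools import chain, islice

def parse_ints(lines):
    """Yield an int for each non-blank line; hypothetical helper for illustration."""
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)

batch_1 = ["1", "2", "", "3"]
batch_2 = ["4", " 5 ", "6"]

# Each stage is lazy: nothing is parsed until islice asks for an item.
evens = (n for n in parse_ints(chain(batch_1, batch_2)) if n % 2 == 0)
first_two_evens = list(islice(evens, 2))
print(first_two_evens)  # [2, 4]
```

Only as many lines as needed are parsed: once the second even number is found, the remaining items in `batch_2` are never touched.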