What is the N+1 query problem, and how do you fix it?

Detailed Answer

The bug, illustrated

orders = db.query("SELECT * FROM orders WHERE customer_id = 42")  # 1 query, returns 20 orders

for order in orders:
    # This line, inside the loop, fires a SEPARATE query for EVERY order
    items = db.query(f"SELECT * FROM order_items WHERE order_id = {order.id}")

With 20 orders, this executes 1 + 20 = 21 queries total, when the same data could have been fetched in 2 queries (or even 1, with a join). The problem scales linearly with N — 1,000 orders means 1,001 queries — and each query pays its own network round-trip latency, which dominates the total time far more than the actual work the database does per query.

Why ORMs make this especially easy to write by accident

Most ORMs default to lazy loading for related entities — accessing order.items triggers a fresh query at the moment it's accessed, which is convenient and often invisible in code review (it just looks like a normal property access), but silently produces exactly this pattern the instant it happens inside a loop over a collection.

Fix 1: eager loading — fetch related data in one extra batched query

order_ids = [o.id for o in orders]
items = db.query(f"SELECT * FROM order_items WHERE order_id IN ({','.join(order_ids)})")
# group items by order_id in application code -- now just 2 total queries, regardless of N

Most ORMs provide a built-in mechanism for this: .include()/.Include() (Entity Framework), select_related/prefetch_related (Django), JOIN FETCH (JPA/Hibernate) — explicitly telling the ORM to fetch the related data as part of (or immediately following) the initial query, rather than lazily per-access.

Fix 2: a single JOIN query

SELECT o.id AS order_id, oi.product_id, oi.quantity
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
WHERE o.customer_id = 42;

One query total — the application groups the flattened join results back into a nested order/items structure. This avoids the extra round-trip of Fix 1's second query entirely, at the cost of potential row duplication if the "one" side of the join has data you'd otherwise only want once per order (see the join explosion question) — worth watching for if the query also aggregates.

How to catch this in practice

Query logging/APM tools (many ORMs and database drivers support logging every executed query) make N+1 patterns visible as a suspicious burst of near-identical queries differing only by an ID.
Load testing with realistic data volumes — N+1 is often invisible in development with a handful of test rows, and only becomes an obvious, painful problem once N is large in production.
Some ORMs (Django, for instance) have built-in tooling/warnings specifically designed to flag likely N+1 patterns during development.

This is one of the most common real-world performance bugs in application code backed by a database, and recognizing it (plus knowing the eager-loading fix by name in whatever ORM/stack is relevant) is a strong, practical signal of hands-on experience.