What is the N+1 query problem, and how do you fix it?
Quick Answer
The N+1 problem happens when code fetches a list of N parent records with one query, then issues a separate query per parent to fetch related child data — resulting in 1 + N total queries instead of 2 (or 1). It's extremely common with ORMs' lazy-loading defaults. The fix is to fetch the related data in one additional batched query (or a single join), commonly called eager loading, instead of triggering a query inside a loop.
Detailed Answer
The bug, illustrated
orders = db.query("SELECT * FROM orders WHERE customer_id = 42") # 1 query, returns 20 orders
for order in orders:
# This line, inside the loop, fires a SEPARATE query for EVERY order
items = db.query(f"SELECT * FROM order_items WHERE order_id = {order.id}")
With 20 orders, this executes 1 + 20 = 21 queries total, when the same data could have been fetched in 2 queries (or even 1, with a join). The problem scales linearly with N — 1,000 orders means 1,001 queries — and each query pays its own network round-trip latency, which dominates the total time far more than the actual work the database does per query.
Why ORMs make this especially easy to write by accident
Most ORMs default to lazy loading for related entities — accessing order.items triggers a fresh query at the moment it's accessed, which is convenient and often invisible in code review (it just looks like a normal property access), but silently produces exactly this pattern the instant it happens inside a loop over a collection.
Fix 1: eager loading — fetch related data in one extra batched query
order_ids = [o.id for o in orders]
items = db.query(f"SELECT * FROM order_items WHERE order_id IN ({','.join(order_ids)})")
# group items by order_id in application code -- now just 2 total queries, regardless of N
Most ORMs provide a built-in mechanism for this: .include()/.Include() (Entity Framework), select_related/prefetch_related (Django), JOIN FETCH (JPA/Hibernate) — explicitly telling the ORM to fetch the related data as part of (or immediately following) the initial query, rather than lazily per-access.
Fix 2: a single JOIN query
SELECT o.id AS order_id, oi.product_id, oi.quantity
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
WHERE o.customer_id = 42;
One query total — the application groups the flattened join results back into a nested order/items structure. This avoids the extra round-trip of Fix 1's second query entirely, at the cost of potential row duplication if the "one" side of the join has data you'd otherwise only want once per order (see the join explosion question) — worth watching for if the query also aggregates.
How to catch this in practice
- Query logging/APM tools (many ORMs and database drivers support logging every executed query) make N+1 patterns visible as a suspicious burst of near-identical queries differing only by an ID.
- Load testing with realistic data volumes — N+1 is often invisible in development with a handful of test rows, and only becomes an obvious, painful problem once N is large in production.
- Some ORMs (Django, for instance) have built-in tooling/warnings specifically designed to flag likely N+1 patterns during development.
This is one of the most common real-world performance bugs in application code backed by a database, and recognizing it (plus knowing the eager-loading fix by name in whatever ORM/stack is relevant) is a strong, practical signal of hands-on experience.