What do LAG() and LEAD() do, and what problems do they solve?

5 minintermediatewindow-functionslaglead

Quick Answer

`LAG(column, n)` returns the value of `column` from `n` rows *before* the current row (within the window's partition/order); `LEAD(column, n)` returns the value from `n` rows *after*. Both default to `n = 1` and accept an optional default value for when there's no such row (e.g., the first row has no previous row). They solve row-to-row comparison problems — period-over-period change, detecting gaps, sequencing — without a self-join.

Detailed Answer

Basic usage

SELECT
    sale_date,
    amount,
    LAG(amount, 1) OVER (ORDER BY sale_date) AS prev_day_amount,
    LEAD(amount, 1) OVER (ORDER BY sale_date) AS next_day_amount
FROM daily_sales;
sale_dateamountprev_day_amountnext_day_amount
Jan 1100NULL120
Jan 212010090
Jan 390120NULL

The first row's prev_day_amount and the last row's next_day_amount are NULL by default, since there's no row before/after them — you can supply an explicit default instead: LAG(amount, 1, 0) OVER (...).

Common use case: period-over-period change

SELECT
    sale_date,
    amount,
    amount - LAG(amount) OVER (ORDER BY sale_date) AS change_from_yesterday,
    ROUND(100.0 * (amount - LAG(amount) OVER (ORDER BY sale_date))
          / NULLIF(LAG(amount) OVER (ORDER BY sale_date), 0), 1) AS pct_change
FROM daily_sales;

This is an extremely common reporting/dashboard pattern ("day-over-day," "month-over-month" change) and, before window functions, required a self-join on date = date - 1 (see the self-join question) — much more awkward, and fragile if there are gaps in the dates (a self-join on exact date offsets silently produces NULL/missing rows for any gap, whereas LAG always looks at the actual previous row, regardless of any gap in the underlying dates).

Per-partition usage: compare within a group

SELECT
    customer_id, order_date, total,
    LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_order_date,
    order_date - LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS days_since_last_order
FROM orders;

PARTITION BY customer_id ensures LAG only looks at the same customer's previous row, not just the previous row in the whole table — essential whenever the row-to-row comparison should be scoped per group.

Detecting gaps or sequence breaks

SELECT id, LAG(id) OVER (ORDER BY id) AS prev_id
FROM records
WHERE id - LAG(id) OVER (ORDER BY id) > 1;   -- Note: filtering a window function needs a subquery/CTE wrapper in most engines

(In practice this filter needs to happen in an outer query/CTE, since WHERE can't reference a window function directly — see the execution-order question.) This pattern finds missing IDs in a sequence, a common data-quality check.