What is a write-ahead log (WAL), and what role does it play in durability and replication?

Detailed Answer

The core idea: log first, apply later

1. A client asks to commit a transaction: UPDATE accounts SET balance = 500 WHERE id = 1;
2. BEFORE modifying the actual data page on disk, the engine appends a
   WAL record ("change account id=1's balance to 500") to a sequential,
   append-only log file and durably flushes (fsyncs) that log entry.
3. ONLY THEN does the engine acknowledge the commit to the client as successful.
4. The actual data page update can happen later (asynchronously,
   often batched with other changes for efficiency). It is no longer
   urgent, because the WAL has already durably captured the change.

If the server crashes at any point after step 3 but before the data page is actually updated on disk, the WAL entry survives, because it was already durably flushed in step 2. On restart, the engine replays any WAL entries not yet reflected in the data files, recovering the change that would otherwise have been lost. This is the mechanism that delivers ACID's Durability guarantee.
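The steps and the recovery behavior above can be sketched as a toy key-value store. This is a minimal illustration under stated assumptions, not how any real engine is implemented: the `TinyWAL` class, the file names `wal.log` and `data.json`, and the JSON record format are all hypothetical, and the sketch ignores concerns such as torn (partially written) log records and concurrent writers.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy store that logs first and applies later (illustrative only)."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "wal.log")      # hypothetical name
        self.data_path = os.path.join(directory, "data.json")   # hypothetical name
        self.data = {}
        if os.path.exists(self.data_path):
            with open(self.data_path) as f:
                self.data = json.load(f)
        # Crash recovery: re-apply logged changes on top of the data file.
        self._replay_log()

    def _replay_log(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def commit(self, key, value):
        # Step 2: append the record and force it to stable storage.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # Step 3: only now may the caller treat the commit as successful.
        self.data[key] = value  # in-memory copy; the data file is NOT touched yet
        return True

    def checkpoint(self):
        # Step 4: write the data file later, then clear the log it supersedes.
        tmp_path = self.data_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.data_path)
        with open(self.log_path, "w"):
            pass  # truncate the log


def demo_crash_recovery():
    with tempfile.TemporaryDirectory() as directory:
        store = TinyWAL(directory)
        store.commit("account:1", 500)
        del store  # simulated crash: checkpoint() never ran
        recovered = TinyWAL(directory)  # restart replays the log
        return recovered.data


if __name__ == "__main__":
    print(demo_crash_recovery())  # {'account:1': 500}
```

Because each record here stores an absolute value rather than a delta, replaying a record that was already applied is harmless, which keeps recovery simple in this sketch.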

Why log-then-apply is faster, not just safer

Writing a compact, sequential log entry is much cheaper than immediately updating and durably flushing the actual data page, which may sit anywhere on disk and require random-access I/O. Sequential writes are also friendlier to storage hardware than random writes: the gap is largest on spinning disks and smaller, though still real, on SSDs. So WAL isn't purely a safety mechanism. It also lets the engine defer and batch the more expensive data-file writes while still providing an immediate durability guarantee via the cheap sequential log write.
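One way to see the batching benefit is to count writes. The sketch below is a simplification under stated assumptions: it treats each page as holding a single value, and the function name `pages_to_flush` and the page identifiers are hypothetical. Every change costs one cheap log append, but changes to the same page collapse into a single deferred page write.

```python
def pages_to_flush(wal_records):
    """Return the final content of each dirty page after applying records in order."""
    dirty_pages = {}
    for page_id, value in wal_records:
        dirty_pages[page_id] = value  # a later change overwrites an earlier one
    return dirty_pages


# Four committed changes, each already durable in the log.
wal_records = [
    ("page-7", "balance=100"),
    ("page-7", "balance=250"),
    ("page-3", "name=Ada"),
    ("page-7", "balance=500"),
]

flush_plan = pages_to_flush(wal_records)

if __name__ == "__main__":
    print(len(wal_records), "log appends,", len(flush_plan), "page writes")
```

Here four sequential log appends translate into only two data-page writes when the pages are eventually flushed.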

WAL's second job: replication

Because the WAL is a complete, ordered record of every change made to the database, shipping that exact log stream to another server and replaying it there reconstructs an identical copy of the data. This is how physical (log-shipping) replication works, for example PostgreSQL's streaming replication. MySQL's binary log replication is a related but distinct mechanism serving the same purpose: the binary log is a separate, logical log rather than the storage engine's own WAL. With physical replication, the replica doesn't need to re-execute the original SQL statements. It just applies the same low-level logged changes the primary already made, in the same order.

Primary: WAL stream ---> shipped continuously ---> Replica: replays WAL entries
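The flow in the diagram can be sketched in a few lines. This is an in-memory illustration only: the `Primary` and `Replica` classes are hypothetical, the "log sequence number" (LSN) is simplified to a plain counter, and the network transport is replaced by a direct method call.

```python
class Primary:
    def __init__(self):
        self.wal = []   # ordered records: (lsn, key, value)
        self.data = {}

    def commit(self, key, value):
        lsn = len(self.wal) + 1          # simplified LSN: 1, 2, 3, ...
        self.wal.append((lsn, key, value))
        self.data[key] = value
        return lsn

    def records_after(self, lsn):
        # Record with LSN n lives at index n - 1, so this skips what was already sent.
        return self.wal[lsn:]


class Replica:
    def __init__(self):
        self.data = {}
        self.applied_lsn = 0  # position of the last record replayed

    def pull_from(self, primary):
        # Apply the logged changes in order; no SQL is re-executed.
        for lsn, key, value in primary.records_after(self.applied_lsn):
            self.data[key] = value
            self.applied_lsn = lsn


primary = Primary()
replica = Replica()

primary.commit("account:1", 250)
primary.commit("account:1", 500)
replica.pull_from(primary)        # replica catches up to LSN 2

primary.commit("account:2", 40)
replica.pull_from(primary)        # only the new record (LSN 3) is applied
```

Tracking `applied_lsn` is what lets the replica resume from where it left off after a disconnect instead of replaying the whole log.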

Why this matters for backup/recovery too

WAL archiving (continuously saving WAL files to durable storage) is what enables point-in-time recovery (see the backup/DR question). Restoring a full backup and then replaying the archived WAL entries up to a chosen moment reconstructs the database's exact state at that moment, not just at the last full backup's timestamp.
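Point-in-time recovery can be sketched as "start from a backup, replay the log, stop at the target". The example below is a simplified illustration: the function name `restore_to_point_in_time`, the integer timestamps, and the record layout are assumptions, and it presumes the archived records are sorted by time and were all written after the backup was taken.

```python
def restore_to_point_in_time(base_backup, archived_wal, target_time):
    """Rebuild state as of target_time from a backup plus archived WAL records."""
    state = dict(base_backup)  # start from a copy of the full backup
    for timestamp, key, value in archived_wal:
        if timestamp > target_time:
            break              # stop replaying at the chosen moment
        state[key] = value
    return state


base_backup = {"account:1": 100}   # full backup taken at time 0
archived_wal = [
    (10, "account:1", 250),
    (20, "account:2", 40),
    (30, "account:1", 500),        # e.g. a change made by mistake
]

if __name__ == "__main__":
    # State just before the change at time 30.
    print(restore_to_point_in_time(base_backup, archived_wal, 20))
    # {'account:1': 250, 'account:2': 40}
```

Choosing a different `target_time` yields a different consistent state from the same backup and archive, which is why the archive must be kept continuous.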

Understanding WAL connects several topics that can seem separate (durability, crash recovery, replication, and point-in-time backup/restore) as different applications of the same underlying mechanism. That "sees how the pieces fit together" understanding is exactly what a senior-level interview question is probing for.
