ACID describes the guarantees a database transaction makes, letting you reason about a group of statements as a single, safe operation even in the presence of concurrent access and crashes.
Atomicity — all or nothing
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1; -- debit
UPDATE accounts SET balance = balance + 100 WHERE id = 2; -- credit
COMMIT;
If the second UPDATE fails (constraint violation, crash, connection drop) before COMMIT, the first UPDATE must also be undone — you should never end up with money debited from account 1 but not credited to account 2. Without atomicity, every multi-step write needs manual, error-prone compensating logic in application code.
Consistency — valid state to valid state
The transaction must leave the database satisfying all defined constraints (NOT NULL, CHECK, foreign keys, unique constraints) — if a CHECK (balance >= 0) constraint exists, no committed transaction can ever leave a negative balance, even mid-transaction states are allowed to (temporarily) violate it as long as they don't at commit time. Note: this is the least precisely defined of the four letters, and is partly just "atomicity + isolation + valid constraints together imply the DB stays consistent," rather than a fully independent mechanism.
Isolation — concurrent transactions don't interfere
-- Transaction A -- Transaction B
BEGIN;
UPDATE accounts SET balance = 500
WHERE id = 1;
BEGIN;
SELECT balance FROM accounts
WHERE id = 1; -- should NOT see 500 yet
-- (depending on isolation level)
COMMIT;
Isolation determines exactly what "shouldn't see yet" means in practice — this is where the four standard isolation levels (Read Uncommitted, Read Committed, Repeatable Read, Serializable) come in, each allowing or preventing different classes of interference (see that question).
Durability — survives a crash
COMMIT;
-- Power fails immediately after this returns successfully to the client.
-- On restart, the committed data MUST still be there.
Achieved via a write-ahead log (WAL) — the engine writes a durable log record of the change before acknowledging the commit, so it can replay the log to recover committed-but-not-yet-flushed-to-disk data after a crash (see the WAL question in the scaling/HA topic).
Why this matters for interviews
ACID isn't just trivia — it's the contract that lets you write BEGIN ... COMMIT blocks around multi-step business logic (like a funds transfer) and trust the database to handle failure and concurrency correctly, instead of hand-rolling that safety in application code. Being able to explain a concrete failure each property prevents (not just recite the acronym) is what distinguishes a strong answer.
Related Resources
The three anomalies
| Anomaly | Description |
|---|---|
| Dirty read | Transaction A reads data that Transaction B has written but not yet committed. If B rolls back, A read data that "never really happened." |
| Non-repeatable read | Transaction A reads a row, Transaction B commits an update to that same row, A reads it again and gets a different value within the same transaction. |
| Phantom read | Transaction A runs a query with a WHERE filter, Transaction B commits a new row matching that filter, A re-runs the same query and sees an extra row that wasn't there before. |
The four levels
| Level | Dirty read | Non-repeatable read | Phantom read |
|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible |
| Read Committed | Prevented | Possible | Possible |
| Repeatable Read | Prevented | Prevented | Possible (per SQL standard) |
| Serializable | Prevented | Prevented | Prevented |
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1; -- e.g., 500
-- (another transaction commits a change to this row)
SELECT balance FROM accounts WHERE id = 1; -- still 500, guaranteed, under REPEATABLE READ
COMMIT;
Engine-specific reality checks
- Read Uncommitted is rarely meaningfully different from Read Committed in practice — PostgreSQL doesn't implement it at all (it silently upgrades to Read Committed); it exists mostly in the standard and in SQL Server.
- PostgreSQL's Repeatable Read actually prevents phantom reads too (stricter than the SQL standard requires for this level), because it's implemented via snapshot isolation rather than row-level locking — worth knowing that "the same isolation level name" doesn't guarantee identical behavior across engines.
- MySQL/InnoDB's default is Repeatable Read, while PostgreSQL's and SQL Server's default is Read Committed — a common source of subtly different application behavior when porting code between engines without adjusting isolation level assumptions.
The tradeoff
Higher isolation levels give stronger guarantees but at a real cost: more locking/blocking, more transaction aborts due to serialization conflicts (an application must be prepared to retry a transaction that fails at Serializable), and lower overall throughput under contention. Read Committed is the practical default for most OLTP workloads because it prevents the most dangerous anomaly (dirty reads) cheaply; Serializable is reserved for logic that's genuinely sensitive to subtle concurrency bugs (e.g., enforcing an invariant across multiple rows, like "total allocated seats can never exceed capacity").
Don't reach for Serializable by default "to be safe" — it can meaningfully hurt throughput and requires retry logic for serialization failures. Instead, identify which specific anomaly your business logic is actually vulnerable to, and choose the lowest isolation level that prevents it (often supplemented with an explicit row lock via SELECT ... FOR UPDATE rather than raising the whole transaction's isolation level).
Related Resources
The problem MVCC solves
A simpler concurrency model — pure locking, no versioning — would require every reader to acquire a shared lock and every writer to acquire an exclusive lock, meaning a long-running read blocks writers, and a write blocks all readers until it commits. That's simple but kills concurrency for read-heavy workloads.
How MVCC works
Instead of updating a row in place, an UPDATE (conceptually) creates a new version of the row, tagged with the transaction ID that created it, while the old version stays around (marked as superseded, but not yet physically removed) as long as any active transaction might still need to see it.
Row (id=1), before update: [ version 1: balance=500, created_by=txn_10, valid_until=txn_15 ]
After UPDATE by txn_15: [ version 1: ...valid_until=txn_15 ]
[ version 2: balance=400, created_by=txn_15, valid_until=(open) ]
Each transaction operates against a snapshot — effectively "the set of row versions committed as of the moment my transaction/statement started." A reader's SELECT simply picks the version of each row that was valid as of its snapshot, entirely ignoring whatever a concurrent writer is doing to create newer versions.
The key benefit: readers don't block writers, and vice versa
-- Transaction A (long-running report query)
BEGIN;
SELECT SUM(balance) FROM accounts; -- takes 30 seconds, sees a consistent snapshot
-- Transaction B, running concurrently, is NOT blocked by A's SELECT
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT; -- succeeds immediately, doesn't wait for A's SELECT to finish
Transaction A's report simply doesn't see B's update (it wasn't committed as of A's snapshot) — it gets a consistent, if slightly stale, view rather than being blocked or seeing a torn/partial state.
The cost: old versions must be cleaned up
Since old row versions stick around until no transaction could possibly need them, MVCC engines need a garbage-collection mechanism:
- PostgreSQL:
VACUUM(usually viaautovacuum) reclaims space from dead row versions ("dead tuples"). Under-vacuumed tables can bloat significantly, and in extreme, badly-managed cases risk transaction ID wraparound issues. - MySQL/InnoDB: maintains an undo log of old versions, purged by a background thread once no longer needed; a very long-running transaction can bloat the undo log by preventing purges.
- Oracle: similarly uses undo segments/tablespaces for the same purpose.
Why this matters for interviews
MVCC explains a lot of otherwise-confusing behavior: why a SELECT in PostgreSQL basically never blocks on a concurrent UPDATE, why a long transaction can cause table bloat, and why "Repeatable Read" is comparatively cheap to implement in an MVCC engine (it's mostly "pick your snapshot once, at transaction start, instead of per-statement") compared to a purely lock-based concurrency model.
Related Resources
Pessimistic concurrency control
Lock the row before doing anything, so no one else can modify it until you're done:
BEGIN;
SELECT * FROM seats WHERE id = 42 FOR UPDATE; -- acquires an exclusive row lock
-- ... application logic decides whether the seat can be booked ...
UPDATE seats SET status = 'booked' WHERE id = 42;
COMMIT; -- lock released here
Any other transaction trying to SELECT ... FOR UPDATE (or update) row 42 simply blocks until this transaction commits or rolls back. Pros: conceptually simple, guaranteed to prevent conflicts, no retry logic needed. Cons: reduces concurrency (other transactions wait), and holding locks across slow operations (a network call, user think-time) can cause serious contention or even deadlocks under load.
Optimistic concurrency control
Read without locking, then verify at write time that nothing changed:
-- Read
SELECT id, status, version FROM seats WHERE id = 42;
-- app gets: status='available', version=7
-- ... application logic, possibly slow (user confirms booking) ...
-- Write: only succeeds if version is still 7
UPDATE seats SET status = 'booked', version = version + 1
WHERE id = 42 AND version = 7;
-- Check rows affected: if 0, someone else updated it first -- retry or fail
Pros: no locks held during the "thinking" period, so no blocking of other transactions, and no deadlock risk from this pattern. Cons: requires a version/timestamp column and retry logic in the application, and under high contention, many transactions may repeatedly fail and retry (worse throughput than a lock would have given, ironically) — optimism is a bad bet exactly when conflicts are actually common.
When to use which
- Pessimistic: high contention on the same rows, or when the cost of retrying a failed operation is high (e.g., a multi-step external side effect that's hard to safely redo).
- Optimistic: low contention, read-heavy workloads, or web applications where holding a database lock across a slow client round-trip (think-time) would be unacceptable — e.g., editing a document where two users rarely edit the exact same record simultaneously.
Where this shows up in ORMs
Most ORMs (Entity Framework, Hibernate, etc.) implement optimistic concurrency natively via a RowVersion/@Version column, throwing a concurrency exception on a version mismatch that the application must catch and handle (typically by reloading and prompting the user, or retrying). This is usually preferred over pessimistic locking in typical web applications specifically because holding a database transaction open across an HTTP request/user interaction is almost always the wrong design.
Related Resources
A minimal deadlock example
-- Transaction A -- Transaction B
BEGIN; BEGIN;
UPDATE accounts SET balance = balance - 10
WHERE id = 1; UPDATE accounts SET balance = balance - 10
-- A now holds a lock on row 1 WHERE id = 2;
-- B now holds a lock on row 2
UPDATE accounts SET balance = balance + 10
WHERE id = 2; UPDATE accounts SET balance = balance + 10
-- A waits for B's lock on row 2 WHERE id = 1;
-- B waits for A's lock on row 1
-- DEADLOCK: neither can proceed
A is waiting for a lock B holds; B is waiting for a lock A holds. Neither will ever release, without intervention.
Detection
Most engines maintain a wait-for graph — a graph where an edge from transaction X to transaction Y means "X is waiting on a lock held by Y." Periodically (or on each new lock wait), the engine checks this graph for a cycle — a cycle means a deadlock exists. This is generally cheaper and faster than the alternative (a pure timeout, used by some simpler systems), since it detects the problem the moment a cycle forms rather than waiting for an arbitrary timeout to expire.
Resolution
Once a cycle is found, the database picks a victim transaction — typically the one that would be cheapest to roll back (least work done, fewest locks held, or simply the one that most recently joined the cycle, depending on engine) — and forcibly aborts it with a deadlock error, releasing its locks so the other transaction(s) can proceed.
ERROR: deadlock detected
DETAIL: Process 1234 waits for ShareLock on transaction 5678; blocked by process 5678.
Process 5678 waits for ShareLock on transaction 1234; blocked by process 1234.
HINT: See server log for query details.
The application's responsibility
A deadlock is not a bug in the database — it's an expected, occasional outcome of concurrent access patterns, and the losing transaction's work is fully rolled back (atomicity holds). Applications must catch this specific error and retry the whole transaction from the beginning (not just the last statement), typically with a short randomized backoff to avoid immediately re-colliding with the same transaction.
How to reduce deadlock frequency
- Access rows/tables in a consistent order across all transactions — in the example above, if both transactions always updated account 1 before account 2, no cycle could ever form.
- Keep transactions short — the longer a transaction holds locks, the more opportunity for another transaction to also be waiting on something it holds.
- Use the lowest isolation level that satisfies your correctness needs (higher isolation levels generally take more/broader locks, or in MVCC engines, cause more serialization failures which are a related-but-distinct phenomenon from classic lock deadlocks).
- Consider explicit, application-level lock ordering (e.g., always lock the row with the lower ID first) for code paths known to touch multiple rows that could also be touched in the opposite order elsewhere.