Structural differences
| SQL (relational) | NoSQL | |
|---|---|---|
| Schema | Fixed, defined upfront, enforced by the engine | Flexible/dynamic, often enforced (if at all) by the application |
| Data model | Tables, rows, columns, normalized via foreign keys | Documents, key-value pairs, wide columns, or graphs — varies by type |
| Relationships | Joins across normalized tables | Usually denormalized/embedded, or handled application-side |
| Query language | SQL (largely standardized) | Varies per product (MongoDB query language, Cypher for Neo4j, etc.) |
| Consistency model | Typically strong (ACID transactions) | Varies — many default to eventual consistency for scalability |
| Scaling | Traditionally vertical (bigger server); horizontal via sharding is possible but harder to bolt on | Many designed from the ground up for horizontal scaling/sharding |
Why NoSQL emerged
Relational databases enforce a fixed schema and strong consistency, which is exactly right for data with stable structure and strict correctness needs (financial ledgers, inventory), but can be a poor fit for: data whose shape varies significantly between records (product catalogs with wildly different attributes per category), extremely high write throughput distributed across many nodes, or workloads where slightly stale reads are an acceptable tradeoff for lower latency and higher availability at scale.
What you give up with most NoSQL systems
- Joins — most NoSQL databases either don't support them or support them poorly; the data model is usually designed to avoid needing them by embedding related data together (see the embedding vs. referencing question).
- Strict schema enforcement — flexibility cuts both ways: it's easy to evolve the data model, but it's also easy to accumulate inconsistent documents (some missing fields, some using different types for the "same" field) without the database itself catching it.
- Multi-record ACID transactions — historically weaker or entirely absent in early NoSQL systems (though many, like MongoDB since v4.0, have since added multi-document transaction support, narrowing this gap).
The realistic modern answer: polyglot persistence
Most non-trivial systems today use both — a relational database for the core transactional data (orders, accounts, inventory) where consistency and relationships matter most, plus one or more NoSQL stores for specific workloads that fit them better (a document store for a flexible content catalog, a key-value store for session/cache data, a graph database for a recommendation engine). The interview-relevant skill isn't "SQL vs. NoSQL" as an either/or — it's recognizing which data shape and consistency requirement each part of a system actually has, and picking the right tool per use case.
Related Resources
The three properties
- Consistency (C): every node returns the most recent write for any given read — all nodes see the same data at the same time.
- Availability (A): every request receives a (non-error) response, even if some nodes can't communicate with each other.
- Partition tolerance (P): the system continues operating even when network communication between nodes is disrupted (a "partition" — some nodes can't reach others).
Why it's really "pick 2 of 3" only during a partition
In a system with no network partitions, you can actually have both C and A simultaneously — the theorem's bite only applies during an actual partition event. And because partitions are a real, unavoidable fact of distributed systems (networks fail, nodes get cut off, packets get dropped), partition tolerance isn't really optional for any system that's genuinely distributed across multiple nodes — so the real-world choice collapses to CP vs. AP: when a partition happens, do you sacrifice consistency (keep serving requests, possibly with stale data) or sacrifice availability (refuse requests from the cut-off nodes until the partition heals, to guarantee consistency)?
CP systems — consistency over availability during a partition
When a partition occurs, a CP system will refuse to serve (or will block) requests on the minority/cut-off side rather than risk returning stale or conflicting data.
Examples: traditional relational databases in a synchronous-replication configuration, HBase, MongoDB (in its default configuration, favoring consistency via a single primary that must be reachable for writes), ZooKeeper/etcd (consensus-based coordination systems).
AP systems — availability over consistency during a partition
When a partition occurs, an AP system keeps accepting reads/writes on both sides of the partition, accepting that different sides may temporarily disagree — reconciling the divergence once the partition heals (see the eventual consistency question).
Examples: Cassandra, DynamoDB (in its default/eventually-consistent read mode), CouchDB.
Why this isn't really about "SQL vs NoSQL"
CAP is a property of a distributed system's design choices, not an inherent property of "relational" vs. "NoSQL" — a single-node relational database isn't meaningfully subject to CAP at all (there's nothing to partition), but a multi-region relational deployment absolutely is, and some NoSQL systems (like MongoDB by default) actually lean CP rather than AP. Many modern databases also let you tune this per-operation (e.g., Cassandra's per-query consistency levels, DynamoDB's strongly-consistent vs. eventually-consistent reads) rather than being a single fixed system-wide choice.
A strong answer doesn't just recite "Consistency, Availability, Partition tolerance" — it explains why the real tradeoff is CP vs. AP (since P is non-negotiable for a genuinely distributed system), and can name what a specific real system chooses and why that fits its use case (e.g., a banking ledger favoring CP because stale balance reads are unacceptable; a social media "like counter" favoring AP because a few seconds of staleness is an acceptable tradeoff for never showing an error to the user).
Related Resources
Document stores — MongoDB, Couchbase, Firestore
Store self-contained, semi-structured documents (typically JSON/BSON), where related data is often embedded directly in one document rather than normalized across tables.
{
"_id": "user_123",
"name": "Alice",
"addresses": [
{"type": "home", "city": "Austin"},
{"type": "work", "city": "Dallas"}
]
}
Best fit: content with variable/nested structure per record, rapid schema iteration, read patterns that naturally want "the whole object" in one fetch.
Key-value stores — Redis, DynamoDB, Memcached
The simplest model: a unique key maps to an opaque value (string, blob, or a richer structure in Redis's case — lists, sets, hashes). No querying by value content in the basic model — you fetch by key, full stop.
SET session:abc123 '{"user_id": 42, "expires": "2026-07-05T00:00:00Z"}'
GET session:abc123
Best fit: caching, session storage, feature flags, rate limiting — anything with simple, extremely high-throughput lookups by a known key.
Wide-column stores — Cassandra, HBase, Google Bigtable
Rows can have a different, sparse set of columns, and columns are grouped into "column families" stored together on disk — optimized for very high write throughput and horizontal scale across many nodes, with each row addressable by a partition key.
Row key: user_123
Column family "profile": name=Alice, email=alice@example.com
Column family "activity": last_login=2026-07-01, login_count=57
Best fit: time-series data, IoT sensor readings, massive-scale write-heavy workloads (Cassandra was originally built at Facebook for exactly this kind of scale).
Graph databases — Neo4j, Amazon Neptune, ArangoDB
Model data explicitly as nodes (entities) and edges (relationships), with relationships as first-class citizens that can themselves carry properties — optimized for traversing and querying connections, not just individual records.
MATCH (a:Person {name: 'Alice'})-[:FOLLOWS]->(b:Person)-[:FOLLOWS]->(c:Person)
WHERE NOT (a)-[:FOLLOWS]->(c)
RETURN c AS suggested_follow
Best fit: social networks, recommendation engines, fraud detection (tracing chains of connected transactions), and any domain where "how are these things related, possibly several hops deep" is the core query pattern — a relationship a relational database would need several expensive joins (or a recursive CTE) to express.
Choosing between them
The decision should follow from your actual query patterns: "I always fetch this whole record together" points to document; "I only ever look things up by a single known key" points to key-value; "I write enormous volumes of data that rarely needs complex ad-hoc querying" points to wide-column; "my core question is about relationships/paths between entities" points to graph. Defaulting to relational and only reaching for one of these when the data shape or scale genuinely demands it is usually the right instinct.
Related Resources
Good fit signals
Variable, nested, or evolving schema per record. A product catalog where a "shoe" and a "laptop" have almost entirely different sets of attributes maps naturally onto flexible documents; forcing it into a fixed relational schema means either a very wide table full of mostly-NULL columns, or an entity-attribute-value pattern that's awkward to query.
{ "type": "shoe", "size": 10, "width": "medium", "color": "black" }
{ "type": "laptop", "ram_gb": 16, "screen_inches": 14, "cpu": "M3" }
Read/write "the whole object" as a unit. If your application almost always fetches an entire user profile (with nested preferences, addresses, recent activity) in one shot, storing it as one document avoids the join cost of assembling it from several normalized tables every time.
Rapid schema iteration without coordinated migrations. Adding a new optional field to a document requires no schema migration — new documents can simply include it, older documents without it are handled with a default in application code. A relational schema change (ALTER TABLE ADD COLUMN) is usually safe too in modern engines, but document stores make this iteration even more frictionless for genuinely unstructured/varying data.
Horizontal write scaling. Document databases like MongoDB are built with sharding as a first-class, well-supported feature from the ground up, often making it more straightforward to scale write throughput horizontally than retrofitting sharding onto a relational deployment.
Poor fit signals — stick with relational
Deeply relational data with many meaningful many-to-many relationships. If your domain genuinely has complex, normalized relationships that need to be queried from many different angles (not just "fetch this one entity and its nested children"), you'll end up either duplicating data across many documents (with the sync/consistency problems that brings) or re-implementing joins in application code — usually a worse position than just using a relational database with proper foreign keys.
Strong consistency across multiple related entities. If a business operation must atomically update several distinct entities together with strict guarantees (a financial transaction touching multiple accounts), a relational database's mature multi-table transaction support is the safer, more battle-tested default — even though modern MongoDB does support multi-document ACID transactions, it's a comparatively newer feature.
Complex ad-hoc reporting/analytics across the full dataset. SQL's join and aggregation capabilities, plus the surrounding tooling ecosystem (BI tools, reporting frameworks), are generally more mature for cross-cutting analytical queries than most document databases' query languages.
The realistic answer
Most production systems benefit from evaluating this per data domain, not system-wide — a core "orders and accounts" domain often stays relational for consistency and reporting, while a "product catalog" or "user activity feed" domain might genuinely be better served by a document store, coexisting in the same overall architecture (polyglot persistence).
Related Resources
What it means concretely
Client writes "likes = 101" to Replica A.
Replica A immediately has the new value.
Replica B and Replica C haven't received the replication update yet.
A read hitting Replica B right now might still return "likes = 100" --
temporarily inconsistent with Replica A's already-committed value.
Given enough time (milliseconds to seconds, typically), replication
catches up, and B and C converge to "likes = 101" too.
This is the AP side of the CAP theorem in practice: rather than blocking the read (or the write) until every replica agrees (which would sacrifice availability/latency), the system accepts reads that might be momentarily stale, trusting that convergence happens shortly afterward.
When it's a fine tradeoff
- Social media counters (likes, view counts, follower counts) — a few seconds of staleness or a slightly-off count is invisible/irrelevant to users, and the alternative (blocking every like/view to synchronously update every replica) would add unacceptable latency at massive scale for negligible correctness benefit.
- Non-critical caches — a cache that's occasionally a few seconds stale is, by definition, an acceptable tradeoff (that's the whole premise of caching).
- Content/CDN distribution — a page update propagating to edge servers over a few seconds/minutes is standard and expected.
- DNS — the textbook example of eventual consistency at internet scale: DNS record changes propagate over time (per TTL), and the world briefly seeing old vs. new records simultaneously is an accepted, designed-for tradeoff.
When it's dangerous
- Financial balances/ledgers — reading a stale balance and allowing a withdrawal based on it can produce genuine financial loss (overdraft that shouldn't have been allowed).
- Inventory that must never oversell — two replicas both believing "1 item left" and both allowing a purchase results in overselling.
- Security/permission checks — a stale "user has access" read could grant access that was just revoked, a real security gap.
- Uniqueness enforcement — two concurrent writes on different replicas, both believing a username is available, can both succeed, violating a uniqueness invariant the business actually depends on.
The skill being tested isn't "know the definition" — it's being able to reason about which specific pieces of data in a system can tolerate eventual consistency and which genuinely can't, and choosing a storage/consistency strategy per data type accordingly (e.g., storing account balances in a strongly-consistent store while storing view counts in an eventually-consistent one, even within the same overall application).