What are the main NoSQL data models, with an example database for each?
Quick Answer
**Document** stores (MongoDB, Couchbase) store semi-structured JSON/BSON-like documents. **Key-value** stores (Redis, DynamoDB) map a unique key to an opaque value, optimized for simple, extremely fast lookups. **Wide-column** stores (Cassandra, HBase, Bigtable) organize data into rows with dynamic, sparse columns grouped into column families, built for massive write scale. **Graph** databases (Neo4j, Amazon Neptune) model data as nodes and relationships, optimized for traversing connections.
Detailed Answer
Document stores — MongoDB, Couchbase, Firestore
Store self-contained, semi-structured documents (typically JSON/BSON), where related data is often embedded directly in one document rather than normalized across tables.
{
  "_id": "user_123",
  "name": "Alice",
  "addresses": [
    {"type": "home", "city": "Austin"},
    {"type": "work", "city": "Dallas"}
  ]
}
Best fit: content with variable/nested structure per record, rapid schema iteration, read patterns that naturally want "the whole object" in one fetch.
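As a rough illustration of the "whole object in one fetch" pattern, the sketch below uses a plain Python dict as a stand-in for a collection. The `users` collection, the second user "Bob", and the `find_by_city` helper are all hypothetical names made up for this example; none of this is the client API of MongoDB, Couchbase, or Firestore.

```python
# In-memory stand-in for a document collection, keyed by _id.
# Illustrative only: this is not any real database's client API.
users = {
    "user_123": {
        "_id": "user_123",
        "name": "Alice",
        "addresses": [
            {"type": "home", "city": "Austin"},
            {"type": "work", "city": "Dallas"},
        ],
    },
    "user_456": {
        "_id": "user_456",
        "name": "Bob",
        # Documents in the same collection need not share a shape.
        "phone": "555-0100",
    },
}


def find_by_city(collection, city):
    """Return the _id of every document with an embedded address in `city`."""
    return [
        doc["_id"]
        for doc in collection.values()
        if any(addr.get("city") == city for addr in doc.get("addresses", []))
    ]


# One lookup returns the user together with the embedded addresses: no join.
alice = users["user_123"]
work_city = next(a["city"] for a in alice["addresses"] if a["type"] == "work")
austin_users = find_by_city(users, "Austin")
```

Here `work_city` comes straight out of the fetched document, and `find_by_city` has to tolerate documents that have no `addresses` field at all, which is the practical cost of a flexible schema.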
Key-value stores — Redis, DynamoDB, Memcached
The simplest model: a unique key maps to an opaque value (string, blob, or a richer structure in Redis's case — lists, sets, hashes). No querying by value content in the basic model — you fetch by key, full stop.
SET session:abc123 '{"user_id": 42, "expires": "2026-07-05T00:00:00Z"}'
GET session:abc123
Best fit: caching, session storage, feature flags, rate limiting — anything with simple, extremely high-throughput lookups by a known key.
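To make the "fetch by key, value is opaque" contract concrete, here is a minimal in-memory sketch with optional expiry, the way a session cache is typically used. `TTLStore` and its `clock` parameter are hypothetical names for this example; it only mimics the shape of the idea and is not how Redis, DynamoDB, or Memcached are implemented.

```python
import time


class TTLStore:
    """Toy key-value store: opaque values, lookup by exact key, optional expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = {}      # key -> (value, expires_at or None)

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value


store = TTLStore()
store.set("session:abc123", '{"user_id": 42}', ttl_seconds=3600)
session = store.get("session:abc123")  # the store never inspects the JSON
missing = store.get("session:nope")    # unknown key: nothing to return
```

Note that the store offers no way to ask "which sessions belong to user 42": answering that would require scanning every value, which is exactly what this model trades away for speed.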
Wide-column stores — Cassandra, HBase, Google Bigtable
Each row can have its own sparse set of columns, and columns are grouped into "column families" that are stored together on disk. The model is optimized for very high write throughput and horizontal scale across many nodes, with each row located by its row key (called the partition key in Cassandra).
Row key: user_123
Column family "profile": name=Alice, email=alice@example.com
Column family "activity": last_login=2026-07-01, login_count=57
Best fit: time-series data, IoT sensor readings, massive-scale write-heavy workloads (Cassandra was originally built at Facebook to power inbox search, a write-heavy workload at this kind of scale).
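The row / column family / column layout above can be sketched as nested dictionaries. `WideColumnTable`, `put`, and `get_family` are hypothetical names for this example, and the sketch ignores everything that makes real systems fast (partitioning across nodes, on-disk layout, compaction); it only shows the sparse, family-grouped shape of the data.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide-column table: row key -> column family -> {column: value}."""

    def __init__(self, families):
        # Column families are declared up front; columns inside them are not.
        self._families = set(families)
        # Rows are created on first write; absent columns take no space.
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Read one family for one row, without touching other families."""
        if row_key not in self._rows:
            return {}
        return dict(self._rows[row_key].get(family, {}))


table = WideColumnTable(families=["profile", "activity"])
table.put("user_123", "profile", "name", "Alice")
table.put("user_123", "profile", "email", "alice@example.com")
table.put("user_123", "activity", "login_count", 57)
# A second, made-up row with no "profile" columns at all: rows are sparse.
table.put("user_456", "activity", "last_login", "2026-07-01")

profile = table.get_family("user_123", "profile")
sparse = table.get_family("user_456", "profile")
```

Every read starts from a row key, which mirrors how these databases expect to be queried: design the key around the access pattern first, then add columns freely.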
Graph databases — Neo4j, Amazon Neptune, ArangoDB
Model data explicitly as nodes (entities) and edges (relationships), with relationships as first-class citizens that can themselves carry properties — optimized for traversing and querying connections, not just individual records.
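// People followed by accounts Alice follows, excluding Alice and anyone she already follows
MATCH (a:Person {name: 'Alice'})-[:FOLLOWS]->(b:Person)-[:FOLLOWS]->(c:Person)
WHERE NOT (a)-[:FOLLOWS]->(c) AND a <> c
RETURN DISTINCT c AS suggested_follow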
Best fit: social networks, recommendation engines, fraud detection (tracing chains of connected transactions), and any domain where "how are these things related, possibly several hops deep" is the core query pattern. A relational database would need several expensive self-joins (or a recursive CTE) to express the same traversal.
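For readers who do not know Cypher, the two-hop "suggested follow" query above amounts to the traversal sketched below over an adjacency map. The `follows` data and the `suggest_follows` helper are invented for this example, and the sketch also excludes the person themselves from the suggestions; a graph database runs this kind of walk natively over stored relationships rather than over an in-memory dict.

```python
# Directed "follows" edges as an adjacency map: person -> set of people they follow.
# Sample data is made up for illustration.
follows = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Carol", "Dave", "Alice"},
    "Carol": {"Erin"},
    "Dave": set(),
    "Erin": set(),
}


def suggest_follows(graph, person):
    """People followed by someone `person` follows, minus existing follows and `person`."""
    direct = graph.get(person, set())
    suggestions = set()
    for middle in direct:                         # first hop: person -> middle
        for candidate in graph.get(middle, set()):  # second hop: middle -> candidate
            if candidate != person and candidate not in direct:
                suggestions.add(candidate)
    return sorted(suggestions)


suggested = suggest_follows(follows, "Alice")
```

With this sample data, Alice is offered Dave (via Bob) and Erin (via Carol), while Carol is skipped because Alice already follows her. Each extra hop is one more nested loop here; in SQL it would be one more self-join.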
Choosing between them
The decision should follow from your actual query patterns: "I always fetch this whole record together" points to document; "I only ever look things up by a single known key" points to key-value; "I write enormous volumes of data that rarely needs complex ad-hoc querying" points to wide-column; "my core question is about relationships/paths between entities" points to graph. Defaulting to relational and only reaching for one of these when the data shape or scale genuinely demands it is usually the right instinct.