What's the difference between vertical and horizontal scaling for databases?
Quick Answer
Vertical scaling (scaling up) means adding more resources (CPU, RAM, faster disks) to a single database server. Horizontal scaling (scaling out) means distributing data and load across multiple servers via replication, sharding, or both. Vertical scaling is simpler because it needs no application changes, but it hits a hard ceiling and leaves the database as a single point of failure. Horizontal scaling has a much higher ceiling and can improve availability, but it adds real distributed-systems complexity.
Detailed Answer
Vertical scaling — bigger machine
Before: 8 vCPU, 32GB RAM database server
After: 32 vCPU, 128GB RAM database server (same single server, upgraded)
Pros: requires no application-level changes. The database is still one logical instance, so transactions and joins work exactly as before and no new distributed-systems concerns are introduced. It is the simplest option to reason about.
Cons: there is a hard physical and economic ceiling. Eventually you run out of bigger hardware to buy, or the next size up becomes prohibitively expensive. It also does not improve availability: a single, larger server is still a single point of failure, and if it goes down, the whole database is down.
Horizontal scaling — more machines
Before: 1 database server handling all reads and writes
After: 1 primary (writes) + several read replicas (reads),
or several shards, each holding a subset of the data
Pros: much higher scaling ceiling (in principle, keep adding machines), and can improve availability (a replica can be promoted if the primary fails — see the failover question).
Cons: introduces real distributed-systems complexity: replication lag (replicas can serve stale reads), choosing a sharding key (see that question), cross-shard queries and joins that become expensive or impossible, and more operational surface area in general (more machines to monitor and patch, and more failure modes to reason about).
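To make the sharding-key and cross-shard costs concrete, here is a minimal sketch of hash-based shard routing. The shard names and the helper functions `shard_for` and `shards_for_query` are hypothetical, chosen only for illustration; real systems often use consistent hashing or a lookup directory instead of a plain modulo.

```python
import hashlib
from typing import List, Optional

# Hypothetical shard names; a real deployment would map these to connections.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(shard_key: str, shards: List[str] = SHARDS) -> str:
    """Map a shard key to exactly one shard using a stable hash."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would route inconsistently.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def shards_for_query(shard_key: Optional[str],
                     shards: List[str] = SHARDS) -> List[str]:
    """Return the shards a query has to touch."""
    if shard_key is not None:
        # The query filters on the shard key: one shard answers it.
        return [shard_for(shard_key, shards)]
    # No shard key in the query: it must fan out to every shard.
    return list(shards)
```

A query that filters on the shard key touches one machine, while a query that does not has to visit all of them, which is why the choice of key matters so much. Note that with plain modulo hashing, changing the number of shards remaps most keys, so this sketch says nothing about resharding.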
How they typically combine in practice
Most systems scale vertically first, since it is simple, usually cost-effective at moderate sizes, and modern hardware ceilings are quite high. They reach for horizontal scaling only once vertical scaling is exhausted, or when availability requirements demand redundancy regardless of raw capacity needs. Read scaling is usually the first horizontal step (read replicas; see that question), since most application workloads are read-heavy and reads are easier to distribute than writes. Write scaling (sharding) is a bigger architectural commitment, usually reserved for when a single primary genuinely cannot keep up with write volume.
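As an illustration of that first horizontal step, the sketch below routes writes to a single primary and spreads reads across replicas in round-robin order. The `Router` class, its `needs_fresh_read` flag, and the server names are assumptions made for this example, and the statement classification is deliberately naive (it would misroute something like `SELECT ... FOR UPDATE`).

```python
import itertools
from typing import List


class Router:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        # With no replicas, reads fall back to the primary.
        self._read_targets = itertools.cycle(replicas or [primary])

    def target_for(self, sql: str, needs_fresh_read: bool = False) -> str:
        # Naive classification: anything that starts with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if not is_read or needs_fresh_read:
            # Writes, and reads that cannot tolerate replication lag,
            # go to the primary.
            return self.primary
        # Other reads rotate through the replicas.
        return next(self._read_targets)


router = Router("primary", ["replica-1", "replica-2"])
write_target = router.target_for("INSERT INTO orders VALUES (1)")
read_target = router.target_for("SELECT * FROM orders")
```

Adding a replica raises read capacity, but every write still lands on the one primary, which is why replicas alone do not solve a write bottleneck.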
A strong answer recognizes vertical scaling isn't "the naive option to outgrow" — it's often the right first move because of its simplicity, and premature horizontal scaling (sharding a dataset that would fit comfortably on a bigger single server) adds real complexity for no corresponding benefit.
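A rough way to check for premature sharding is to project the dataset size and compare it against the largest single server you could reasonably buy. The function below is only a back-of-the-envelope sketch: `fits_on_single_server`, the linear growth model, and the 50% headroom default are all assumptions for illustration, and storage is just one of several dimensions (CPU, memory, write throughput) a real capacity plan would consider.

```python
def fits_on_single_server(dataset_gb: float,
                          growth_gb_per_year: float,
                          years: int,
                          max_server_gb: float,
                          headroom: float = 0.5) -> bool:
    """Return True if the projected dataset fits with room to spare.

    headroom is the fraction of the server's capacity we are willing
    to fill (hypothetical default: 50%).
    """
    # Assumes simple linear growth, which is a simplification.
    projected_gb = dataset_gb + growth_gb_per_year * years
    return projected_gb <= max_server_gb * headroom


# Hypothetical numbers: 500 GB today, growing 200 GB per year,
# checked over 3 years against a server with 4000 GB of storage.
can_stay_single = fits_on_single_server(500, 200, 3, 4000)
```

If the check passes comfortably, scaling up is likely the simpler move, and sharding can wait until the numbers say otherwise.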