How do you scale a Node.js application across cores and machines?

3 min · advanced · nodejs · scaling · cluster · pm2 · load-balancing

Quick Answer

A single Node process runs your JavaScript on one CPU core, so use every core on a machine with the cluster module or PM2 (one worker per core, sharing the port), and scale horizontally by running stateless instances behind a load balancer. Keep the app stateless (externalize sessions and cache to Redis) so any instance can serve any request.

Detailed Answer

Answer: Because your JavaScript runs on one thread, scaling means running more Node processes — on the same machine and across machines.

1. Use all cores on one machine — cluster / PM2:

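const cluster = require('node:cluster');
const os = require('node:os');

if (cluster.isPrimary) { // isPrimary needs Node 16+; older versions use isMaster
  // Fork one worker per logical CPU core.
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  // Replace workers that die unexpectedly, but not ones shut down on purpose.
  cluster.on('exit', (worker) => {
    if (!worker.exitedAfterDisconnect) cluster.fork();
  });
} else {
  require('./server'); // each worker runs the HTTP server, sharing the port
}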

In practice, PM2 does this for you (pm2 start app.js -i max) and adds restarts, zero-downtime reloads, and monitoring.

2. Scale horizontally — many instances behind a load balancer:

  • Run N containers or VMs and put a load balancer in front: Nginx, a cloud LB, or a Kubernetes Service/Ingress.
  • The orchestrator (for example Kubernetes) handles health checks, rolling deploys, and autoscaling.
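
Health checks need something to probe on each instance. A minimal sketch of such an endpoint follows; the /healthz path, the state.ready flag, and the function names are assumptions made for illustration, not a fixed convention.

```javascript
const http = require('node:http');

// Hypothetical readiness flag: set to false when the instance starts draining,
// so the load balancer stops routing new requests to it.
const state = { ready: true };

function handleRequest(req, res) {
  if (req.url === '/healthz') {
    const code = state.ready ? 200 : 503;
    res.writeHead(code, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: state.ready ? 'ok' : 'draining' }));
    return;
  }
  // Including the pid makes it visible which instance served the request.
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end(`handled by pid ${process.pid}\n`);
}

// Build the server without listening; the caller chooses the port.
function createServer() {
  return http.createServer(handleRequest);
}
```

Each instance would call createServer().listen(port) with whatever port the deployment assigns, and the load balancer or orchestrator would be configured to poll /healthz.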

3. The enabling requirement — statelessness:

  • Don't keep session state, caches, or uploaded files in a single process's memory (another instance won't have it).
  • Externalize state: sessions/cache → Redis, files → object storage (S3), pub/sub across instances → Redis/message broker.
  • With stateless instances, any request can hit any instance — the basis of easy horizontal scaling.
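
To make the statelessness point concrete, here is a small sketch in which two "instances" share one external session store. FakeSharedStore is an in-memory stand-in used only so the example runs without a Redis server; in a real deployment it would be replaced by a Redis client offering similar async get/set calls. All names here are hypothetical.

```javascript
// Stand-in for Redis (an assumption for this sketch): same async get/set shape,
// but backed by a Map so the example needs no external service.
class FakeSharedStore {
  constructor() {
    this.data = new Map();
  }
  async get(key) {
    return this.data.has(key) ? this.data.get(key) : null;
  }
  async set(key, value) {
    this.data.set(key, value);
  }
}

// An "instance" holds no session data itself, only a reference to the store.
function createInstance(name, store) {
  return {
    async login(sessionId, user) {
      await store.set(`sess:${sessionId}`, JSON.stringify({ user }));
    },
    async whoAmI(sessionId) {
      const raw = await store.get(`sess:${sessionId}`);
      return raw ? { ...JSON.parse(raw), servedBy: name } : null;
    },
  };
}
```

If instance A handles the login and instance B handles the next request, B still finds the session because it lives in the shared store rather than in A's memory.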

4. Other levers:

  • Offload CPU work to worker_threads or a background job queue so the request path stays responsive.
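  • Add caching and a CDN to reduce the load that reaches the app.
  • Implement graceful shutdown (stop accepting new connections, finish in-flight requests, then exit) so rolling deploys don't drop requests.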
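
As an illustration of the first lever, the sketch below moves a CPU-bound loop into a worker thread. The prime-counting task and the countPrimesInWorker helper are made up for the example, and the worker source is inlined as a string only to keep the sketch in one file; a real project would more likely put it in its own module.

```javascript
const { Worker } = require('node:worker_threads');

// Worker code, inlined as a string so the sketch fits in a single file.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // Deliberately CPU-bound: count primes below workerData.limit by trial division.
  let count = 0;
  for (let n = 2; n < workerData.limit; n++) {
    let isPrime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { isPrime = false; break; }
    }
    if (isPrime) count++;
  }
  parentPort.postMessage(count);
`;

// Hypothetical helper: runs the computation off the main thread and resolves
// with the result, so the event loop stays free to serve requests.
function countPrimesInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { limit } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}
```

A request handler can await countPrimesInWorker(limit) while other requests keep being served. For heavy traffic, a fixed pool of workers or a job queue avoids spawning a new thread per request.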

Interview summary: cluster/PM2 for cores, load-balanced stateless instances for machines, Redis/object-storage to externalize state.