Security, Performance & Scaling

Answer: Node apps face the usual web risks plus a large supply-chain surface via node_modules.

Common vulnerabilities and fixes:

  1. Injection (SQL/NoSQL/command): never build queries by string concatenation.
// ❌ SQL injection
db.query(`SELECT * FROM users WHERE id = ${req.params.id}`);
// ✅ Parameterized
db.query('SELECT * FROM users WHERE id = $1', [req.params.id]);

For child_process, avoid exec with user input; use execFile/spawn with an args array.
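
As a minimal sketch of that advice, the snippet below passes untrusted input as its own argv entry instead of interpolating it into a shell string. The helper name echoArg and the child script are assumptions for illustration; the child is simply Node echoing back its first argument.

```javascript
const { execFileSync } = require('node:child_process');

// Hypothetical helper: runs a child process with user input passed as a
// discrete argument. No shell is spawned, so metacharacters such as ';' or
// '&&' are never interpreted.
function echoArg(userInput) {
  return execFileSync(
    process.execPath, // the current node binary, used here only as a harmless child
    ['-e', 'process.stdout.write(process.argv[1])', userInput],
    { encoding: 'utf8' }
  );
}

// The "injected" command comes back as plain text instead of being executed.
console.log(echoArg('hello; echo injected'));
```

The same input interpolated into an exec() string would be handed to a shell, which would run the part after the semicolon as a second command.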

  2. Cross-site scripting (XSS): escape/encode output; sanitize any user-supplied HTML before rendering it (e.g., with DOMPurify); set a Content-Security-Policy.

  3. Insecure dependencies (supply chain): run npm audit, pin versions with a lockfile, install with npm ci, and use tools like Dependabot/Snyk. Many vulnerabilities arrive through transitive dependencies rather than the packages you install directly.

  4. Secrets in code: keep credentials in environment variables / a secrets manager, never in the repo. Add .env to .gitignore.

  5. Missing security headers: use helmet to set sensible defaults (HSTS, X-Content-Type-Options, CSP, etc.).

app.use(require('helmet')());

  6. Unvalidated input / mass assignment: validate and allowlist fields (zod, joi); don't blindly spread req.body into DB models.

  7. CORS & rate limiting: restrict origins with the cors middleware; throttle with express-rate-limit to blunt brute-force/DoS.

  8. ReDoS / large payloads: cap body size (express.json({ limit: '100kb' })), avoid catastrophic regexes.
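
The mass-assignment item above can be sketched without any library: copy only the fields you expect and drop everything else. The field names and the helper pickAllowed are hypothetical; a schema validator such as zod or joi would additionally check types and formats.

```javascript
// Hypothetical allowlist for a "user profile" update; the field names are
// assumptions for illustration.
const ALLOWED_FIELDS = ['name', 'email'];

// Copy only allowlisted keys that the client actually sent.
function pickAllowed(body, allowed = ALLOWED_FIELDS) {
  const clean = {};
  for (const key of allowed) {
    if (Object.prototype.hasOwnProperty.call(body, key)) {
      clean[key] = body[key];
    }
  }
  return clean;
}

// A client tries to escalate privileges by sending an extra field.
const body = { name: 'Ada', email: 'ada@example.com', isAdmin: true };
console.log(pickAllowed(body)); // { name: 'Ada', email: 'ada@example.com' }
```

Passing pickAllowed(req.body) to the model, rather than req.body itself, means a new sensitive column added later is not writable by default.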

Baseline checklist: HTTPS everywhere, helmet, input validation, parameterized queries, npm audit in CI, secrets in env, least-privilege DB users, and up-to-date Node.

Answer:

Password storage — hash, never encrypt or store plaintext: Use a slow, salted, adaptive hash (bcrypt, scrypt, or argon2) so brute force is expensive:

const bcrypt = require('bcrypt');

// On signup
const hash = await bcrypt.hash(password, 12); // 12 = cost factor

// On login
const ok = await bcrypt.compare(password, user.hash);

Bcrypt salts automatically and lets you raise the cost over time. Never use fast hashes (MD5/SHA-256) for passwords.
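
If adding a dependency is not an option, scrypt (one of the algorithms named above) ships with Node's crypto module. The sketch below is one possible layout, not a drop-in replacement for bcrypt: the "salt:hash" storage format, the key length, and the helper names are assumptions, and default scrypt cost parameters are used.

```javascript
const { randomBytes, scryptSync, timingSafeEqual } = require('node:crypto');

const KEY_LENGTH = 64; // bytes of derived key; an assumed size for this sketch

// Returns "salt:hash" (both hex) so the per-password salt is stored with the hash.
function hashPassword(password) {
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, KEY_LENGTH);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Re-derives the key with the stored salt and compares in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return timingSafeEqual(actual, expected);
}

const stored = hashPassword('correct horse');
console.log(verifyPassword('correct horse', stored)); // true
console.log(verifyPassword('wrong guess', stored));   // false
```

scryptSync blocks the event loop while it runs, so a server would more likely use the callback-based crypto.scrypt; the synchronous form just keeps the example short.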

Authentication approaches:

Sessions (stateful):

  • Server stores session data (in Redis/DB); the client holds only a session id in a cookie.
  • Pros: easy to revoke (delete the session), small cookie, server controls state.
  • Cons: needs shared session storage when scaling to multiple instances.

JWT (stateless):

  • A signed token (header.payload.signature) containing claims; the client sends it (usually Authorization: Bearer) on each request. The server verifies the signature — no lookup needed.
  • Pros: stateless → scales horizontally without shared session store; works well across services.
  • Cons: hard to revoke before expiry; token bloat; you must protect the secret and keep expiry short.
const jwt = require('jsonwebtoken');
const token = jwt.sign({ sub: user.id }, process.env.JWT_SECRET, { expiresIn: '15m' });
const payload = jwt.verify(token, process.env.JWT_SECRET);
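
To make the header.payload.signature structure concrete, here is a hand-rolled HS256 sketch built only on Node's crypto module. It is meant to show roughly what a JWT library does, not to replace one: signToken and verifyToken are hypothetical names, only the exp claim is checked, and the header's alg field is not validated.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

const b64url = (value) => Buffer.from(value).toString('base64url');

// Builds header.payload.signature, signing the first two parts with HMAC-SHA256.
function signToken(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  const signature = createHmac('sha256', secret).update(`${header}.${body}`).digest('base64url');
  return `${header}.${body}.${signature}`;
}

// Recomputes the signature and checks expiry; returns the claims or throws.
function verifyToken(token, secret) {
  const [header, body, signature] = token.split('.');
  const expected = createHmac('sha256', secret).update(`${header}.${body}`).digest();
  const given = Buffer.from(signature || '', 'base64url');
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) {
    throw new Error('invalid signature');
  }
  const claims = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  if (typeof claims.exp === 'number' && claims.exp <= Math.floor(Date.now() / 1000)) {
    throw new Error('token expired');
  }
  return claims;
}

const demoSecret = 'demo-secret'; // assumption: real code reads process.env.JWT_SECRET
const demoToken = signToken({ sub: 'user-1', exp: Math.floor(Date.now() / 1000) + 900 }, demoSecret);
console.log(verifyToken(demoToken, demoSecret).sub); // user-1
```

Note that the payload is only encoded, not encrypted: anyone holding the token can read the claims, so nothing secret belongs in them.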

Common production pattern: short-lived access JWT + longer-lived refresh token stored server-side (revocable), combining scalability with control.

Also: store tokens safely (httpOnly, Secure cookies mitigate XSS token theft), enforce HTTPS, and rate-limit auth endpoints.

Answer:

Why cache: avoid repeating expensive work — DB queries, external API calls, heavy computation — for data that's read often and changes rarely.

Layers of caching:

  1. In-process memory (a Map or lru-cache): fastest, zero network hop.
    • ❌ Not shared across instances; lost on restart; can bloat memory. Fine for small, hot, per-instance data.
  2. Distributed cache (Redis/Memcached): shared across all instances, survives restarts, supports TTLs and rich structures.
async function getUser(id) {
  const cached = await redis.get(`user:${id}`);
  if (cached) return JSON.parse(cached);

  const user = await db.getUser(id);
  await redis.set(`user:${id}`, JSON.stringify(user), 'EX', 300); // 5 min TTL
  return user;
}
  3. HTTP caching (Cache-Control/ETag) and a CDN for static or public responses, which offloads the work entirely.
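
For the HTTP caching layer, the core of ETag revalidation fits in a few lines. This is a simplified sketch: etagFor and respond are hypothetical helpers, the max-age value is arbitrary, and only an exact, single-value If-None-Match header is handled (no weak validators or comma-separated lists).

```javascript
const { createHash } = require('node:crypto');

// Derives an ETag from the response body (quoted, as the header syntax requires).
function etagFor(body) {
  return `"${createHash('sha256').update(body).digest('base64url')}"`;
}

// Decides between 200 with a body and 304 without one.
function respond(requestHeaders, body) {
  const etag = etagFor(body);
  const headers = { ETag: etag, 'Cache-Control': 'public, max-age=60' };
  if (requestHeaders['if-none-match'] === etag) {
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body };
}

const first = respond({}, '{"plan":"free"}');
const second = respond({ 'if-none-match': first.headers.ETag }, '{"plan":"free"}');
console.log(first.status, second.status); // 200 304
```

The 304 path still costs a request, but it skips sending the body; Cache-Control max-age is what lets browsers and CDNs skip the request altogether.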

Invalidation — the hard part:

  • TTL/expiry — simplest; accept slightly stale data for a bounded window.
  • Write-through / explicit invalidation — delete or update the cache key when the underlying data changes.
async function updateUser(id, data) {
  const user = await db.updateUser(id, data);
  await redis.del(`user:${id}`);   // invalidate on write
  return user;
}

Watch out for:

  • Stale data — pick TTLs that match tolerance for staleness.
  • Cache stampede — many misses hitting the DB at once when a hot key expires (mitigate with locks/"single-flight" or jittered TTLs).
  • Don't cache per-user sensitive data in shared/public caches.
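
The two stampede mitigations mentioned above can be sketched as follows. singleFlight and jitteredTtl are hypothetical helper names, and the in-memory Map only deduplicates within one process; coordinating across instances would need something like a distributed lock.

```javascript
const inFlight = new Map(); // key -> promise for a load that is already running

// Concurrent callers for the same key share one loader call instead of each
// hitting the database.
function singleFlight(key, loader) {
  if (inFlight.has(key)) return inFlight.get(key);
  const promise = new Promise((resolve) => resolve(loader()))
    .finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}

// Spreads expiries over [base, base + spread) seconds so hot keys written at
// the same moment do not all expire together.
function jitteredTtl(baseSeconds, spreadSeconds) {
  return baseSeconds + Math.floor(Math.random() * spreadSeconds);
}

// Usage with a stand-in loader (hypothetical; a real one would query the DB).
let dbCalls = 0;
const loadUser = () => new Promise((resolve) => {
  dbCalls += 1;
  setTimeout(() => resolve({ id: 42 }), 10);
});

Promise.all([
  singleFlight('user:42', loadUser),
  singleFlight('user:42', loadUser),
]).then(([a, b]) => console.log(dbCalls, a === b)); // 1 true
```

In a cache-aside function the loader would be the database read plus the cache write, with jitteredTtl(300, 60) standing in for a fixed 300-second expiry.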

Rule: cache read-heavy, change-rarely data; always have an invalidation strategy (at minimum a TTL).

Answer: Because your JavaScript runs on one thread, scaling means running more Node processes — on the same machine and across machines.

1. Use all cores on one machine — cluster / PM2:

const cluster = require('cluster');
const os = require('os');
if (cluster.isPrimary) {
  os.cpus().forEach(() => cluster.fork()); // one worker per core
  cluster.on('exit', () => cluster.fork()); // restart crashed workers
} else {
  require('./server'); // each worker runs the HTTP server, sharing the port
}

In practice, PM2 does this for you (pm2 start app.js -i max) and adds restarts, zero-downtime reloads, and monitoring.

2. Scale horizontally — many instances behind a load balancer:

  • Run N containers/VMs; put Nginx / a cloud LB / Kubernetes in front.
  • The orchestrator handles health checks, rolling deploys, and autoscaling.

3. The enabling requirement — statelessness:

  • Don't keep session state, caches, or uploaded files in a single process's memory (another instance won't have it).
  • Externalize state: sessions/cache → Redis, files → object storage (S3), pub/sub across instances → Redis/message broker.
  • With stateless instances, any request can hit any instance — the basis of easy horizontal scaling.

4. Other levers:

  • Offload CPU work to worker_threads or a background job queue so the request path stays responsive.
  • Cache and a CDN to reduce load.
  • Graceful shutdown so rolling deploys drop zero requests.
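
Graceful shutdown is the lever that is easiest to get subtly wrong, so here is a minimal sketch. The helper names and the 10-second grace period are assumptions; depending on the Node version, idle keep-alive connections may delay close(), which is why a timeout is included.

```javascript
const http = require('node:http');

// Stops accepting new connections, lets in-flight requests finish, and
// resolves once the server has closed. After timeoutMs (an assumed grace
// period) the promise rejects so the caller can force an exit.
function shutdown(server, timeoutMs = 10_000) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('shutdown timed out')), timeoutMs);
    timer.unref(); // do not keep the process alive just for the timer
    server.close((err) => {
      clearTimeout(timer);
      if (err) reject(err);
      else resolve();
    });
  });
}

// Wires the helper to SIGTERM, the signal orchestrators typically send first.
function installSigtermHandler(server) {
  process.once('SIGTERM', () => {
    shutdown(server).then(
      () => process.exit(0),
      () => process.exit(1)
    );
  });
}

const server = http.createServer((req, res) => res.end('ok'));
installSigtermHandler(server);
// server.listen(3000); // left commented so the sketch does not open a port
```

A fuller version would also close database pools and flush queues before exiting, and fail the readiness probe first so the load balancer stops routing new traffic.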

Interview summary: cluster/PM2 for cores, load-balanced stateless instances for machines, Redis/object-storage to externalize state.

Answer: Optimize based on measurement, not guesses.

1. Profile CPU usage:

  • Run node --prof app.js, then node --prof-process isolate-*.log to turn the generated V8 log into a text report.
  • node --inspect + Chrome DevTools, or clinic.js (clinic flame, clinic doctor) for flame graphs.
  • Look for hot functions and synchronous work on the request path.

2. Monitor event-loop lag: High lag means something is blocking the loop.

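const { monitorEventLoopDelay } = require('perf_hooks');
const h = monitorEventLoopDelay(); h.enable();
setInterval(() => {
  console.log('loop p99 (ms):', h.percentile(99) / 1e6); // histogram values are in nanoseconds
  h.reset(); // report each 5 s window on its own instead of since startup
}, 5000);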

3. Track memory / find leaks:

  • Watch process.memoryUsage() (RSS/heapUsed) over time; steady growth suggests a leak.
  • Take heap snapshots in DevTools and compare to find retained objects (common causes: unbounded caches/Maps, un-removed event listeners, closures holding large data, growing global arrays).
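
Watching memory over time can be as simple as logging a few fields at a fixed interval and charting them. The helper names and the 30-second default below are assumptions for illustration.

```javascript
const toMb = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;

// Takes one sample of the numbers worth charting over time.
function sampleMemory() {
  const { rss, heapUsed, heapTotal, external } = process.memoryUsage();
  return {
    rssMb: toMb(rss),
    heapUsedMb: toMb(heapUsed),
    heapTotalMb: toMb(heapTotal),
    externalMb: toMb(external),
  };
}

// Logs a sample every intervalMs; unref() keeps the timer from holding the
// process open. Returns the timer so callers can clearInterval() it.
function startMemoryLog(intervalMs = 30_000, log = console.log) {
  const timer = setInterval(() => log('memory', sampleMemory()), intervalMs);
  timer.unref();
  return timer;
}

console.log(sampleMemory());
```

If heapUsedMb keeps climbing across garbage collections, that is the point to capture heap snapshots (in DevTools, or programmatically with v8.writeHeapSnapshot()) and diff them.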

Common bottlenecks and fixes:

Symptom | Likely cause | Fix
------- | ------------ | ---
High event-loop lag | CPU work on main thread | offload to worker_threads/queue; chunk work
Slow endpoints, high DB load | N+1 queries, missing indexes | batch/join queries, add indexes, cache
Memory grows unbounded | leak (caches/listeners/closures) | bound caches (LRU+TTL), remove listeners
High memory on big payloads | buffering large data | use streams
Latency spikes under load | thread-pool contention, no pooling | raise UV_THREADPOOL_SIZE, use connection pools, limit concurrency

Also: enable gzip/compression, use HTTP keep-alive and DB connection pools, and add caching/CDN. Always re-measure after each change to confirm the win.
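
For the first fix listed (offloading CPU work to worker_threads), a minimal sketch looks like this. The summing loop is a stand-in for real CPU-bound work, sumInWorker is a hypothetical name, and the worker source is inlined with eval: true only to keep the example in one file; a real project would more likely point Worker at a separate file and reuse a pool of workers.

```javascript
const { Worker } = require('node:worker_threads');

// Code that runs on the worker thread: does the heavy loop and posts the result.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Runs the loop off the main thread and resolves with its result.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

// The event loop stays free to serve requests while the worker computes.
sumInWorker(1_000_000).then((sum) => console.log(sum)); // 500000500000
```

Starting a worker has its own cost, so this pays off for work measured in tens of milliseconds or more, not for small per-request computations.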