Databases & ORMs

Difficulty

Answer: The choice depends on your data shape, consistency requirements, and access patterns, not on Node itself (Node has mature drivers for both).

Relational / SQL (PostgreSQL, MySQL):

  • Structured data with clear relationships; enforced schema.
  • Strong ACID transactions and joins.
  • Great when data integrity and complex queries matter (finance, orders, anything relational).
SELECT u.name, COUNT(o.id) AS orders
FROM users u JOIN orders o ON o.user_id = u.id
GROUP BY u.id;

Document / NoSQL (MongoDB):

  • Flexible, evolving schemas; store denormalized documents matching how you read them.
  • Horizontal scaling and high write throughput for simple access patterns.
  • Good for content, catalogs, event logs, rapidly changing shapes.
db.orders.find({ userId, status: 'paid' });

Decision factors:

FactorLean SQLLean NoSQL
Data structurerelational, uniformflexible, nested, varied
Transactions/joinsneededrarely needed
Schema stabilitystableevolving fast
Scale patternvertical + read replicashorizontal sharding
Query complexitycomplex, ad-hocknown, document-shaped

Reality check:

  • Postgres now has strong JSONB support, so "I need flexible fields" isn't automatically NoSQL.
  • Many systems use both (polyglot persistence): Postgres for core entities, Redis for cache, a document/search store for specific needs.
  • Pick based on the dominant access pattern; don't choose NoSQL just to avoid schema design.

Answer:

The spectrum, from most control to most abstraction:

  1. Raw driver (pg, mysql2, mongodb) — you write SQL/queries yourself.
  2. Query builder (Knex) — compose SQL with a fluent JS API.
  3. ORM (Prisma, Sequelize, TypeORM, Mongoose) — map tables/collections to objects/models, generate queries, manage migrations.

Raw driver:

const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);

Max control and performance, but lots of boilerplate and manual mapping.

ORMs compared:

ToolStyleNotes
Prismaschema-first, generated clientExcellent TypeScript types, great DX, explicit migrations; less flexible for very complex/raw SQL (has $queryRaw escape hatch).
Sequelizemature, model-basedWidely used, lots of features; typing historically weaker.
TypeORMdecorator/entity basedActive Record or Data Mapper; TS-native; some rough edges.
MongooseODM for MongoDBSchemas/validation on top of a schemaless DB.

Trade-offs:

  • Pros: productivity, migrations, validation, type safety (Prisma especially), protection from injection via parameterization.
  • Cons: abstraction can generate inefficient SQL, hide the N+1 problem, and make complex queries awkward — you sometimes drop to raw SQL anyway.

Guidance: Prisma is a strong default for new TypeScript apps (types + DX). Use a query builder (Knex) when you want SQL control with less magic, and the raw driver for hot paths or very complex queries. Whatever you pick, understand the SQL it generates.

Answer: Establishing a database connection involves a TCP handshake, authentication, and setup — too slow to do per request. A connection pool maintains a set of open connections that are borrowed for a query and returned for reuse.

How it works:

  • The pool opens up to max connections.
  • A query acquires an idle connection, runs, and releases it back.
  • If all are busy, requests queue until one frees up (or time out).
const { Pool } = require('pg');
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,                    // max connections
  idleTimeoutMillis: 30000,   // close idle ones
  connectionTimeoutMillis: 5000,
});

// Reuse the pool for every query — do NOT create a client per request
const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);

Why it matters:

  • Performance: amortizes connection setup across many queries.
  • Resource protection: databases have a hard connection limit (e.g., Postgres default ~100). Without a cap, a spike could open thousands of connections and crash the DB.
  • Concurrency control: the pool naturally throttles DB load.

Common pitfalls:

  • Creating a new pool/client per request — defeats pooling and exhausts connections. Create one shared pool at startup.
  • Not releasing a manually-acquired client (e.g., forgetting client.release() in a finally) — leaks connections until the pool is exhausted and requests hang.
  • Sizing: with multiple app instances, total connections = pool size × instances — size against the DB's limit. For serverless (many short-lived instances), use an external pooler like PgBouncer or a serverless driver.

Answer: A transaction makes several operations succeed or fail together, preserving data integrity. It provides ACID guarantees: Atomicity, Consistency, Isolation, Durability.

Classic example — money transfer (both updates must apply or neither):

const client = await pool.connect();
try {
  await client.query('BEGIN');
  await client.query('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [amt, from]);
  await client.query('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [amt, to]);
  await client.query('COMMIT');
} catch (err) {
  await client.query('ROLLBACK');   // undo everything on any failure
  throw err;
} finally {
  client.release();                 // always return the connection
}

Key points:

  • Use one connection for the whole transaction — you can't spread BEGIN/COMMIT across pooled connections.
  • Always ROLLBACK on error and release() in finally.

With an ORM (Prisma):

await prisma.$transaction([
  prisma.account.update({ where: { id: from }, data: { balance: { decrement: amt } } }),
  prisma.account.update({ where: { id: to },   data: { balance: { increment: amt } } }),
]);

Isolation levels (READ COMMITTED, REPEATABLE READ, SERIALIZABLE) control what concurrent transactions can see, trading strictness for concurrency. Higher levels prevent anomalies (dirty/non-repeatable/phantom reads) but can cause serialization failures you must retry.

When you need them: any multi-step write that must be consistent — orders + inventory, transfers, "create user + profile + audit log." Skipping transactions here leads to partial writes and corrupted state.

Answer: Two of the most common database issues in Node apps — one security, one performance.

1. Injection — use parameterized queries: Never build queries by string interpolation with user input:

// ❌ SQL injection — input like "1; DROP TABLE users" is catastrophic
db.query(`SELECT * FROM users WHERE email = '${email}'`);

// ✅ Parameterized — the driver sends data separately from the query
db.query('SELECT * FROM users WHERE email = $1', [email]);
  • Parameterization ensures input is treated as data, never executable SQL.
  • ORMs/query builders parameterize by default — but their raw escape hatches ($queryRawUnsafe, string concatenation) reintroduce the risk.
  • NoSQL is also vulnerable (NoSQL injection): validate/cast input so an attacker can't pass an object like { "$gt": "" } where you expect a string.

2. The N+1 query problem: Fetching a list, then querying related data per item, produces 1 + N queries:

// ❌ N+1: 1 query for posts, then 1 per post for its author
const posts = await getPosts();               // 1 query
for (const p of posts) {
  p.author = await getUser(p.authorId);       // N queries
}

Fix by loading related data in one go:

// ✅ Single JOIN
SELECT posts.*, users.name FROM posts JOIN users ON users.id = posts.author_id;

// ✅ ORM eager-loading (Prisma)
const posts = await prisma.post.findMany({ include: { author: true } });

// ✅ Batch the lookups
const authors = await getUsersByIds(posts.map(p => p.authorId)); // 1 query
  • In GraphQL, use DataLoader to batch and cache per-request lookups.

Why it matters: N+1 turns a fast endpoint into hundreds of round-trips under load; injection is a top OWASP risk that can leak or destroy the whole database.