SQL vs NoSQL — how do you choose for a Node.js application?

Choose SQL (PostgreSQL, MySQL) when you have structured, relational data needing consistency, joins, and transactions. Choose NoSQL (MongoDB, etc.) for flexible/evolving schemas, denormalized documents, or horizontal scale of simple access patterns. The decision is driven by data shape, consistency needs, and query/access patterns — not language.

What is an ORM or query builder? Compare Prisma, Sequelize, TypeORM, and the raw driver.

An ORM maps database rows to JavaScript objects and generates SQL for you; a query builder gives a programmatic API to compose SQL without full object mapping. They trade some control and performance for productivity, type safety, and migrations. The raw driver gives maximum control but the most boilerplate.

What is connection pooling and why does it matter in Node?

Opening a DB connection is expensive (TCP + auth handshake). A pool keeps a set of reusable open connections that requests borrow and return, avoiding per-request connection cost and capping concurrent connections so you don't overwhelm the database. In Node you configure a single shared pool, not a connection per query.

How do transactions work, and how do you use them in Node?

A transaction groups multiple statements into an all-or-nothing unit (BEGIN → COMMIT, or ROLLBACK on error), giving ACID guarantees. In Node you acquire a single connection, run the statements on it, and commit or roll back — critical for multi-step operations like transfers where partial success would corrupt data.

How do you prevent injection attacks and the N+1 query problem?

Prevent injection by never concatenating user input into queries — use parameterized queries/prepared statements (or ORM methods that parameterize). Fix N+1 (one query per item in a loop) by eager-loading related data with a join or a single batched query (ORM include/JOIN or DataLoader).

How do you model data and enforce schemas with Mongoose/MongoDB?

MongoDB stores schemaless BSON documents; Mongoose adds an application-level schema layer with types, validation, defaults, and middleware. You model data around access patterns — embedding related data for read-together documents, referencing for large or independently-queried data.

Databases & ORMs

Persisting data from Node — choosing SQL vs NoSQL, ORMs and query builders (Prisma, Sequelize, TypeORM, Mongoose), connection pooling, transactions, and avoiding injection and N+1 queries.

Difficulty

Open as page

Answer: The choice depends on your data shape, consistency requirements, and access patterns, not on Node itself (Node has mature drivers for both).

Relational / SQL (PostgreSQL, MySQL):

Structured data with clear relationships; enforced schema.
Strong ACID transactions and joins.
Great when data integrity and complex queries matter (finance, orders, anything relational).

SELECT u.name, COUNT(o.id) AS orders
FROM users u JOIN orders o ON o.user_id = u.id
GROUP BY u.id;

Document / NoSQL (MongoDB):

Flexible, evolving schemas; store denormalized documents matching how you read them.
Horizontal scaling and high write throughput for simple access patterns.
Good for content, catalogs, event logs, rapidly changing shapes.

db.orders.find({ userId, status: 'paid' });

Decision factors:

Factor	Lean SQL	Lean NoSQL
Data structure	relational, uniform	flexible, nested, varied
Transactions/joins	needed	rarely needed
Schema stability	stable	evolving fast
Scale pattern	vertical + read replicas	horizontal sharding
Query complexity	complex, ad-hoc	known, document-shaped

Reality check:

Postgres now has strong JSONB support, so "I need flexible fields" isn't automatically NoSQL.
Many systems use both (polyglot persistence): Postgres for core entities, Redis for cache, a document/search store for specific needs.
Pick based on the dominant access pattern; don't choose NoSQL just to avoid schema design.

Open as page

Answer:

The spectrum, from most control to most abstraction:

Raw driver (pg, mysql2, mongodb) — you write SQL/queries yourself.
Query builder (Knex) — compose SQL with a fluent JS API.
ORM (Prisma, Sequelize, TypeORM, Mongoose) — map tables/collections to objects/models, generate queries, manage migrations.

Raw driver:

const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);

Max control and performance, but lots of boilerplate and manual mapping.

ORMs compared:

Tool	Style	Notes
Prisma	schema-first, generated client	Excellent TypeScript types, great DX, explicit migrations; less flexible for very complex/raw SQL (has `$queryRaw` escape hatch).
Sequelize	mature, model-based	Widely used, lots of features; typing historically weaker.
TypeORM	decorator/entity based	Active Record or Data Mapper; TS-native; some rough edges.
Mongoose	ODM for MongoDB	Schemas/validation on top of a schemaless DB.

Trade-offs:

Pros: productivity, migrations, validation, type safety (Prisma especially), protection from injection via parameterization.
Cons: abstraction can generate inefficient SQL, hide the N+1 problem, and make complex queries awkward — you sometimes drop to raw SQL anyway.

Guidance: Prisma is a strong default for new TypeScript apps (types + DX). Use a query builder (Knex) when you want SQL control with less magic, and the raw driver for hot paths or very complex queries. Whatever you pick, understand the SQL it generates.

Open as page

Answer: Establishing a database connection involves a TCP handshake, authentication, and setup — too slow to do per request. A connection pool maintains a set of open connections that are borrowed for a query and returned for reuse.

How it works:

The pool opens up to max connections.
A query acquires an idle connection, runs, and releases it back.
If all are busy, requests queue until one frees up (or time out).

const { Pool } = require('pg');
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,                    // max connections
  idleTimeoutMillis: 30000,   // close idle ones
  connectionTimeoutMillis: 5000,
});

// Reuse the pool for every query — do NOT create a client per request
const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);

Why it matters:

Performance: amortizes connection setup across many queries.
Resource protection: databases have a hard connection limit (e.g., Postgres default ~100). Without a cap, a spike could open thousands of connections and crash the DB.
Concurrency control: the pool naturally throttles DB load.

Common pitfalls:

Creating a new pool/client per request — defeats pooling and exhausts connections. Create one shared pool at startup.
Not releasing a manually-acquired client (e.g., forgetting client.release() in a finally) — leaks connections until the pool is exhausted and requests hang.
Sizing: with multiple app instances, total connections = pool size × instances — size against the DB's limit. For serverless (many short-lived instances), use an external pooler like PgBouncer or a serverless driver.

Open as page

Answer: A transaction makes several operations succeed or fail together, preserving data integrity. It provides ACID guarantees: Atomicity, Consistency, Isolation, Durability.

Classic example — money transfer (both updates must apply or neither):

const client = await pool.connect();
try {
  await client.query('BEGIN');
  await client.query('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [amt, from]);
  await client.query('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [amt, to]);
  await client.query('COMMIT');
} catch (err) {
  await client.query('ROLLBACK');   // undo everything on any failure
  throw err;
} finally {
  client.release();                 // always return the connection
}

Key points:

Use one connection for the whole transaction — you can't spread BEGIN/COMMIT across pooled connections.
Always ROLLBACK on error and release() in finally.

With an ORM (Prisma):

await prisma.$transaction([
  prisma.account.update({ where: { id: from }, data: { balance: { decrement: amt } } }),
  prisma.account.update({ where: { id: to },   data: { balance: { increment: amt } } }),
]);

Isolation levels (READ COMMITTED, REPEATABLE READ, SERIALIZABLE) control what concurrent transactions can see, trading strictness for concurrency. Higher levels prevent anomalies (dirty/non-repeatable/phantom reads) but can cause serialization failures you must retry.

When you need them: any multi-step write that must be consistent — orders + inventory, transfers, "create user + profile + audit log." Skipping transactions here leads to partial writes and corrupted state.

Open as page

Answer: Two of the most common database issues in Node apps — one security, one performance.

1. Injection — use parameterized queries: Never build queries by string interpolation with user input:

// ❌ SQL injection — input like "1; DROP TABLE users" is catastrophic
db.query(`SELECT * FROM users WHERE email = '${email}'`);

// ✅ Parameterized — the driver sends data separately from the query
db.query('SELECT * FROM users WHERE email = $1', [email]);

Parameterization ensures input is treated as data, never executable SQL.
ORMs/query builders parameterize by default — but their raw escape hatches ($queryRawUnsafe, string concatenation) reintroduce the risk.
NoSQL is also vulnerable (NoSQL injection): validate/cast input so an attacker can't pass an object like { "$gt": "" } where you expect a string.

2. The N+1 query problem: Fetching a list, then querying related data per item, produces 1 + N queries:

// ❌ N+1: 1 query for posts, then 1 per post for its author
const posts = await getPosts();               // 1 query
for (const p of posts) {
  p.author = await getUser(p.authorId);       // N queries
}

Fix by loading related data in one go:

// ✅ Single JOIN
SELECT posts.*, users.name FROM posts JOIN users ON users.id = posts.author_id;

// ✅ ORM eager-loading (Prisma)
const posts = await prisma.post.findMany({ include: { author: true } });

// ✅ Batch the lookups
const authors = await getUsersByIds(posts.map(p => p.authorId)); // 1 query

In GraphQL, use DataLoader to batch and cache per-request lookups.

Why it matters: N+1 turns a fast endpoint into hundreds of round-trips under load; injection is a top OWASP risk that can leak or destroy the whole database.

Open as page

Answer: MongoDB itself is schemaless — documents in a collection can differ. Mongoose is an ODM that imposes structure, validation, and convenience at the application level.

Defining a schema and model:

const { Schema, model } = require('mongoose');

const userSchema = new Schema({
  email: { type: String, required: true, unique: true, lowercase: true },
  name:  { type: String, required: true },
  role:  { type: String, enum: ['user', 'admin'], default: 'user' },
  createdAt: { type: Date, default: Date.now },
});

const User = model('User', userSchema);
await User.create({ email: 'a@b.com', name: 'Alice' });

Mongoose enforces types, required, enum, unique (via an index), defaults, and custom validators — and can run middleware/hooks (e.g., hash a password on pre('save')).

Modeling: embed vs reference — driven by access patterns:

Embed related data you read together and that's bounded:

// An order with its line items embedded — fetched in one read
{ _id, userId, items: [{ sku, qty, price }], total }

Reference when data is large, shared, or queried independently:

const postSchema = new Schema({ author: { type: Schema.Types.ObjectId, ref: 'User' } });
const posts = await Post.find().populate('author'); // join-like lookup

Trade-offs:

Embedding → fewer queries, atomic updates of the whole document, but risks unbounded growth and duplication.
Referencing → normalized, but needs populate/lookups (watch for N+1).

Key idea: in document databases you model around how you read the data, often denormalizing for read performance — the opposite instinct from normalized SQL design.