What are streams in Node, and what are the four types?

Streams process data in chunks over time instead of loading it all into memory. The four types are Readable (source, e.g. fs.createReadStream), Writable (sink, e.g. HTTP response), Duplex (both, e.g. a TCP socket), and Transform (a Duplex that modifies data, e.g. zlib gzip).

What is backpressure, and how do streams handle it?

Backpressure is when a fast Readable produces data quicker than a slow Writable can consume it. `write()` returns false once the buffered data reaches the stream's highWaterMark threshold; stop writing until the 'drain' event fires. `pipe()` and `pipeline()` handle this automatically, which is why they're preferred over manual data/write loops.

What is a Buffer, and when do you use one?

A Buffer is a fixed-length chunk of raw binary memory outside V8's heap, used to handle binary data — file contents, network packets, crypto, image bytes. It's a subclass of Uint8Array with encoding helpers (utf8, hex, base64) to convert to/from strings.

Explain flowing vs paused mode for readable streams. How do you consume a stream?

A Readable stream is in paused mode by default; you pull data with .read() or switch to flowing mode by attaching a 'data' listener or calling .pipe(), where chunks are pushed automatically. Modern code often consumes streams with for-await-of, which handles flow and backpressure cleanly.

When should you use streams instead of reading whole files or buffering data?

Use streams when data is large or unbounded, or when you want to start processing before all data arrives. Streaming keeps memory usage roughly constant and reduces latency, whereas reading everything into a Buffer/string scales memory with payload size and can exhaust RAM under load.

Streams & Buffers

Processing data efficiently — the four stream types, backpressure and piping, Buffers for binary data, flowing vs paused mode, and why streaming beats loading everything into memory.

Answer: A stream is an abstraction for reading or writing data incrementally — piece by piece — rather than holding the whole payload in memory. Streams are EventEmitters.

The four types:

Type	Direction	Examples
Readable	source you read from	`fs.createReadStream`, HTTP request, `process.stdin`
Writable	sink you write to	`fs.createWriteStream`, HTTP response, `process.stdout`
Duplex	both, independent	TCP socket (`net.Socket`)
Transform	Duplex that transforms input→output	`zlib.createGzip`, `crypto` cipher streams

Reading and writing:

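const fs = require('fs');

const rs = fs.createReadStream('input.txt');
const ws = fs.createWriteStream('output.txt');

// Simplest possible copy; it ignores write()'s return value (no backpressure handling)
rs.on('data', chunk => ws.write(chunk));
rs.on('end', () => ws.end());
rs.on('error', err => console.error(err));
ws.on('error', err => console.error(err));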

The idiomatic version — pipe:

const zlib = require('zlib');

fs.createReadStream('input.txt')
  .pipe(zlib.createGzip())          // Transform
  .pipe(fs.createWriteStream('input.txt.gz'));

Why streams matter: they keep memory usage low and roughly constant regardless of file/response size, and they start producing output before all input has arrived (lower latency). They're everywhere in Node: HTTP bodies, file I/O, compression, crypto.
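
To make the Transform row of the table concrete, here is a minimal sketch of a custom Transform. The names createUpperCase and shout are hypothetical helpers invented for this illustration, and the sketch assumes chunk boundaries never split a multi-byte character.

```javascript
const { Readable, Transform } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical helper: a Transform that upper-cases text chunks.
function createUpperCase() {
  return new Transform({
    transform(chunk, encoding, callback) {
      // chunk arrives as a Buffer by default; hand the transformed text downstream
      callback(null, chunk.toString('utf8').toUpperCase());
    },
  });
}

// Hypothetical helper: run a small pipeline and collect its output as one string.
async function shout(parts) {
  const out = [];
  await pipeline(
    Readable.from(parts),   // Readable: the source
    createUpperCase(),      // Transform: modifies data in flight
    async function (source) {
      // Final stage: consume the transformed chunks
      for await (const chunk of source) out.push(chunk.toString('utf8'));
    }
  );
  return out.join('');
}
```

Calling shout(['ab', 'cd']) resolves to 'ABCD'. The same Transform could sit between a file read stream and a file write stream without any changes.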

Answer: Backpressure occurs when a data source outpaces the destination — e.g., reading a file from a fast SSD and writing to a slow network socket. Without handling it, data piles up in memory and can exhaust it.

How Writable streams signal it:

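writable.write(chunk) returns false once the buffered data reaches or exceeds highWaterMark (for byte streams the default was 16 KiB through Node 20 and is 64 KiB from Node 22). The chunk is still accepted; false is a signal, not a rejection.
When you get false, you should stop writing and wait for the 'drain' event before resuming.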
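
The signal is easy to observe with an artificially small threshold. This is a sketch only: the 4-byte highWaterMark and the 10 ms delay are made-up values chosen to trigger the behaviour quickly.

```javascript
const { Writable } = require('stream');

// A deliberately slow sink with a tiny 4-byte highWaterMark (illustrative values).
const slowSink = new Writable({
  highWaterMark: 4,
  write(chunk, encoding, callback) {
    setTimeout(callback, 10); // pretend each write takes 10 ms
  },
});

const first = slowSink.write('ab');    // 2 bytes pending, below the threshold
const second = slowSink.write('cdef'); // 6 bytes pending, at or above the threshold

console.log(first, second); // true false

slowSink.once('drain', () => {
  // Emitted once the pending writes have been flushed
  console.log('drained, safe to write again');
  slowSink.end();
});
```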

Manual handling (illustrative):

readable.on('data', (chunk) => {
  const ok = writable.write(chunk);
  if (!ok) {
    readable.pause();                 // stop reading
    writable.once('drain', () => readable.resume()); // resume when drained
  }
});

The right way — let Node manage it:

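const fs = require('fs');
const zlib = require('zlib');
const { pipeline } = require('stream/promises');

async function gzipLog() {
  // pipeline handles backpressure AND propagates errors + cleans up
  await pipeline(
    fs.createReadStream('huge.log'),
    zlib.createGzip(),
    fs.createWriteStream('huge.log.gz')
  );
}

gzipLog().catch(err => console.error('gzip failed:', err));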

pipe() and pipeline() automatically pause/resume the source based on the destination's readiness.
pipeline() additionally forwards errors and destroys all streams on failure (avoiding leaks) — prefer it over pipe() for anything beyond trivial cases.
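
The error-forwarding and cleanup behaviour can be seen in a small sketch. failingSource and demo are hypothetical names used only for this illustration.

```javascript
const { Readable, Writable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical source that fails after its first chunk.
async function* failingSource() {
  yield 'first chunk';
  throw new Error('disk read failed');
}

async function demo() {
  const sink = new Writable({
    write(chunk, encoding, callback) { callback(); }, // discard the data
  });
  try {
    await pipeline(Readable.from(failingSource()), sink);
    return { failed: false, sinkDestroyed: sink.destroyed };
  } catch (err) {
    // The source's error surfaces here, and pipeline has destroyed the sink too.
    return { failed: true, message: err.message, sinkDestroyed: sink.destroyed };
  }
}
```

With a plain source.pipe(sink), the same failure would leave the sink open and the error would have to be handled on each stream separately.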

Interview point: backpressure is the reason to use pipe/pipeline instead of hand-rolling on('data') + write(); manual loops that ignore write()'s return value and 'drain' let the writable's buffer grow without bound under load.

Answer: A Buffer represents a fixed-length sequence of raw bytes. Because JavaScript strings are UTF-16 and not suited to arbitrary binary data, Node uses Buffers for anything binary: file I/O, TCP packets, cryptography, image/video bytes, protocol parsing.

Creating Buffers:

Buffer.from('hello', 'utf8');      // from a string
Buffer.from([0x68, 0x69]);         // from bytes
Buffer.alloc(10);                  // 10 zero-filled bytes (safe)
Buffer.allocUnsafe(10);            // faster, but may contain old memory — overwrite before use

Encoding conversions:

const buf = Buffer.from('hello');
buf.toString('utf8');   // 'hello'
buf.toString('hex');    // '68656c6c6f'
buf.toString('base64'); // 'aGVsbG8='

Key facts:

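A Buffer is a subclass of Uint8Array, so TypedArray methods work on it. One difference to remember: buf.slice() returns a view over the same memory rather than a copy (buf.subarray() is the preferred spelling).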
It's allocated outside the V8 heap (in C++), so large Buffers don't pressure V8's garbage collector the same way.
Fixed size — you can't grow a Buffer; you allocate a new one or use a stream.
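
A short sketch of these facts in action; the sample string is arbitrary.

```javascript
const buf = Buffer.from('h\u00e9llo', 'utf8'); // "héllo"

console.log(buf instanceof Uint8Array);       // true
console.log('h\u00e9llo'.length, buf.length); // 5 6  (é takes two bytes in UTF-8)

// Buffers cannot grow; "appending" means allocating a new, larger Buffer.
const longer = Buffer.concat([buf, Buffer.from('!')]);
console.log(longer.length); // 7

// subarray() returns a view over the same memory, not a copy.
const view = buf.subarray(0, 1);
view[0] = 0x48; // ASCII 'H'
console.log(buf.toString('utf8')); // Héllo
```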

Security note: prefer Buffer.alloc over Buffer.allocUnsafe. allocUnsafe skips zero-filling for speed and can expose leftover memory contents if you read before fully writing it.
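
A minimal sketch of the safe pattern, assuming you do want allocUnsafe for speed: overwrite every byte before the Buffer is read or sent anywhere.

```javascript
const safe = Buffer.alloc(8);            // always zero-filled
console.log(safe.every((b) => b === 0)); // true

const fast = Buffer.allocUnsafe(8);      // contents are unspecified at this point
fast.fill(0);                            // overwrite before exposing it
console.log(fast.equals(safe));          // true
```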

When you use Buffers directly: implementing binary protocols, hashing/encrypting bytes, manipulating image data, or reading a file's raw bytes. For text you usually just specify an encoding and work with strings.
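
As an illustration of the binary-protocol case, here is a sketch of a made-up wire format: a 4-byte big-endian length followed by a UTF-8 payload. The format and the names encodeFrame and decodeFrame are assumptions for this example, not part of any real protocol.

```javascript
// Hypothetical wire format: 4-byte big-endian length, then a UTF-8 payload.
function encodeFrame(text) {
  const payload = Buffer.from(text, 'utf8');
  const frame = Buffer.alloc(4 + payload.length);
  frame.writeUInt32BE(payload.length, 0); // header: payload length in bytes
  payload.copy(frame, 4);                 // body: payload right after the header
  return frame;
}

function decodeFrame(frame) {
  const length = frame.readUInt32BE(0);
  return frame.subarray(4, 4 + length).toString('utf8');
}

const frame = encodeFrame('hi');
console.log(frame.toString('hex')); // 000000026869
console.log(decodeFrame(frame));    // hi
```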

Answer: A Readable stream operates in one of two modes governing how data moves.

Paused mode (default): you explicitly pull data:

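readable.on('readable', () => {
  let chunk;
  // read() returns null once the internal buffer is empty
  while ((chunk = readable.read()) !== null) {
    handleChunk(chunk); // handleChunk is a placeholder for your own logic
  }
});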

Flowing mode: data is pushed to you as fast as it arrives. You enter it by:

attaching a 'data' listener,
calling .pipe(), or
calling .resume().

readable.on('data', chunk => handleChunk(chunk)); // now flowing; handleChunk is a placeholder

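Switching: adding a 'data' handler, pipe(), or resume() → flowing; pause() → paused (with pipe destinations attached, use unpipe() instead). Removing a 'data' handler does not pause the stream by itself, so chunks can be lost while nothing is listening.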
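
The current mode is visible through the readableFlowing property (null, true, or false). The sketch below records it after each step; the source data is arbitrary.

```javascript
const { Readable } = require('stream');

const readable = Readable.from(['a', 'b', 'c']);
const states = [];

states.push(readable.readableFlowing); // null: no consumer attached yet

const onData = () => {};
readable.on('data', onData);
states.push(readable.readableFlowing); // true: a 'data' listener starts the flow

readable.off('data', onData);
states.push(readable.readableFlowing); // still true: removing the listener does not pause

readable.pause();
states.push(readable.readableFlowing); // false: explicitly paused

readable.destroy(); // tidy up; this sketch never consumes the data
console.log(states); // [ null, true, true, false ]
```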

Modern, preferred approach — async iteration:

async function readAll(readable) {
  let total = 0;
  for await (const chunk of readable) {   // handles flow + backpressure
    total += chunk.length;
  }
  return total;
}

for await...of is the cleanest way to consume a stream: it respects backpressure, propagates errors as exceptions (usable with try/catch), and reads until the stream ends.
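
The error-propagation part can be sketched as follows. flakySource and consume are hypothetical names; the point is only that a stream error arrives in the catch block.

```javascript
const { Readable } = require('stream');

// Hypothetical source that fails partway through.
async function* flakySource() {
  yield 'chunk-1';
  yield 'chunk-2';
  throw new Error('connection reset');
}

async function consume(readable) {
  const seen = [];
  try {
    for await (const chunk of readable) {
      seen.push(chunk);
    }
  } catch (err) {
    // A stream 'error' becomes an ordinary exception here
    return { seen, error: err.message };
  }
  return { seen, error: null };
}
```

consume(Readable.from(flakySource())) resolves with error set to 'connection reset'. Chunks still sitting in the stream's buffer when the error occurs may be discarded, so do not rely on seen containing everything produced before the failure.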

Gotcha: In flowing mode, if you attach a 'data' listener but the consumer is slow and you don't manage backpressure, memory can grow. pipe/pipeline/for await avoid this; a bare 'data' loop does not.

Answer:

Buffering (read it all at once):

const data = await fs.promises.readFile('report.csv'); // whole file in memory

Simple and fine for small, bounded data. But memory usage = file size × concurrent requests, so a 1 GB file (or many medium ones) can OOM the process.

Streaming (process chunk by chunk):

// csvParser() and transformRows() are placeholders for Transform streams you supply
fs.createReadStream('report.csv')
  .pipe(csvParser())
  .pipe(transformRows())
  .pipe(res); // send to the client as you go

Use streams when:

Large or unknown-size data — big files, uploads/downloads, DB exports, log processing.
Constant memory matters — memory stays around the buffer size regardless of total volume.
Lower latency / TTFB — you can start sending/processing before all input is read (e.g., piping a file straight to an HTTP response).
Composable pipelines — chain read → decompress → parse → transform → write.

Stick with buffering when:

Data is small and you need the whole thing at once (e.g., parse a small JSON config).
The processing genuinely requires random access to all the data.

Real-world example: a large file download should stream createReadStream(...) into res (ideally through pipeline, so a dropped connection also closes the file) rather than call readFile and then res.send. Otherwise every concurrent download holds a full copy of the file in memory.
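
A minimal sketch of such a download handler using only Node's built-in http module. createDownloadServer is a hypothetical name, and a real handler would also validate the path and set Content-Length or a 404 status as appropriate.

```javascript
const fs = require('fs');
const http = require('http');
const { pipeline } = require('stream');

// Hypothetical factory: returns a server that streams `filePath` to every client.
function createDownloadServer(filePath) {
  return http.createServer((req, res) => {
    res.setHeader('Content-Type', 'application/octet-stream');
    // pipeline applies backpressure per client and destroys the file stream
    // if the client disconnects or the read fails.
    pipeline(fs.createReadStream(filePath), res, (err) => {
      if (err) console.error('download aborted:', err.message);
    });
  });
}
```

Usage would look like createDownloadServer('big-file.bin').listen(3000), where the file name and port are placeholders. Each connection buffers only a few chunks at a time, so memory use does not scale with file size.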