When should you use threading vs multiprocessing vs asyncio?

Detailed Answer

The decision framework

| Workload | Best tool | Why |
| --- | --- | --- |
| CPU-bound (heavy computation) | `multiprocessing` | Separate processes each have their own GIL, so work runs in parallel on multiple cores |
| I/O-bound, moderate concurrency (10s-100s of tasks) | `threading` | Simple to retrofit onto existing sync code; the GIL is released during blocking I/O |
| I/O-bound, very high concurrency (1000s of connections) | `asyncio` | One thread and no per-task OS thread overhead, so it scales to far more concurrent tasks |
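
The table above can be read as a small decision function. This is only a sketch of the rule of thumb: `pick_tool` and its parameters are hypothetical names, and the 1000-task cut-off is an assumption taken from the "1000s of connections" wording, not a hard limit.

```python
def pick_tool(cpu_bound: bool, concurrent_tasks: int) -> str:
    """Return the module the decision table would suggest (rule of thumb only)."""
    if cpu_bound:
        # Heavy computation: separate processes sidestep the GIL.
        return "multiprocessing"
    # Assumed cut-off, based on the table's "1000s of connections" row.
    if concurrent_tasks >= 1000:
        return "asyncio"
    # Moderate I/O concurrency: threads are the simplest retrofit.
    return "threading"


print(pick_tool(cpu_bound=True, concurrent_tasks=4))       # multiprocessing
print(pick_tool(cpu_bound=False, concurrent_tasks=50))     # threading
print(pick_tool(cpu_bound=False, concurrent_tasks=5000))   # asyncio
```

In practice the boundary between threading and asyncio is fuzzy; the existing library stack often matters more than the exact task count.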

Threading: easiest retrofit for I/O-bound code

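from concurrent.futures import ThreadPoolExecutor
import requests

def fetch(url):
    # A timeout keeps one stalled server from tying up a worker thread forever.
    return requests.get(url, timeout=10).status_code

# `urls` is assumed to be a list of URL strings defined elsewhere.
with ThreadPoolExecutor(max_workers=10) as pool:
    # pool.map returns results in the same order as `urls`.
    results = list(pool.map(fetch, urls))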

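Existing synchronous libraries (like requests) work unmodified inside threads, so there is no need to rewrite calls as async/await. Downside: each thread is a real OS thread with its own stack (typically hundreds of kilobytes to a few megabytes of reserved address space, depending on the platform) and is scheduled by the kernel, so this doesn't scale gracefully to tens of thousands of concurrent tasks.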
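
To see why threads help with I/O-bound work, here is a minimal sketch that needs no network access. It assumes `time.sleep` is a fair stand-in for a blocking request (like real blocking I/O, it releases the GIL while waiting); `fake_io` is a hypothetical helper, not part of any library.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id):
    # Stand-in for a blocking network call; the GIL is released while sleeping.
    time.sleep(0.2)
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    # Eight waits overlap because each one runs in its own worker thread.
    results = list(pool.map(fake_io, range(8)))
elapsed = time.perf_counter() - start

print(results)             # results come back in submission order
print(f"{elapsed:.2f}s")   # expected to be near 0.2s, not 8 * 0.2 = 1.6s
```

The wall-clock time should stay close to a single wait because the threads spend almost all of their time blocked rather than executing Python code.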

Multiprocessing: real parallelism for CPU-bound work

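from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Pure-Python arithmetic: holds the GIL for as long as it runs.
    return sum(i * i for i in range(n))

# The guard is required where workers are started with "spawn" (Windows, macOS),
# because each worker re-imports this module.
if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:   # worker count defaults to the number of CPUs
        results = list(pool.map(cpu_heavy, [10**7] * 4))   # up to 4 tasks run in parallel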

Each process has its own interpreter and GIL, so cpu_heavy genuinely runs in parallel across cores (as many as the machine has available). The cost is process startup overhead plus serialization: arguments and return values are pickled to cross the process boundary, and nothing is shared in memory by default.
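
The pickling step can be illustrated without starting any processes. The sketch below is an approximation: `round_trip` is a hypothetical helper that mimics what happens to an argument on its way to a worker, and it shows both that the worker would receive a copy and that some objects (such as lambdas) cannot be sent at all.

```python
import pickle

def round_trip(obj):
    """Mimic sending an object to a worker: serialize, then deserialize."""
    return pickle.loads(pickle.dumps(obj))

original = {"values": [1, 2, 3]}
worker_copy = round_trip(original)
worker_copy["values"].append(4)   # the "worker" mutates its own copy

print(original["values"])      # [1, 2, 3]  (unchanged: nothing is shared)
print(worker_copy["values"])   # [1, 2, 3, 4]

def can_cross_boundary(obj):
    # Objects that fail to pickle cannot be passed to a worker process.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

print(can_cross_boundary([1, 2, 3]))       # True
print(can_cross_boundary(lambda x: x))     # False
```

This is why functions submitted to a ProcessPoolExecutor are normally defined at module level, and why large arguments can make the serialization cost outweigh the parallel speedup.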

Asyncio: massive I/O concurrency, single thread

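import asyncio
import aiohttp

async def fetch(session, url):
    # `async with` releases the connection back to the pool when done.
    async with session.get(url) as resp:
        return resp.status

async def main(urls):
    # One shared session reuses connections across all requests.
    async with aiohttp.ClientSession() as session:
        # gather runs every fetch concurrently and preserves input order.
        return await asyncio.gather(*(fetch(session, u) for u in urls))

# `urls` is assumed to be a list of URL strings defined elsewhere.
# Thousands of requests can be pending at once; aiohttp's connector caps how
# many connections are open simultaneously (100 by default).
results = asyncio.run(main(urls))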

A single thread cooperatively switches between thousands of pending coroutines whenever one is waiting on I/O. There is no OS thread per task, so memory and scheduling overhead per concurrent task is far lower than with threading. The catch: it requires an async-compatible library stack (aiohttp instead of requests, asyncpg instead of a blocking DB driver). A blocking call made directly inside a coroutine stalls the event loop, and with it every other task, until that call returns.
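
The effect of a blocking call on the event loop is easy to measure. The following sketch uses sleeps as stand-ins for I/O; `polite_wait`, `blocking_wait`, and `timed` are hypothetical names chosen for illustration.

```python
import asyncio
import time

async def polite_wait():
    await asyncio.sleep(0.2)   # yields control to the event loop while waiting

async def blocking_wait():
    time.sleep(0.2)            # blocks the only thread; no other task can run

async def timed(make_coro, n=3):
    # Run n copies concurrently and report the total wall-clock time.
    start = time.perf_counter()
    await asyncio.gather(*(make_coro() for _ in range(n)))
    return time.perf_counter() - start

cooperative_elapsed = asyncio.run(timed(polite_wait))
blocking_elapsed = asyncio.run(timed(blocking_wait))

print(f"asyncio.sleep: {cooperative_elapsed:.2f}s")   # waits overlap
print(f"time.sleep:    {blocking_elapsed:.2f}s")      # waits run back to back
```

With three tasks, the cooperative version should finish in roughly the time of one wait, while the blocking version takes about three times as long. When a blocking call cannot be avoided, `asyncio.to_thread` (Python 3.9+) can move it onto a worker thread so the loop stays responsive.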

Combining them

It's common to combine approaches: use asyncio for I/O concurrency, and delegate genuinely CPU-bound chunks of work to a ProcessPoolExecutor via loop.run_in_executor(...) so they don't block the event loop.
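
A minimal sketch of that combination follows. It assumes a toy workload: `cpu_heavy` mirrors the earlier example, while `io_task` and `main` are hypothetical names, and `asyncio.sleep` stands in for a real network call.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Pure-Python computation; calling it directly in a coroutine would stall the loop.
    return sum(i * i for i in range(n))

async def io_task(delay):
    # Stand-in for a network call; asyncio.sleep yields to the event loop.
    await asyncio.sleep(delay)
    return f"io done after {delay}s"

async def main(pool):
    loop = asyncio.get_running_loop()
    # run_in_executor returns an awaitable future; the loop stays free meanwhile.
    cpu_future = loop.run_in_executor(pool, cpu_heavy, 10**6)
    io_result, cpu_result = await asyncio.gather(io_task(0.1), cpu_future)
    return io_result, cpu_result

if __name__ == "__main__":
    # The guard matters because worker processes may re-import this module.
    with ProcessPoolExecutor() as pool:
        print(asyncio.run(main(pool)))
```

Passing the executor into `main` keeps the coroutine independent of the pool type, so a ThreadPoolExecutor can be substituted when the offloaded work is blocking I/O rather than computation.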

Interview-ready summary: Pick multiprocessing for CPU-bound parallelism (under the GIL, threads cannot execute pure-Python bytecode on more than one core at a time), threading for moderate I/O concurrency with minimal code changes, and asyncio when you need very high I/O concurrency and are willing to adopt an async library stack throughout.