What is the GIL (Global Interpreter Lock), and why does it exist?
Quick Answer
The GIL is a mutex in CPython that allows only **one thread to execute Python bytecode at a time**, even on a multi-core machine. It exists because CPython's memory management (reference counting) isn't thread-safe by default, and the GIL was the simplest way to make the interpreter thread-safe without requiring fine-grained locking on every object. The practical consequence: pure-Python CPU-bound code doesn't get faster with more threads; I/O-bound code still benefits because the GIL is released during blocking I/O.
Detailed Answer
What the GIL actually locks
CPython's memory management relies on reference counting: every object tracks how many references point to it, and is freed when that count hits zero. Incrementing/decrementing a refcount from multiple threads simultaneously, without synchronization, is a data race that could corrupt an object's refcount (leading to premature frees or memory leaks). The GIL solves this crudely but effectively: only one thread runs Python bytecode at a time, so refcount updates are never actually concurrent.
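To make the refcount idea concrete, here is a small sketch using sys.getrefcount. The helper name refcount_delta is hypothetical, and the absolute numbers it sees are CPython implementation details; only the change in the count is meaningful.

```python
import sys

def refcount_delta():
    """Return how much an object's refcount grows when one new reference is made."""
    obj = object()
    before = sys.getrefcount(obj)  # absolute value includes temporary references
    alias = obj                    # binding a second name adds one reference
    after = sys.getrefcount(obj)
    del alias                      # dropping the name removes that reference again
    return after - before

if __name__ == "__main__":
    print(refcount_delta())
```

Every such increment and decrement is a plain read-modify-write on a field inside the object, which is why unsynchronized updates from several threads would be unsafe.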
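import threading

counter = 0

def increment():
    global counter
    for _ in range(1_000_000):
        counter += 1  # read-modify-write: not a single atomic step

threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # may print 4,000,000, but this is not guaranteed: the GIL does not make += atomic
The GIL protects the interpreter's internals (such as refcounts), not your program's invariants. counter += 1 compiles to a separate load, add, and store, and a thread switch between those steps can lose an update, so shared mutable state still needs a threading.Lock. Without the GIL (or equivalent fine-grained locking), the same code would additionally risk corrupting the interpreter's own state.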
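Whether or not a GIL is present, a compound update such as counter += 1 is only guaranteed correct when it is explicitly synchronized. A minimal sketch using threading.Lock follows; the names counter_lock and run are hypothetical and exist only for this illustration.

```python
import threading

counter = 0
counter_lock = threading.Lock()  # guards every update to counter

def increment(times):
    global counter
    for _ in range(times):
        with counter_lock:       # only one thread at a time runs the update
            counter += 1

def run(num_threads, times):
    """Reset the counter, run the workers to completion, and return the total."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=increment, args=(times,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(run(4, 100_000))
```

The lock makes the result deterministic on both the standard and the free-threaded build, at the cost of some overhead per update.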
Why "more threads" doesn't mean "more CPU throughput"
def cpu_bound(n):
    return sum(i * i for i in range(n))

# Running cpu_bound() on 4 threads doesn't run 4x faster:
# because of the GIL, only one thread executes Python bytecode at any instant.
For CPU-bound pure-Python work, threads provide concurrency (multiple things making progress, interleaved) but not parallelism (multiple things running simultaneously on separate cores) — the GIL serializes bytecode execution regardless of how many OS threads and CPU cores exist.
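One way to observe this is to time the same CPU-bound work sequentially and on a thread pool. This is a rough sketch, not a benchmark: the helper names time_sequential and time_threaded are hypothetical, and actual timings depend on the machine and Python build.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def cpu_bound(n):
    return sum(i * i for i in range(n))

def time_sequential(n, repeats):
    """Run cpu_bound(n) `repeats` times in a row; return (results, seconds)."""
    start = time.perf_counter()
    results = [cpu_bound(n) for _ in range(repeats)]
    return results, time.perf_counter() - start

def time_threaded(n, repeats):
    """Run cpu_bound(n) on `repeats` threads; return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=repeats) as pool:
        results = list(pool.map(cpu_bound, [n] * repeats))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    _, seq_time = time_sequential(2_000_000, 4)
    _, thr_time = time_threaded(2_000_000, 4)
    print(f"sequential: {seq_time:.2f}s  threaded: {thr_time:.2f}s")
```

On a standard GIL-enabled build the two timings are typically close to each other, since the threads take turns rather than running in parallel.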
Why threading still helps for I/O-bound work
import time

def slow_io():
    time.sleep(1)  # releases the GIL while "blocked"
Blocking operations that call into C (file/network I/O, time.sleep,
many library calls) release the GIL while waiting, letting other
Python threads run bytecode in the meantime. This is why
threading/concurrent.futures.ThreadPoolExecutor genuinely speed up
I/O-bound workloads (e.g., many concurrent HTTP requests) even though the
GIL exists — the bottleneck (waiting on the network) isn't CPU work at all.
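As a sketch of the effect, the example below uses time.sleep as a stand-in for a blocking network call and runs several of them on a ThreadPoolExecutor. The function run_threaded and the delay values are assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_io(delay):
    time.sleep(delay)  # stand-in for blocking I/O; the GIL is released while sleeping
    return delay

def run_threaded(delays):
    """Run one slow_io call per thread; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(slow_io, delays))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    results, elapsed = run_threaded([0.5] * 4)
    # Four half-second waits overlap, so the total stays well under 2 seconds.
    print(f"{len(results)} calls in {elapsed:.2f}s")
```

Because the waits overlap, the elapsed time tracks the longest single wait rather than the sum of all waits.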
The real workaround for CPU-bound parallelism: separate processes
Each Python process runs its own interpreter with its own GIL, so
multiprocessing sidesteps the lock by running separate Python processes.
This achieves true multi-core parallelism for CPU-bound work, at the cost
of inter-process communication overhead: data must be pickled and copied
between processes, not shared directly.
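A minimal sketch of this approach with multiprocessing.Pool is shown below. The helper run_in_processes and the worker count of 4 are assumptions for illustration; the main guard matters because some platforms start workers by re-importing the script.

```python
import multiprocessing

def cpu_bound(n):
    return sum(i * i for i in range(n))

def run_in_processes(inputs, workers=4):
    """Compute cpu_bound for each input in a pool of worker processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        # Arguments and results are pickled to cross the process boundary.
        return pool.map(cpu_bound, inputs)

if __name__ == "__main__":
    print(run_in_processes([2_000_000] * 4))
```

Each worker has its own interpreter and GIL, so the four computations can occupy four cores at once. For small inputs the pickling and process start-up cost can outweigh the gain.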
PEP 703: free-threaded (no-GIL) Python
Starting with Python 3.13, an experimental free-threaded build
(python3.13t) removes the GIL, using more fine-grained locking instead —
aiming to give real multi-core parallelism to threaded Python code. As of
this writing it's still opt-in and the ecosystem (C extensions especially)
is still adapting; the standard GIL-enabled build remains the default.
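To check which kind of build a script is running on, something like the following sketch can be used. It relies on the Py_GIL_DISABLED build variable and on sys._is_gil_enabled(), which is available from Python 3.13; the helper name gil_status is hypothetical, and on older versions the code simply assumes the GIL is on.

```python
import sys
import sysconfig

def gil_status():
    """Return (free_threaded_build, gil_enabled) for the running interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in 3.13; fall back to True before that.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(check()) if check is not None else True
    return free_threaded_build, gil_enabled

if __name__ == "__main__":
    built, enabled = gil_status()
    print(f"free-threaded build: {built}, GIL currently enabled: {enabled}")
```

The two values can differ: a free-threaded build may still run with the GIL enabled, for example when an imported extension module requires it.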
Interview-ready summary: The GIL is CPython's mutex ensuring only one
thread executes Python bytecode at a time, needed because refcount-based
memory management isn't otherwise thread-safe. It doesn't prevent
threading from helping I/O-bound work (the GIL is released during
blocking calls), but it does prevent threads from speeding up CPU-bound
pure-Python code — for that, use multiprocessing, or Python 3.13+'s
experimental free-threaded build.