What are the security risks of `pickle`, `eval`, and `exec`, and how do you avoid them?
Quick Answer
`pickle.loads()` on untrusted data can execute **arbitrary code** during deserialization (a pickled object's `__reduce__` can call any function), making it as dangerous as `eval()` on untrusted input — never unpickle data from a source you don't fully trust. `eval`/`exec` run arbitrary Python source directly and should essentially never be used on user-supplied input; use `json`/`ast.literal_eval` for safe parsing, and safer serialization formats (JSON, `msgpack`, protobuf) for data exchange across a trust boundary.
Detailed Answer
Why unpickling untrusted data is a full remote-code-execution risk
```python
import os
import pickle


class Exploit:
    def __reduce__(self):
        return (os.system, ("echo pwned; rm -rf /tmp/demo",))


payload = pickle.dumps(Exploit())

# Anywhere this runs on untrusted input:
pickle.loads(payload)  # actually executes os.system(...) during unpickling!
```
`__reduce__` is a legitimate protocol that `pickle` uses to learn how to
reconstruct an object, but it can name any importable callable, and
unpickling calls that callable with the supplied arguments.
`pickle.loads()` has no safe mode or built-in sandbox against a
maliciously crafted payload; the official docs warn that the module is
not secure and that you should only unpickle data you trust. This is
why cache backends, message queues, and APIs that use `pickle` for
convenience are a known attack surface whenever external input can
reach them.
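A harmless way to observe this mechanism, as a minimal sketch: the hypothetical `Demo` class below names the built-in `print` instead of a shell command, and `pickletools.genops` decodes the payload's opcodes without executing them. Protocol 4 is pinned only so the opcode layout is predictable; the variable names are illustrative.

```python
import pickle
import pickletools


class Demo:
    """Hypothetical stand-in for an attacker's class; uses a harmless callable."""

    def __reduce__(self):
        # pickle records "call print with this argument" instead of object state.
        return (print, ("side effect during unpickling",))


payload = pickle.dumps(Demo(), protocol=4)

# Inspect the payload WITHOUT executing it: genops only decodes opcodes.
opcodes = [op.name for op, arg, pos in pickletools.genops(payload)]
strings = [arg for op, arg, pos in pickletools.genops(payload) if isinstance(arg, str)]

# Loading runs the recorded call. The result is whatever the callable
# returned (None for print), not a Demo instance.
result = pickle.loads(payload)
```

The payload carries the module and function name (`builtins`, `print`) plus a `REDUCE` opcode that tells the unpickler to call it. Swapping `print` for `os.system` changes nothing about how the load proceeds, which is why inspecting or filtering payloads is at best defense in depth and not a substitute for trusting the source.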
eval()/exec(): running arbitrary source directly
```python
# DO NOT RUN: illustrates an attacker-controlled string reaching eval().
user_input = "__import__('os').system('rm -rf /')"
eval(user_input)  # executes it: catastrophic if user_input is attacker-controlled
```
`eval` (expressions) and `exec` (statements) execute Python source text
directly, with the full power of the language. Passing any
externally influenced string to either one effectively gives the author
of that input full code execution in your process.
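The expression/statement split, and the reach of an evaluated string, can be seen with harmless inputs. This is a small sketch; `math.sqrt` is only a benign stand-in for whatever an attacker would call, and the variable names are illustrative.

```python
# eval() evaluates a single expression and returns its value.
total = eval("1 + 2")

# exec() runs statements; it returns None, and results land in the namespace.
namespace = {}
exec("x = 5\ny = x * 2", namespace)

# A statement is not an expression, so eval() rejects it at compile time.
try:
    eval("x = 5")
    statement_rejected = False
except SyntaxError:
    statement_rejected = True

# The string is ordinary Python, so it can import modules and call anything.
imported_result = eval("__import__('math').sqrt(16)")
```

Nothing in the last line is special to `math`: the same string shape works for `os`, `subprocess`, or any other importable module, which is what makes untrusted input to `eval`/`exec` a code-execution risk.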
The safe alternatives
```python
import ast
import json

# Safe: parsing structured data
user_json_string = '{"user_id": 1, "action": "login"}'  # stand-in for external input
data = json.loads(user_json_string)  # only produces JSON-compatible values

# Parsing a Python LITERAL (not arbitrary code)
value = ast.literal_eval("[1, 2, {'a': True}]")  # only literals: no function calls, no imports

try:
    ast.literal_eval("__import__('os').system('x')")
except ValueError:
    print("rejected: not a literal")  # a function call is not a literal
```
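`json.loads` only ever produces plain data (dicts, lists, strings,
numbers, booleans, `None`); with its default hooks it cannot execute
anything.
`ast.literal_eval` is a restricted alternative to `eval` that never
executes code: it parses only Python literals (numbers, strings, bytes,
tuples, lists, dicts, sets, booleans, `None`) and rejects function
calls, attribute access, and variable names. It is not fully hardened,
though: very large or deeply nested input can still exhaust memory or
the stack, so cap the input size before parsing untrusted text.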
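One way to apply `ast.literal_eval` to external text is to wrap it with a size cap and a single error type. The sketch below is illustrative only: `parse_literal` and `MAX_LITERAL_CHARS` are hypothetical names, and the 10,000-character limit is an arbitrary assumption to tune for your own inputs.

```python
import ast

MAX_LITERAL_CHARS = 10_000  # hypothetical cap; choose a limit that fits your data


def parse_literal(text: str, max_chars: int = MAX_LITERAL_CHARS):
    """Parse a Python literal from untrusted text, or raise ValueError."""
    if len(text) > max_chars:
        # Reject oversized input before the parser spends memory on it.
        raise ValueError("input too large")
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError) as exc:
        # Collapse the parser's failure modes into one error for callers.
        raise ValueError(f"not a valid Python literal: {exc}") from exc


parsed = parse_literal("[1, 2, {'a': True}]")
```

Callers then handle one exception type, and anything that is not a plain literal (a call, an attribute lookup, a bare name) is rejected without being executed.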
For serialization across a trust boundary, avoid pickle entirely
```python
# Instead of pickling to send data between services / store in a shared cache:
import json

data = json.dumps({"user_id": 1, "action": "login"})

# For richer/faster binary serialization with the same "no code execution" safety:
# msgpack, protobuf, or a schema-validated format (pydantic models -> JSON)
```
`pickle` is appropriate only for trusted data that you produce and
consume yourself (e.g., caching your own computed Python objects to
local disk). Never use it for data crossing a trust boundary: anything
received from a network request, a third-party queue, user uploads, or
any other source you don't control end to end.
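For typed objects, the usual replacement for pickling is an explicit encode/decode pair that validates the shape on the way in. The following is a stdlib-only sketch under assumptions: `LoginEvent`, `encode_event`, and `decode_event` are hypothetical names, with fields mirroring the JSON example above. A schema library such as pydantic could play the same role as the hand-written checks.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LoginEvent:
    """Hypothetical message type used only for illustration."""
    user_id: int
    action: str


def encode_event(event: LoginEvent) -> str:
    # Only plain data goes on the wire; no class or callable references.
    return json.dumps(asdict(event))


def decode_event(raw: str) -> LoginEvent:
    obj = json.loads(raw)  # bad syntax raises json.JSONDecodeError (a ValueError)
    if not isinstance(obj, dict) or set(obj) != {"user_id", "action"}:
        raise ValueError("unexpected message shape")
    user_id, action = obj["user_id"], obj["action"]
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        raise ValueError("user_id must be an integer")
    if not isinstance(action, str):
        raise ValueError("action must be a string")
    return LoginEvent(user_id=user_id, action=action)


wire = encode_event(LoginEvent(user_id=1, action="login"))
event = decode_event(wire)
```

The receiver decides which type to construct and which fields to accept. With `pickle`, the payload makes those decisions, which is the core of the risk.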
Interview-ready summary: `pickle.loads()` on untrusted data is
equivalent to arbitrary code execution, because a crafted payload's
`__reduce__` can invoke any callable during deserialization, so never
unpickle data you don't fully trust. `eval`/`exec` on any
externally influenced string is the same class of risk. Use `json` or
`ast.literal_eval` for safe parsing, and JSON/msgpack/protobuf instead of
`pickle` for any data that crosses a trust boundary.