What is a SecurityContext, and what does it control?
Quick Answer
A SecurityContext configures the Linux-level security settings a container runs with: whether it runs as root or as a specific non-root user ID, whether it can escalate privileges, which Linux capabilities it holds, and whether its root filesystem is read-only. It can be set at the Pod level, where it applies to all of the Pod's containers, and per container, where a field set at both levels takes the container's value. These settings are a core part of hardening containers against exploitation, because a container running as root with unnecessary privileges gives an attacker who compromises it a much larger blast radius.
Detailed Answer
A hardened example
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  securityContext:                  # Pod-level: applies to all containers by default
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
  containers:
    - name: app
      image: myapp:1.0
      securityContext:              # container-level: can override the Pod-level settings
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL
          add:
            - NET_BIND_SERVICE      # only add back the specific capability actually needed
Key settings and what each one hardens against
runAsNonRoot / runAsUser: many container images default to running as root (UID 0) unless told otherwise. runAsUser sets the UID the container process runs as, and runAsNonRoot: true makes the kubelet refuse to start a container that would run as UID 0. With a non-root UID, an attacker who achieves code execution inside the container does not automatically hold root privileges there, which limits what they can tamper with. Container root is still constrained by namespaces, a reduced capability set, and other isolation, so it is not equivalent to unrestricted host root; this is defense in depth, not the only layer.

allowPrivilegeEscalation: false: prevents a process from gaining more privileges than its parent process had (it sets the no_new_privs flag on the container process), so setuid binaries, among other things, cannot escalate privilege inside the container. This closes off a specific class of privilege-escalation technique.

readOnlyRootFilesystem: true: makes the container's root filesystem immutable at runtime. An attacker who achieves code execution can't write a persistent backdoor or modify application binaries on disk. The application must then explicitly mount a writable volume (like emptyDir) for any directory it legitimately needs to write to, such as temp files or caches.

capabilities: drop: [ALL], then selectively add: Linux capabilities are fine-grained permissions that break up what used to be the monolithic "root" privilege (e.g., NET_BIND_SERVICE for binding to ports below 1024, SYS_ADMIN for a wide range of administrative operations). Dropping all capabilities and adding back only the ones a container genuinely needs applies least privilege at the kernel-capability level. Most containers need few or none of the capabilities in the already-reduced default set that the container runtime grants.

fsGroup: a Pod-level setting. For volume types that support ownership management, Kubernetes sets the group ownership of the volume's contents to this GID and adds the GID to the container processes' supplementary groups, so a non-root user can still write to volume-backed storage without running as root.
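The point about readOnlyRootFilesystem needing an explicit writable volume can be sketched as follows. This is a minimal illustration, not a complete manifest: the Pod name, the volume name tmp, and the assumption that the application only writes to /tmp are all hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-readonly              # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000                 # group applied to volumes that support ownership management
  containers:
    - name: app
      image: myapp:1.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true   # everything outside the mounts below is immutable
        capabilities:
          drop:
            - ALL
      volumeMounts:
        - name: tmp               # assumption: the app only needs to write under /tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir: {}                # scratch space, deleted when the Pod is removed
```

If the application also writes to a cache or log directory, each such path would need its own mount; otherwise writes there fail with a read-only filesystem error.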
Why this matters: limiting the blast radius of a compromised container
Container isolation (namespaces, cgroups) already provides real separation from the host, but it is not an absolute security boundary: container escape vulnerabilities do periodically get discovered. A poorly hardened container (running as root, with unnecessary Linux capabilities, a writable root filesystem, and unrestricted privilege escalation) gives an attacker who achieves code execution inside it a much larger set of tools to work with than a properly hardened one. SecurityContext settings are the mechanism for closing off privileges a container was never going to legitimately need.
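For contrast, here is a sketch of the kind of poorly hardened spec described above. It is an anti-example for illustration only and should not be deployed; the Pod name is hypothetical, and each field is annotated with what it leaves open to an attacker.

```yaml
# Anti-example: illustrates what NOT to do.
apiVersion: v1
kind: Pod
metadata:
  name: app-unhardened            # hypothetical name
spec:
  containers:
    - name: app
      image: myapp:1.0
      securityContext:
        runAsUser: 0                     # runs as root inside the container
        allowPrivilegeEscalation: true   # setuid binaries can gain extra privileges
        readOnlyRootFilesystem: false    # binaries and config on disk can be modified
        capabilities:
          add:
            - SYS_ADMIN                  # broad administrative capability, rarely needed
```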
Enforcing this cluster-wide, not just per-Pod
Rather than relying on every team to remember to set these fields correctly in every Pod spec, clusters commonly enforce baseline SecurityContext requirements per namespace (or cluster-wide) via Pod Security Admission (see that question). Pods that don't meet a minimum security bar are rejected or flagged, rather than each manifest being trusted to have gotten it right voluntarily. For example, the restricted Pod Security Standard requires runAsNonRoot: true, allowPrivilegeEscalation: false, and dropping ALL capabilities (only NET_BIND_SERVICE may be added back), plus a seccomp profile of RuntimeDefault or Localhost, which the example above does not set. It does not require readOnlyRootFilesystem.
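A minimal sketch of per-namespace enforcement, assuming the built-in Pod Security Admission controller is enabled in the cluster. The namespace name is hypothetical; the labels are the standard pod-security.kubernetes.io ones.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments                  # hypothetical namespace
  labels:
    # Reject Pods that violate the restricted standard.
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Also surface violations as client warnings and audit annotations.
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

One cautious rollout pattern is to start with only the warn and audit labels, review what would have been rejected, and add enforce once existing workloads comply.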