What is SQL injection, and how do you prevent it?
Quick Answer
SQL injection occurs when untrusted input is concatenated directly into a SQL string, allowing an attacker to alter the query's structure — e.g., adding `OR '1'='1'` to bypass a login check, or a `; DROP TABLE ...` to destroy data. It's prevented completely by using parameterized queries/prepared statements (never concatenating user input into SQL text), applying least-privilege database accounts, and defense-in-depth measures like input validation and ORMs that parameterize by default.
Detailed Answer
The vulnerability
# VULNERABLE
username = request.form["username"] # attacker-controlled input
query = f"SELECT * FROM users WHERE username = '{username}'"
If an attacker submits admin' -- as the username:
SELECT * FROM users WHERE username = 'admin' --'
The -- starts a SQL comment, so everything after it (including whatever password check might have followed) is ignored — this can bypass authentication entirely, returning the admin row regardless of password. A more destructive attacker payload like x'; DROP TABLE users; -- could delete data outright, depending on whether the driver/database allows statement stacking.
Why it works
The application intended username to be interpreted purely as data — a literal string value to compare against. But because it was concatenated directly into the SQL text, the database has no way to distinguish "data the developer meant as a literal value" from "SQL syntax" — the attacker's input is parsed as if the developer had typed it themselves.
The complete fix: parameterized queries / prepared statements
# SAFE
cursor.execute("SELECT * FROM users WHERE username = %s", (username,))
As covered in the prepared statements question, this sends the query structure and the parameter value as two separate, distinct channels — the database never re-parses the parameter value as SQL syntax, so there is no possible way for it to alter the query's structure, no matter what characters it contains. This isn't "better escaping" — it's a structurally different mechanism that eliminates the vulnerability class entirely for parameterized values.
Defense in depth (beyond the primary fix)
- Least privilege database accounts — the application's database user should have only the permissions it actually needs (e.g., no
DROP TABLEpermission for a web application's runtime account), limiting the blast radius even if some other injection vector is somehow missed. - Input validation — rejecting obviously malformed input early is good practice, but is not a substitute for parameterization; validation can be bypassed or incomplete, while parameterization is complete by construction.
- ORMs — most modern ORMs parameterize queries by default, which is a strong reason to prefer them (or a query builder) over hand-written string-concatenated SQL, though it's still possible to introduce injection through an ORM's "raw query" escape hatches if used carelessly.
- WAFs (Web Application Firewalls) — can catch some known injection patterns as an additional layer, but are pattern-matching heuristics, not a reliable primary defense.
What parameterization doesn't cover
Dynamic SQL structure (table names, column names, ORDER BY direction) chosen from user input can't be parameterized the same way — these must be validated against an explicit allow-list of known-safe values rather than ever directly interpolated, even though they "look like" simple identifiers rather than dangerous data.
A strong answer explains why parameterization works (separating code from data at the protocol level), not just "use prepared statements" as a memorized rule — and mentions least privilege as a complementary, not alternative, defense.