How do you connect to and query databases from Python?
Quick Answer
Low-level access goes through the **DB-API 2.0** standard (PEP 249) — every major database driver (`psycopg2`/`psycopg` for Postgres, `sqlite3` built-in, `pymysql`) exposes the same `connect()`/`cursor()`/`execute()`/`fetchall()` interface. Most applications instead use an **ORM** (SQLAlchemy, Django ORM) to work with Python objects instead of raw SQL, at the cost of an abstraction layer — always use **parameterized queries**, never string-formatted SQL, to avoid SQL injection.
Detailed Answer
The DB-API 2.0 standard: a common low-level interface
import sqlite3
conn = sqlite3.connect("app.db")
cursor = conn.cursor()
cursor.execute("SELECT id, name FROM users WHERE age > ?", (18,))
rows = cursor.fetchall()
conn.close()
Every DB-API-compliant driver (sqlite3 built-in, psycopg2/psycopg
for PostgreSQL, pymysql/mysqlclient for MySQL) exposes the same
shape: connect() returns a connection, .cursor() gets a cursor,
.execute(sql, params) runs a query, and .fetchall()/.fetchone()
retrieve results — learning this pattern once transfers across database
backends.
SQL injection: the critical vulnerability to avoid
# NEVER do this -- string formatting builds SQL from untrusted input
name = "'; DROP TABLE users; --"
cursor.execute(f"SELECT * FROM users WHERE name = '{name}'") # SQL INJECTION!
# ALWAYS use parameterized queries -- the driver handles escaping safely
cursor.execute("SELECT * FROM users WHERE name = ?", (name,)) # safe, regardless of content
String-interpolating user input directly into SQL lets an attacker
inject arbitrary SQL (as the classic "Bobby Tables" example shows) —
parameterized queries (? or %s placeholders, driver-dependent syntax)
send the query and its values separately to the database, which
handles escaping correctly regardless of what the value contains. This
is not optional hardening — it's the baseline requirement for any code
that builds a query using data from outside the program.
ORMs: working with objects instead of raw SQL
from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session, DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
name: Mapped[str]
age: Mapped[int]
engine = create_engine("postgresql://localhost/app")
with Session(engine) as session:
users = session.scalars(select(User).where(User.age > 18)).all()
SQLAlchemy (and Django's built-in ORM) let you query and manipulate data as Python objects/classes instead of writing raw SQL strings, and automatically parameterize values (so ORM queries are inherently safe from SQL injection for their generated queries). The tradeoff: an abstraction layer that can generate inefficient queries if misused (the classic N+1 query problem — fetching a list, then separately querying related data for each item in a loop) and a learning curve of its own.
Connection pooling for production
engine = create_engine("postgresql://localhost/app", pool_size=10, max_overflow=5)
Opening a new database connection per request is expensive; production applications use a connection pool (built into SQLAlchemy's engine, or a standalone pooler like PgBouncer for Postgres) that reuses a fixed set of open connections across requests, dramatically reducing per-request connection overhead.
Async database access
import asyncpg
async def get_users(pool):
async with pool.acquire() as conn:
return await conn.fetch("SELECT * FROM users WHERE age > $1", 18)
Standard DB-API drivers are synchronous/blocking, which would stall an
asyncio event loop — async applications (FastAPI, etc.) use async-native
drivers (asyncpg, aiomysql) or an async-compatible ORM layer (SQLAlchemy's
async engine) instead.
Interview-ready summary: DB-API 2.0 gives a consistent low-level interface across database drivers; most applications build on an ORM (SQLAlchemy, Django ORM) for productivity, understanding the tradeoff of an abstraction layer that can hide inefficient query patterns like N+1. Regardless of layer, always use parameterized queries — never string-format untrusted input into SQL — to avoid SQL injection.