What's the difference between a stored procedure and a user-defined function?
Quick Answer
A **function** must return a value (scalar or table) and, in most engines, cannot perform transaction control (`COMMIT`/`ROLLBACK`) or arbitrary side effects — it's meant to be called inside a `SELECT` like any built-in function. A **stored procedure** doesn't have to return anything (or can return multiple result sets/output parameters), can manage transactions itself, and is invoked with `CALL`/`EXEC` rather than embedded inside a query expression.
Detailed Answer
Function — used inside expressions, must return a value
CREATE FUNCTION get_full_name(first_name TEXT, last_name TEXT)
RETURNS TEXT
LANGUAGE sql
IMMUTABLE
AS $$
SELECT first_name || ' ' || last_name;
$$;
SELECT get_full_name(first_name, last_name) FROM employees; -- used directly in SELECT
Procedure — invoked standalone, not embedded in a query
CREATE PROCEDURE archive_old_orders(cutoff_date DATE)
LANGUAGE plpgsql
AS $$
BEGIN
INSERT INTO orders_archive SELECT * FROM orders WHERE order_date < cutoff_date;
DELETE FROM orders WHERE order_date < cutoff_date;
COMMIT; -- procedures can manage their own transaction in engines that allow it
END;
$$;
CALL archive_old_orders('2020-01-01'); -- can't be used inside a SELECT
Key differences
| Function | Procedure | |
|---|---|---|
| Must return a value | Yes | No (can return nothing, out parameters, or multiple result sets) |
Callable inside SELECT/expressions | Yes | No — invoked with CALL/EXEC |
Can control transactions (COMMIT/ROLLBACK) | Generally no | Yes, in engines that support it (PostgreSQL, SQL Server) |
| Typical use | Computing/transforming a value, encapsulating reusable expressions | Multi-step business operations, batch jobs, administrative tasks |
Why the transaction-control distinction matters
Because a function is meant to be composable inside an arbitrary query (potentially called many times per row, or nested inside other expressions), most engines forbid functions from committing or rolling back — doing so mid-query would make no sense, since the calling query itself is part of some outer transaction context it doesn't control. Procedures, invoked as standalone top-level statements, don't have that constraint and can legitimately manage a whole multi-step transaction internally.
Terminology varies by engine
Older MySQL versions only had "stored procedures" (no true standalone functions distinct from procedures in the same sense); PostgreSQL added genuine standalone PROCEDURE objects (separate from FUNCTION) only in version 11. Always check what your specific engine/version actually supports before assuming feature parity.