Stored Procedures, Functions, and Triggers

Difficulty
CREATE PROCEDURE transfer_funds(
    IN from_account INT,
    IN to_account INT,
    IN amount NUMERIC
)
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE accounts SET balance = balance - amount WHERE id = from_account;
    UPDATE accounts SET balance = balance + amount WHERE id = to_account;

    IF (SELECT balance FROM accounts WHERE id = from_account) < 0 THEN
        RAISE EXCEPTION 'Insufficient funds';
    END IF;
END;
$$;

CALL transfer_funds(1, 2, 100);

Advantages

  • Fewer network round-trips. Multiple statements execute as one call instead of several separate queries from the application, each paying network latency — meaningful for logic with many small, dependent steps.
  • Logic co-located with data. Business rules enforced in a procedure apply no matter which application, script, or ad-hoc tool touches the database — you can't accidentally bypass validation by writing to the table through a different code path.
  • Reduced data transfer. Complex calculations happen where the data already lives, rather than pulling large intermediate result sets to the application tier just to compute something and write it back.
  • Precompiled execution plan (in some engines) — repeated calls can reuse a cached plan rather than re-parsing/re-planning every time, though modern query engines also cache plans for parameterized ad-hoc queries, narrowing this advantage.

Disadvantages

  • Weaker tooling for versioning/testing. Application code benefits from mature version control diffing, unit testing frameworks, code review tooling, and CI pipelines; stored procedure logic often lives partially outside that ecosystem unless deliberately integrated (migration-based deployment, dedicated SQL test frameworks).
  • Vendor lock-in. Procedural SQL dialects (PL/pgSQL, T-SQL, PL/SQL) are not portable across engines — logic written for PostgreSQL doesn't run on SQL Server without a rewrite, unlike application code written in a general-purpose language.
  • Split logic, harder to reason about. A developer reading application code may not realize significant business logic actually lives in the database, making the system harder to understand holistically and harder to debug with standard application debugging tools.
  • Scaling application logic independently of the database becomes harder — application-tier logic can scale horizontally across many stateless app servers; database-tier logic is bottlenecked by the database's own compute capacity.

Most modern application architectures favor keeping business logic in the application layer, reserving stored procedures for narrow cases where their advantages are decisive: enforcing an invariant that absolutely must never be bypassed regardless of caller, or a genuinely data-intensive operation where minimizing round-trips/data transfer matters more than tooling/portability concerns.

Function — used inside expressions, must return a value

CREATE FUNCTION get_full_name(first_name TEXT, last_name TEXT)
RETURNS TEXT
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT first_name || ' ' || last_name;
$$;

SELECT get_full_name(first_name, last_name) FROM employees;   -- used directly in SELECT

Procedure — invoked standalone, not embedded in a query

CREATE PROCEDURE archive_old_orders(cutoff_date DATE)
LANGUAGE plpgsql
AS $$
BEGIN
    INSERT INTO orders_archive SELECT * FROM orders WHERE order_date < cutoff_date;
    DELETE FROM orders WHERE order_date < cutoff_date;
    COMMIT;   -- procedures can manage their own transaction in engines that allow it
END;
$$;

CALL archive_old_orders('2020-01-01');   -- can't be used inside a SELECT

Key differences

FunctionProcedure
Must return a valueYesNo (can return nothing, out parameters, or multiple result sets)
Callable inside SELECT/expressionsYesNo — invoked with CALL/EXEC
Can control transactions (COMMIT/ROLLBACK)Generally noYes, in engines that support it (PostgreSQL, SQL Server)
Typical useComputing/transforming a value, encapsulating reusable expressionsMulti-step business operations, batch jobs, administrative tasks

Why the transaction-control distinction matters

Because a function is meant to be composable inside an arbitrary query (potentially called many times per row, or nested inside other expressions), most engines forbid functions from committing or rolling back — doing so mid-query would make no sense, since the calling query itself is part of some outer transaction context it doesn't control. Procedures, invoked as standalone top-level statements, don't have that constraint and can legitimately manage a whole multi-step transaction internally.

Terminology varies by engine

Older MySQL versions only had "stored procedures" (no true standalone functions distinct from procedures in the same sense); PostgreSQL added genuine standalone PROCEDURE objects (separate from FUNCTION) only in version 11. Always check what your specific engine/version actually supports before assuming feature parity.

Anatomy of a trigger

CREATE OR REPLACE FUNCTION log_salary_change()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.salary <> OLD.salary THEN
        INSERT INTO salary_audit (employee_id, old_salary, new_salary, changed_at)
        VALUES (OLD.id, OLD.salary, NEW.salary, now());
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_log_salary_change
AFTER UPDATE ON employees
FOR EACH ROW
EXECUTE FUNCTION log_salary_change();
  • BEFORE vs AFTER: a BEFORE trigger can inspect and modify the row before it's written (e.g., auto-populating updated_at), and can even prevent the write entirely by raising an exception; an AFTER trigger runs once the change is already committed to the row and is typically used for side effects (logging, cascading updates) rather than modifying the row itself.
  • FOR EACH ROW vs statement-level: a row-level trigger fires once per affected row; a statement-level trigger fires once total, regardless of how many rows a single statement touched.
  • OLD and NEW: special row references available inside the trigger — OLD is the row's prior state (available for UPDATE/DELETE), NEW is the row's new state (available for INSERT/UPDATE).

Legitimate use cases

  • Audit logging — recording who changed what and when, guaranteed to fire regardless of which application or tool made the change.
  • Enforcing invariants too complex for a CHECK constraint — e.g., "the sum of allocations across related rows can never exceed a parent's capacity," which spans multiple rows/tables.
  • Maintaining denormalized/derived columns — keeping a cached order_count on a customers row in sync whenever orders changes, without every application code path having to remember to update it manually.
  • Enforcing referential integrity rules more complex than a standard foreign key can express (e.g., conditional referential integrity).

Common pitfalls

  • Hidden "action at a distance." A developer looking at UPDATE orders SET status = 'shipped' has no way to know, just from reading that line, that it also silently fires three triggers that update two other tables — this makes systems significantly harder to reason about and debug, especially for someone new to the codebase.
  • Performance overhead on high-throughput writes. Every trigger adds work to every matching INSERT/UPDATE/DELETE; a row-level trigger on a table with heavy write volume can become a meaningful bottleneck, especially if the trigger itself does further queries/writes.
  • Cascading and recursive triggers. A trigger on table A that writes to table B, which has its own trigger that writes back to table A, can create confusing chains of execution (or, if unguarded, infinite loops) that are very difficult to trace.
  • Silent failure surprises. If a BEFORE trigger raises an exception, the original statement fails too — this is usually desired (it enforces the invariant), but if the trigger's logic has a bug, it can cause completely unrelated-looking application writes to fail with a confusing error that doesn't obviously point back to the trigger.

Triggers are the right tool when an invariant or side effect must be enforced regardless of caller and cannot be expressed as a simpler constraint. For anything that could reasonably live in explicit application code instead (most business logic), prefer application code — it's visible, testable, and versioned alongside the rest of the system, rather than hidden in the schema.

Related Resources

Scalar function — returns one value

CREATE FUNCTION age_in_years(birth_date DATE)
RETURNS INT
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT EXTRACT(YEAR FROM AGE(birth_date))::INT;
$$;

SELECT name, age_in_years(birth_date) AS age FROM people;

Used exactly like a built-in function (UPPER(), COALESCE()) — once per row, in a SELECT list, WHERE, ORDER BY, anywhere a single expression is valid.

Table-valued function — returns a full result set

CREATE FUNCTION orders_for_customer(cust_id INT)
RETURNS TABLE (order_id INT, order_date DATE, total NUMERIC)
LANGUAGE sql
AS $$
    SELECT id, order_date, total FROM orders WHERE customer_id = cust_id;
$$;

-- Used in FROM, like a table or view, but parameterized:
SELECT * FROM orders_for_customer(42) WHERE total > 100;

This is effectively a parameterized view — a regular view can't accept arguments, but a table-valued function can, letting you encapsulate a reusable, parameterized query the same way you'd encapsulate a parameterless one in a view.

Performance characteristics differ meaningfully

  • Scalar functions called per-row in a large query can be a significant performance trap — if a scalar function is invoked once per row of a million-row query and internally runs its own additional query, that's a million extra queries hiding behind what looks like a simple function call in the SELECT list. This is a common, easy-to-miss source of slow reports.
  • Table-valued functions, especially ones written in pure SQL (not a procedural language), are often inlined by the optimizer much like a view — the outer query's filters can sometimes be pushed down into the function's body, unlike a procedural scalar function that behaves as an opaque black box to the optimizer.

Prefer table-valued functions (or plain views/CTEs) over scalar functions for anything data-set-oriented — reserve scalar functions for lightweight, genuinely per-value computations (formatting, simple math, type conversions) that don't themselves need to query other tables. If you find yourself writing a scalar function that runs a SELECT internally and it's called across many rows, that's usually a sign the logic should be restructured as a join or table-valued function instead.

PostgreSQL (PL/pgSQL)

CREATE OR REPLACE PROCEDURE safe_transfer(from_acct INT, to_acct INT, amt NUMERIC)
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE accounts SET balance = balance - amt WHERE id = from_acct;

    IF (SELECT balance FROM accounts WHERE id = from_acct) < 0 THEN
        RAISE EXCEPTION 'Insufficient funds in account %', from_acct
            USING ERRCODE = 'insufficient_funds';
    END IF;

    UPDATE accounts SET balance = balance + amt WHERE id = to_acct;

EXCEPTION
    WHEN insufficient_privilege THEN
        RAISE NOTICE 'Permission issue, transfer aborted';
        RAISE;   -- re-raise to the caller after logging
    WHEN OTHERS THEN
        RAISE NOTICE 'Unexpected error: %', SQLERRM;
        RAISE;   -- never silently swallow unknown errors
END;
$$;

SQL Server (T-SQL)

BEGIN TRY
    BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance - @amt WHERE id = @from_acct;

    IF (SELECT balance FROM accounts WHERE id = @from_acct) < 0
        THROW 50000, 'Insufficient funds', 1;

    UPDATE accounts SET balance = balance + @amt WHERE id = @to_acct;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION;
    THROW;   -- re-raise the original error to the caller
END CATCH;

Key principles

  1. Roll back on failure. Any partial writes made before the error must be undone — leaving the transfer example half-applied (debited but not credited) is exactly the atomicity violation ACID exists to prevent. ROLLBACK (explicit, or via automatic transaction abort) is essential in the error path.
  2. Catch specific conditions deliberately, don't blanket-swallow everything. A bare WHEN OTHERS (PL/pgSQL) or empty CATCH block that suppresses all errors silently hides real bugs and data problems — always at least log/re-raise unexpected errors rather than making them disappear.
  3. Use meaningful, application-actionable error signals. Raising a specific error code/message (insufficient_funds, a custom THROW with a clear message) lets the calling application distinguish "the transfer failed because of insufficient funds" (a business-logic condition the app should show the user) from "the transfer failed because of a database bug" (an operational alert).
  4. Re-raise unless you're specifically handling the condition. Catching an error to log it, then continuing as if nothing happened, is rarely correct — usually you want to log/handle and still propagate the failure so the caller (and any transaction it's part of) also reacts correctly.

This question tests whether a candidate treats database-level error handling with the same rigor as application-level error handling — expecting failures, ensuring atomicity is preserved on the failure path, and avoiding the anti-pattern of a catch-all block that silently discards information about what actually went wrong.