What are the normal forms (1NF, 2NF, 3NF, BCNF)?

8 minintermediatenormalization1nf2nf3nfbcnfdata-modeling

Quick Answer

**1NF** requires atomic column values and no repeating groups. **2NF** requires 1NF plus every non-key column depending on the *whole* primary key, not just part of a composite key. **3NF** requires 2NF plus no non-key column depending on another non-key column (no transitive dependencies). **BCNF** (Boyce-Codd) tightens 3NF further: every determinant must be a candidate key. Each level removes a specific class of update/insert/delete anomaly.

Detailed Answer

Normalization is a sequence of rules for structuring tables so that each fact is stored exactly once, eliminating anomalies that occur when you insert, update, or delete data.

1NF — Atomic values, no repeating groups

Every column must hold a single, indivisible value — no comma-separated lists, no repeating column groups (phone1, phone2, phone3).

-- Violates 1NF: phones column holds multiple values
| id | name  | phones                |
|----|-------|------------------------|
| 1  | Alice | 555-1234, 555-5678     |

-- 1NF: one row per phone number
| id | name  | phone     |
|----|-------|-----------|
| 1  | Alice | 555-1234  |
| 1  | Alice | 555-5678  |

2NF — No partial dependency on a composite key

Applies when the primary key is composite (more than one column). Every non-key column must depend on the entire key, not just part of it.

-- order_items(order_id, product_id, product_name, quantity)
-- Primary key: (order_id, product_id)
-- Violation: product_name depends only on product_id, not on the full (order_id, product_id) key

Fix: move product_name to a separate products table keyed by product_id alone, and keep order_items holding only columns that truly depend on the composite key (like quantity).

3NF — No transitive dependency

A non-key column must depend on the key directly — not on another non-key column.

-- employees(id, name, department_id, department_name)
-- Violation: department_name depends on department_id, not directly on id (the key) —
-- this is a transitive dependency: id -> department_id -> department_name

Fix: move department_name into a departments table keyed by department_id; employees keeps only department_id as a foreign key.

BCNF — Every determinant is a candidate key

A stricter version of 3NF: for every functional dependency A -> B, A must be a candidate key. 3NF has a narrow exception BCNF closes — it can happen when a table has multiple overlapping composite candidate keys. Example: a table (student, course, instructor) where each instructor teaches only one course, but a course can have multiple instructors:

-- (student, course) is a candidate key; (student, instructor) is also a candidate key
-- but "instructor -> course" is a dependency where instructor is NOT a candidate key alone
-- This satisfies 3NF but violates BCNF, and can still produce redundancy/anomalies.

Why this matters practically

Each level removes a class of anomaly:

  • Update anomaly: a fact stored in multiple rows can go out of sync if only some copies are updated (e.g., department_name duplicated across every employee row).
  • Insert anomaly: you can't record a fact (e.g., a new department) until an unrelated fact (an employee in it) also exists.
  • Delete anomaly: deleting the last row referencing a fact accidentally deletes the fact itself (e.g., deleting the last employee in a department loses the department's name entirely).

Most production schemas target 3NF as a practical default, reaching for BCNF only when the overlapping-key scenario actually arises, and deliberately denormalizing past 3NF for specific, measured performance reasons (see the denormalization question).