How do you model a many-to-many relationship in a relational schema?

4 minbeginnerdata-modelingmany-to-manyjunction-table

Quick Answer

Relational tables can only express one-to-one and one-to-many relationships directly, so a many-to-many relationship requires a **junction table** (also called an associative or bridge table) with foreign keys to both sides, typically with a composite primary key (or a surrogate key plus a `UNIQUE` constraint on the pair). The junction table can also carry attributes specific to the relationship itself, like `enrolled_at` on a student/course enrollment.

Detailed Answer

Consider students and courses: one student can take many courses, and one course can have many students — a genuine many-to-many.

The junction table

CREATE TABLE students (id SERIAL PRIMARY KEY, name VARCHAR(100));
CREATE TABLE courses  (id SERIAL PRIMARY KEY, title VARCHAR(200));

CREATE TABLE enrollments (
    student_id INT NOT NULL REFERENCES students(id),
    course_id  INT NOT NULL REFERENCES courses(id),
    enrolled_at TIMESTAMP NOT NULL DEFAULT now(),
    grade      CHAR(2),
    PRIMARY KEY (student_id, course_id)   -- composite PK: a student can't enroll in the same course twice
);

The composite primary key (student_id, course_id) both links the two tables and enforces that a given pairing can only exist once. If duplicate enrollments should be allowed (e.g., retaking a course in a different term), you'd instead use a surrogate id primary key plus a separate UNIQUE (student_id, course_id, term) constraint that includes whatever column distinguishes legitimate repeats.

Querying through the junction table

-- All courses a given student is enrolled in
SELECT c.title
FROM courses c
JOIN enrollments e ON e.course_id = c.id
WHERE e.student_id = 42;

-- All students enrolled in a given course, with their grade
SELECT s.name, e.grade
FROM students s
JOIN enrollments e ON e.student_id = s.id
WHERE e.course_id = 7;

Why the junction table is also the right place for relationship-specific attributes

A many-to-many relationship often has data that belongs to the pairing, not to either side alone — enrolled_at and grade above only make sense in the context of "this student in this course," not on students or courses individually. This is a strong signal you need a real junction table rather than trying to shoehorn the relationship into either parent table.

Indexing note

Beyond the composite primary key (which already indexes student_id first), it's usually worth adding a secondary index on course_id alone (or (course_id, student_id)) if you frequently query "students in this course" — a composite primary key (student_id, course_id) doesn't efficiently support lookups that start from course_id alone.