How do you model inheritance/polymorphic relationships in a relational schema?

6 minadvanceddata-modelinginheritancepolymorphismschema-design

Quick Answer

There's no native "inheritance" in relational modeling, so it's approximated with one of three patterns: **single table inheritance** (one wide table with a type discriminator column and nullable columns for subtype-specific fields), **class table inheritance** (a base table plus one table per subtype sharing the base's primary key), or **concrete table inheritance** (a fully separate table per subtype, duplicating shared columns). Each trades off query simplicity, storage efficiency, and referential integrity differently.

Detailed Answer

Suppose you have Vehicle as a concept, with subtypes Car (has num_doors) and Truck (has cargo_capacity). Relational databases have no built-in "subtype" concept, so you pick one of three patterns.

1. Single Table Inheritance (STI)

One table for all subtypes, with a discriminator column and nullable subtype-specific columns:

CREATE TABLE vehicles (
    id SERIAL PRIMARY KEY,
    type VARCHAR(20) NOT NULL,   -- discriminator: 'car' or 'truck'
    make VARCHAR(50),
    model VARCHAR(50),
    num_doors INT,               -- only meaningful when type = 'car'
    cargo_capacity NUMERIC(10,2) -- only meaningful when type = 'truck'
);

Pros: simplest queries (no joins to get a full record), easy to add a new shared field. Cons: many nullable columns that are only meaningful for some rows, no way to enforce "car rows must have num_doors set" with a plain NOT NULL (needs a CHECK tied to type), and the table grows wide as subtypes multiply.

2. Class Table Inheritance (CTI)

A base table plus one table per subtype, sharing the same primary key value:

CREATE TABLE vehicles (id SERIAL PRIMARY KEY, make VARCHAR(50), model VARCHAR(50));

CREATE TABLE cars (
    vehicle_id INT PRIMARY KEY REFERENCES vehicles(id),
    num_doors INT NOT NULL
);

CREATE TABLE trucks (
    vehicle_id INT PRIMARY KEY REFERENCES vehicles(id),
    cargo_capacity NUMERIC(10,2) NOT NULL
);

Pros: no wasted nullable columns, NOT NULL works normally within each subtype table, closely mirrors an OOP inheritance hierarchy. Cons: fetching a full car record requires a join across vehicles and cars; nothing in plain SQL stops a vehicle_id from (incorrectly) appearing in both cars and trucks without an extra constraint.

3. Concrete Table Inheritance

A fully separate, self-contained table per subtype, duplicating the shared columns:

CREATE TABLE cars (id SERIAL PRIMARY KEY, make VARCHAR(50), model VARCHAR(50), num_doors INT NOT NULL);
CREATE TABLE trucks (id SERIAL PRIMARY KEY, make VARCHAR(50), model VARCHAR(50), cargo_capacity NUMERIC(10,2) NOT NULL);

Pros: each table is simple and self-contained, no joins needed for a single subtype. Cons: querying "all vehicles regardless of type" needs a UNION, shared columns (make, model) are duplicated in every subtype's schema, and there's no single foreign-key target if some other table needs to reference "any vehicle."

  • Few, rarely-changing subtypes with mostly shared columns → STI is often good enough and keeps queries simple.
  • Many subtypes with substantially different, non-overlapping columns, and you need NOT NULL correctness per subtype → CTI.
  • Subtypes that are queried almost entirely independently and never need a unified "any vehicle" view → concrete table inheritance.

This is also a place where document databases (see the NoSQL topic) sometimes fit more naturally than a relational schema, since a document can freely have subtype-specific fields without any of these three tradeoffs — worth mentioning if the interview also touches polyglot persistence.