What are dataclasses, and how do they compare to plain classes, `namedtuple`, and `attrs`?


Quick Answer

`@dataclass` (PEP 557, Python 3.7+) generates `__init__`, `__repr__`, and `__eq__` from type-annotated class attributes, and optionally ordering methods (`order=True`), immutability (`frozen=True`), and hashing, so you no longer write that boilerplate by hand. Compared to `namedtuple`, dataclasses are mutable by default, are regular classes rather than tuples, and handle default values, methods, and inheritance more naturally; `attrs` predates dataclasses and offers more features (validators, converters) at the cost of a third-party dependency.

Detailed Answer

The boilerplate dataclasses remove

# Without dataclasses
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

# With dataclasses -- equivalent methods, generated automatically
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

`@dataclass` reads the class's type annotations and generates `__init__`, `__repr__`, and `__eq__` for you. `Point(1, 2) == Point(1, 2)` is `True`, and `repr(Point(1, 2))` returns `'Point(x=1, y=2)'`, both for free. The annotations only declare the fields; they are not checked at runtime.
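The generated methods can be checked directly. The snippet below is a minimal sketch that reuses the `Point` class from above; the variable names `field_names` and `loose` are only illustrative.

```python
from dataclasses import dataclass, fields

@dataclass
class Point:
    x: int
    y: int

p = Point(1, 2)

# The generated __init__ accepts positional or keyword arguments.
assert p == Point(x=1, y=2)

# The generated __repr__ lists every field by name.
assert repr(p) == "Point(x=1, y=2)"

# The generated __eq__ compares field values, but only between
# instances of the same class, so a plain tuple is never equal.
assert p != (1, 2)

# fields() exposes what the decorator collected from the annotations.
field_names = [f.name for f in fields(Point)]

# Annotations are not enforced at runtime: this does not raise.
loose = Point("a", "b")
```

If you need runtime type enforcement, that has to come from your own checks or a separate tool; `@dataclass` itself only uses the annotations to discover fields.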

Useful options

from dataclasses import dataclass, field

@dataclass(order=True, frozen=True)
class Money:
    cents: int
    currency: str = "USD"
    tags: list = field(default_factory=list)   # mutable default, done safely

m1 = Money(100)
m2 = Money(200)
m1 < m2            # True -- order=True generates __lt__, __le__, __gt__, __ge__ (fields compared in order, like a tuple)
m1.cents = 500     # raises FrozenInstanceError -- frozen=True blocks attribute assignment

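`field(default_factory=...)` solves the mutable-default problem for dataclass fields: the factory is called once per instance, so each object gets its own list. You cannot write `tags: list = []` directly; `@dataclass` raises `ValueError` at class-definition time for a `list`, `dict`, or `set` default (since Python 3.11, for any unhashable default).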
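To make the mutable-default rule concrete, here is a small sketch. The class names `Broken` and `Tagged` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field

error_message = None

# A bare mutable default is rejected when the class is defined,
# before any instance is created.
try:
    @dataclass
    class Broken:
        tags: list = []
except ValueError as exc:
    error_message = str(exc)

@dataclass
class Tagged:
    name: str
    tags: list = field(default_factory=list)

a = Tagged("a")
b = Tagged("b")
a.tags.append("x")

# Each instance received its own list from the factory,
# so mutating one does not affect the other.
assert a.tags == ["x"]
assert b.tags == []
```

The factory can be any zero-argument callable, so `default_factory=dict` or a `lambda` returning a pre-filled list works the same way.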

Comparison to alternatives

| | Plain class | `namedtuple` | `@dataclass` | `attrs` |
|---|---|---|---|---|
| Boilerplate | you write everything | none, but limited | none | none |
| Mutable | yes (your choice) | no (tuple-based) | yes by default, `frozen=True` opts out | either |
| Methods | yes | yes (limited) | yes | yes |
| Inheritance | full | awkward | full | full |
| Validators/converters | manual | no | manual (`__post_init__`) | built-in |
| Dependency | none | none (stdlib) | none (stdlib, 3.7+) | third-party |
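The "manual (`__post_init__`)" entry for dataclasses can look like the sketch below. The `Temperature` class and its rules are assumptions made up for this example, not part of any library.

```python
from dataclasses import dataclass

@dataclass
class Temperature:
    kelvin: float

    def __post_init__(self):
        # Hand-written converter: accept ints or numeric strings.
        self.kelvin = float(self.kelvin)
        # Hand-written validator: reject physically impossible values.
        if self.kelvin < 0:
            raise ValueError(f"kelvin must be >= 0, got {self.kelvin}")

t = Temperature("273.15")
assert t.kelvin == 273.15

try:
    Temperature(-1)
except ValueError:
    rejected = True
else:
    rejected = False
```

Note that `__post_init__` runs only as part of the generated `__init__`, so a later `t.kelvin = -5` is not re-validated. On a `frozen=True` dataclass the conversion step would also need `object.__setattr__`, because normal assignment is blocked.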

`namedtuple` is best for genuinely tuple-like, immutable records that are accessed positionally as well as by name. `@dataclass` is the standard modern choice for "a class that's mostly data plus a few methods". `attrs` remains popular when you need richer validation and conversion than `@dataclass`'s `__post_init__` conveniently provides.
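The practical difference between the two stdlib options shows up in a few lines. This is a rough side-by-side sketch; `PointNT` and `PointDC` are hypothetical names chosen for the comparison.

```python
from collections import namedtuple
from dataclasses import dataclass, astuple

PointNT = namedtuple("PointNT", ["x", "y"])

@dataclass
class PointDC:
    x: int
    y: int

nt = PointNT(1, 2)
dc = PointDC(1, 2)

# A namedtuple is a tuple: it unpacks, indexes, and equals a plain tuple.
x, y = nt
assert nt[0] == 1
assert nt == (1, 2)

# A dataclass is not a tuple: it is never equal to one,
# and conversion has to be explicit.
assert dc != (1, 2)
assert astuple(dc) == (1, 2)

# namedtuple fields cannot be reassigned; _replace returns a new object.
nt_is_immutable = False
try:
    nt.x = 10
except AttributeError:
    nt_is_immutable = True
moved = nt._replace(x=10)

# Dataclass fields are mutable by default.
dc.x = 10
```

The tuple equality is worth remembering: two unrelated namedtuple types with the same values compare equal, whereas dataclass instances of different classes never do.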

Interview-ready summary: `@dataclass` auto-generates `__init__`, `__repr__`, and `__eq__` from annotated fields, removing the most common class boilerplate while staying a normal, mutable, inheritable class. Reach for `namedtuple` for lightweight immutable tuples, and for `attrs` when you need validators or converters beyond what `@dataclass` provides out of the box.