Let's talk about annotations.
Type annotations in Python are mostly a static declaration to a type checker like mypy or pyright about the expected types. However, they are also a dynamic data structure which a growing number of libraries use at runtime, among them the original attrs, dataclasses in the standard library, and even SQLAlchemy.
>>> from dataclasses import dataclass
>>>
>>> @dataclass
... class C:
...     a: int
...     b: str
...
>>> C(1, "a")
C(a=1, b='a')
These libraries inspect the annotations of a class to generate __init__ and __eq__, saving a lot of boilerplate code. You could call this type of API "named tuple without the tuple".
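To make the mechanics concrete, here's a minimal sketch of such a decorator; tupleless is a made-up name, and the real implementations do far more (defaults, repr, hashing, and so on):

def tupleless(cls):
    # the class's own annotations, in definition order
    field_names = list(cls.__dict__.get("__annotations__", {}))

    def __init__(self, *args):
        if len(args) != len(field_names):
            raise TypeError(f"expected {len(field_names)} arguments")
        for name, value in zip(field_names, args):
            setattr(self, name, value)

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return all(
            getattr(self, name) == getattr(other, name) for name in field_names
        )

    cls.__init__ = __init__
    cls.__eq__ = __eq__
    return cls

>>> @tupleless
... class Point:
...     x: int
...     y: int
...
>>> Point(1, 2) == Point(1, 2)
True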
(To get meta, the typing module has added dataclass_transform, which libraries can use to properly annotate new class decorators with this API.)
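For example, a library's decorator might be marked up like this (my_model is a hypothetical stand-in; dataclass_transform lives in typing since Python 3.11, and in typing_extensions before that):

from typing import dataclass_transform

@dataclass_transform()
def my_model(cls):
    # generate __init__, __eq__, etc. from cls.__annotations__ here
    return cls

A type checker will then treat any class decorated with my_model as if it had a dataclass-generated __init__.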
These libraries support inheritance of fields.
>>> @dataclass
... class D(C):
...     e: int
...
>>> D(1, "a", 2)
D(a=1, b='a', e=2)
Type checkers also consider class annotations to be inherited. For example, mypy considers this to be correct:
class A:
    a: int

class B(A): pass

B().a
That code fails at runtime, because nothing actually sets a on the B instance. But what if B were a dataclass?
>>> class A:
...     a: int
...
>>> @dataclass
... class B(A):
...     pass
...
>>> B(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() takes 1 positional argument but 2 were given
It doesn't work, because annotations are not inherited.
>>> A.__annotations__
{'a': <class 'int'>}
>>> B.__annotations__
{}
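Since the dataclass decorator only saw an empty annotations dict, it generated no fields:

>>> from dataclasses import fields
>>> fields(B)
()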
It's up to the library to walk its inheritance tree and decide whether or not to include the annotations of parents when generating code. As it happens, dataclasses made the design decision to only inherit annotations from other dataclasses.
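We can see that decision at work in the D example from earlier: C's fields are included, because C is itself a dataclass.

>>> [f.name for f in fields(D)]
['a', 'b', 'e']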
As an aside, class variables which are used to represent default values are inherited.
>>> class HasDefault:
...     a = 1
...
>>> @dataclass
... class UsesDefault(HasDefault):
...     a: int
...
>>> UsesDefault()
UsesDefault(a=1)
We can write another decorator which grabs annotations from the parents and adds them in method resolution order, as if they were inherited.
def inherit_annotations(cls):
    annotations = {}
    # walk the MRO in reverse order so children override parents
    for parent in cls.__mro__[::-1]:
        # use getattr(): not everything has __annotations__
        annotations.update(getattr(parent, "__annotations__", {}))
    cls.__annotations__.update(annotations)
    return cls
Since all dataclasses sees at runtime is the __annotations__ dict, any modifications made before the class decorator runs will be reflected in the generated fields.
>>> @dataclass
... @inherit_annotations
... class B(A): pass
...
>>> B(1)
B(a=1)
Here's a sketch of a robustified version of the function: it reads each class's own __dict__, so annotations are never picked up twice through the MRO, and it assigns a fresh dict, so a class with no annotations of its own doesn't end up mutating a parent's dict.
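def inherit_annotations(cls):
    annotations = {}
    for parent in cls.__mro__[::-1]:
        # vars() only sees the class's own dict; getattr() would also
        # return annotations inherited from further up the MRO
        annotations.update(vars(parent).get("__annotations__", {}))
    # assign a fresh dict instead of mutating cls.__annotations__,
    # which can resolve to a parent's dict on classes without
    # annotations of their own (on Pythons before 3.10)
    cls.__annotations__ = annotations
    return cls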
I know what you're thinking though: why not just use multiple class decorators? Sure, all but one of the generated __init__s will be overwritten, but that's fine, because they all have the same behavior anyway.
import attr
from dataclasses import dataclass

@dataclass
@attr.define
class DualCitizen:
    a: int

@dataclass
class Dataclassified(DualCitizen):
    pass

@attr.define
class Attrodofined(DualCitizen):
    pass
Looks like perfectly normal class definitions.
>>> DualCitizen(1)
DualCitizen(a=1)
>>> Dataclassified(1)
Dataclassified(a=1)
>>> Attrodofined(1)
Attrodofined(a=1)
And it works.
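Both libraries still see the fields through their own introspection APIs, too (attrs keeps its metadata on the class, and the dataclass decorator only adds to it):

>>> import dataclasses
>>> [f.name for f in dataclasses.fields(DualCitizen)]
['a']
>>> [a.name for a in attr.fields(DualCitizen)]
['a']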
So, type checkers consider annotations to be inherited, but class decorators which use annotations at runtime only inherit annotations from ancestors with the same decorator. We can work around this either by multiply decorating the ancestors, or by pulling the ancestors' annotations into __annotations__ ourselves.