Statically enforcing frozen data classes in Python
You can use @dataclass(frozen=True) to make instances of a data class immutable at runtime. However, there's a small caveat: instantiating a frozen data class is slightly slower than a non-frozen one. This is because, when you enable frozen=True, Python generates __setattr__ and __delattr__ methods at class definition time that reject mutation, so the generated __init__ has to bypass them by calling object.__setattr__ for every field on each instantiation.
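To make the runtime behavior concrete, here is a minimal sketch of what frozen=True does when you try to mutate an instance. The class name Point and the rejected flag are made up for illustration:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:  # hypothetical example class
    x: int
    y: int


p = Point(1, 2)
rejected = False

try:
    p.x = 10  # goes through the generated __setattr__, which raises
except FrozenInstanceError as exc:
    rejected = True
    print(f"mutation rejected: {exc}")

print(p)  # the instance is unchanged
```

The check happens on every attribute assignment at runtime, which is the guarantee you pay for at instantiation time.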
Below is a quick benchmark comparing the instantiation times of a mutable dataclass and a frozen one (in Python 3.12):
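A benchmark along these lines can be sketched with timeit. The class names Mutable and Frozen, the bench helper, and the iteration counts are assumptions for illustration, not necessarily the exact setup behind the numbers quoted in this post:

```python
import timeit
from dataclasses import dataclass


@dataclass
class Mutable:
    x: int
    y: int


@dataclass(frozen=True)
class Frozen:
    x: int
    y: int


def bench(cls: type, number: int = 100_000) -> float:
    """Return the best-of-five time in seconds for `number` instantiations."""
    timer = timeit.Timer(lambda: cls(1, 2))
    return min(timer.repeat(repeat=5, number=number))


mutable_time = bench(Mutable)
frozen_time = bench(Frozen)

print(f"mutable: {mutable_time:.4f}s")
print(f"frozen:  {frozen_time:.4f}s")
print(f"frozen / mutable: {frozen_time / mutable_time:.2f}x")
```

The exact ratio depends on the machine and the Python build, so the printed figures will differ from run to run.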
In my run, frozen data classes were approximately 2.4 times slower to instantiate than their non-frozen counterparts. This gap can widen further if you compare slotted data classes (via @dataclass(slots=True)) with frozen ones. While the cost of immutability is small, it can add up if you need to create many frozen instances.
I was reading Tin Tvrtković's article on zero-overhead frozen attrs classes, which covers making attrs instances frozen at compile time. He shows how to leverage mypy to enforce instance immutability statically while using mutable attrs classes at runtime to avoid any instantiation cost. I wanted to see if I could do the same with standard data classes.
Here's how to do it:
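A minimal sketch of the pattern, reconstructed from the explanation in this post. The class Foo, its fields, and the trivial body of the type-checking stub are assumptions for illustration:

```python
from dataclasses import dataclass
from typing import TYPE_CHECKING, TypeVar

if TYPE_CHECKING:
    # Only type checkers evaluate this branch (typing.dataclass_transform
    # is available from Python 3.11).
    from typing import dataclass_transform

    T = TypeVar("T")

    @dataclass_transform(frozen_default=True)
    def frozen(cls: type[T]) -> type[T]:
        # Never executed; it only tells type checkers that `frozen`
        # behaves like a dataclass decorator with frozen=True by default.
        return cls

else:
    # At runtime, `frozen` is just the ordinary (mutable) dataclass decorator.
    frozen = dataclass


@frozen
class Foo:
    x: int
    y: int


foo = Foo(1, 2)
foo.x = 3  # allowed at runtime; a type checker such as mypy is expected to reject it
print(foo)  # Foo(x=3, y=2)
```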
It involves:
- Using the type checker to ensure the data class instance is immutable.
- Replacing the immutable data class with a more performant mutable one at runtime.
The if TYPE_CHECKING block is only evaluated by type checkers; at runtime, TYPE_CHECKING is False, so the block is skipped. In that block, we use typing.dataclass_transform, introduced in PEP 681, to create a construct similar to the dataclass function that type checkers recognize.
The frozen_default flag, added to PEP 681 in Python 3.12, tells type checkers to treat classes decorated this way as frozen by default. The code should also work in Python 3.11 without changes, as dataclass_transform accepts arbitrary keyword arguments. In Python 3.10 and earlier, you can import dataclass_transform from typing_extensions and leave the rest of the code as is.
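If the same module has to support several Python versions, one way to handle the import is a sys.version_info check inside the type-checking branch. This is a sketch under the assumption that typing_extensions is installed wherever the type checker runs on Python 3.10 or earlier; the class Bar is hypothetical:

```python
import sys
from dataclasses import dataclass
from typing import TYPE_CHECKING, TypeVar

if TYPE_CHECKING:
    # Type checkers understand sys.version_info comparisons and pick a branch.
    if sys.version_info >= (3, 11):
        from typing import dataclass_transform
    else:
        from typing_extensions import dataclass_transform

    T = TypeVar("T")

    @dataclass_transform(frozen_default=True)
    def frozen(cls: type[T]) -> type[T]:
        return cls

else:
    # Nothing above is imported at runtime, so typing_extensions is not
    # needed to run the program, only to type-check it.
    frozen = dataclass


@frozen
class Bar:  # hypothetical example class
    value: int


bar = Bar(1)
bar.value = 2  # fine at runtime, flagged statically
print(bar.value)
```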
The else ... block is what runs when you actually execute the code. There, we're just aliasing the vanilla dataclass function as frozen.
Running this code snippet completes without error, because at runtime the class is an ordinary mutable data class. However, mypy will flag an error since we're trying to mutate foo.x.
Voilà!
I struggled to figure this one out myself, and LLMs were of no help. So, I ended up posting a question on Stack Overflow, where someone pointed out how to use dataclass_transform to achieve this.
Fin!