Declaratively transform data class fields in Python
While writing microservices in Python, I like to declaratively define the shape of the data coming in and out of JSON APIs or NoSQL databases in a separate module. Both TypedDict and dataclass are fantastic tools for communicating the shape of the data to the next person working on the codebase.
Whenever the data needs some processing before I start working with it, I prefer to do that transformation in the dataclass itself. Consider this example:
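A minimal sketch of such a class. The class name WebhookPayload and the url field are assumptions made for illustration; only request_payload, response_payload, and status_code are named in the text.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class WebhookPayload:  # hypothetical name, chosen for illustration
    url: str  # hypothetical field
    request_payload: dict[str, Any]
    response_payload: dict[str, Any]
    status_code: int


payload = WebhookPayload(
    url="https://example.com/hook",
    request_payload={"event": "ping"},
    response_payload={"ok": True},
    status_code=200,
)
```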
The above class defines the structure of a payload that'll be saved in a DynamoDB table. To make things simpler, I want to serialize the request_payload, response_payload, and status_code fields to JSON strings before saving them to the DB. Usually, I'd do it in the to_dynamodb_item method like this:
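A sketch of that approach, assuming the same hypothetical WebhookPayload class and a to_dynamodb_item method that returns a plain dict. The actual DynamoDB write is left out.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any


@dataclass
class WebhookPayload:  # hypothetical name, chosen for illustration
    url: str  # hypothetical field
    request_payload: dict[str, Any]
    response_payload: dict[str, Any]
    status_code: int

    def to_dynamodb_item(self) -> dict[str, str]:
        # The JSON encoding happens here, inside the method body,
        # so nothing on the class itself says which fields get encoded.
        return {
            "url": self.url,
            "request_payload": json.dumps(self.request_payload),
            "response_payload": json.dumps(self.response_payload),
            "status_code": json.dumps(self.status_code),
        }


item = WebhookPayload(
    url="https://example.com/hook",
    request_payload={"event": "ping"},
    response_payload={"ok": True},
    status_code=200,
).to_dynamodb_item()
```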
However, a json.dumps transformation buried in a method is hard to keep track of. It's also hard to tell which fields need to be deserialized when you want the rich data structures back. Another disadvantage is that you'll have to repeat the same transformation if you need the serialized fields in another method. A better way is to take advantage of the __post_init__ hook exposed by dataclasses. Here's how you can do it:
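A minimal sketch of the idea, under a few assumptions: the class name and url field are hypothetical, _json_transform is taken to be a boolean switch, and _json_fields is taken to be a tuple of field names. The type hints are widened to admit str because the fields hold JSON strings after the hook runs.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class WebhookPayload:  # hypothetical name, chosen for illustration
    url: str  # hypothetical field
    request_payload: dict[str, Any] | str
    response_payload: dict[str, Any] | str
    status_code: int | str

    # Configuration fields; repr=False keeps them out of the repr.
    _json_transform: bool = field(default=True, repr=False)
    _json_fields: tuple[str, ...] = field(
        default=("request_payload", "response_payload", "status_code"),
        repr=False,
    )

    def __post_init__(self) -> None:
        # Called by the generated __init__ once all fields are assigned.
        if not self._json_transform:
            return
        for name in self._json_fields:
            setattr(self, name, json.dumps(getattr(self, name)))


payload = WebhookPayload(
    url="https://example.com/hook",
    request_payload={"event": "ping"},
    response_payload={"ok": True},
    status_code=200,
)
print(payload)
# WebhookPayload(url='https://example.com/hook', request_payload='{"event": "ping"}', response_payload='{"ok": true}', status_code='200')
```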
Running the script prints the repr of the resulting dataclass instance.
Notice how the intended fields are now JSON encoded. Python calls the __post_init__ hook of a dataclass at the end of the generated __init__ method. If you suppress the generated __init__ by decorating the target class with @dataclass(init=False), the __post_init__ hook won't be executed automatically.
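The difference can be checked with a small experiment. The two classes below are hypothetical and exist only to record whether the hook ran. The init=False class needs a hand-written __init__, since no initializer is generated for it.

```python
from dataclasses import dataclass

calls: list[str] = []


@dataclass
class GeneratedInit:
    value: int

    def __post_init__(self) -> None:
        calls.append("GeneratedInit")


@dataclass(init=False)
class HandWrittenInit:
    value: int

    def __init__(self, value: int) -> None:
        # No generated __init__, so nothing calls __post_init__ for us.
        self.value = value

    def __post_init__(self) -> None:
        calls.append("HandWrittenInit")


GeneratedInit(1)
HandWrittenInit(1)
print(calls)  # ['GeneratedInit']
```

If you do write your own __init__, you can still call self.__post_init__() from it explicitly.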
The field function with repr=False allows us to exclude the configuration fields like _json_transform and _json_fields from the final repr of the class. Notice that these two fields are absent in the final representation of the dataclass instance.
You can turn off the JSON conversion by setting _json_transform to False:
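A sketch of the switch in use, assuming _json_transform is accepted as a keyword argument by the generated initializer. The hypothetical class is repeated in compact form so the snippet runs on its own.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class WebhookPayload:  # hypothetical name, chosen for illustration
    url: str  # hypothetical field
    request_payload: dict[str, Any] | str
    response_payload: dict[str, Any] | str
    status_code: int | str
    _json_transform: bool = field(default=True, repr=False)
    _json_fields: tuple[str, ...] = field(
        default=("request_payload", "response_payload", "status_code"),
        repr=False,
    )

    def __post_init__(self) -> None:
        if self._json_transform:
            for name in self._json_fields:
                setattr(self, name, json.dumps(getattr(self, name)))


raw = WebhookPayload(
    url="https://example.com/hook",
    request_payload={"event": "ping"},
    response_payload={"ok": True},
    status_code=200,
    _json_transform=False,  # skip the JSON encoding entirely
)
print(raw.request_payload)  # {'event': 'ping'}
print(raw.status_code)  # 200
```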
You can also add or remove fields to be transformed by changing the value of the _json_fields iterable of the class:
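A sketch that passes a one-element tuple for _json_fields, again assuming it is a keyword argument of the generated initializer. The hypothetical class is repeated in compact form so the snippet runs on its own.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class WebhookPayload:  # hypothetical name, chosen for illustration
    url: str  # hypothetical field
    request_payload: dict[str, Any] | str
    response_payload: dict[str, Any] | str
    status_code: int | str
    _json_transform: bool = field(default=True, repr=False)
    _json_fields: tuple[str, ...] = field(
        default=("request_payload", "response_payload", "status_code"),
        repr=False,
    )

    def __post_init__(self) -> None:
        if self._json_transform:
            for name in self._json_fields:
                setattr(self, name, json.dumps(getattr(self, name)))


partial = WebhookPayload(
    url="https://example.com/hook",
    request_payload={"event": "ping"},
    response_payload={"ok": True},
    status_code=200,
    _json_fields=("status_code",),  # note the trailing comma: a 1-tuple
)
print(partial.request_payload)  # {'event': 'ping'}
print(repr(partial.status_code))  # '200'
```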
This will only serialize the status_code field. Neat!