{
"$type": "site.standard.document",
"canonicalUrl": "https://rednafi.com/python/mixins/",
"description": "Build custom data structures with Python's collections.abc mixins, abstract base classes, and interfaces for dict-like and set-like objects.",
"path": "/python/mixins/",
"publishedAt": "2020-07-03T00:00:00.000Z",
"site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
"tags": [
"Python",
"Data Structures"
],
"textContent": "Imagine a custom _set-like_ data structure that doesn't perform hashing and trades\nperformance for a tighter memory footprint. Or imagine a _dict-like_ data structure that\nautomatically stores data in a PostgreSQL or Redis database the moment you initialize it;\nalso it lets you _get-set-delete_ key-value pairs using the usual\n_retrieval-assignment-deletion_ syntax associated with built-in dictionaries. Custom data\nstructures can give you the power of choice and writing them will make you understand how\nthe built-in data structures in Python are constructed.\n\nOne way to understand how built-in objects like dictionary, list, set etc. work is to build\ncustom data structures based on them. Python provides several mixin classes in the\ncollections.abc module to design custom data structures that look and act like built-in\nstructures with additional functionalities baked in.\n\nConcepts\n\nTo understand how all these work, you'll need a fair bit of knowledge about Interfaces,\nAbstract Base Classes, Mixin Classes etc. I'll build the concept edifice layer by layer\nwhere you'll learn about interfaces first and how they can be created and used via the\nabc.ABC class. Then you'll learn how abstract base classes differ from interfaces. After\nthat I'll introduce mixins and explain how all these concepts can be knitted together to\narchitect custom data structures with amazing capabilities. Let's dive in.\n\nInterfaces\n\nPython interfaces can help you write classes based on common structures. They ensure that\nclasses that provide similar functionalities will also have similar footprints. Interfaces\naren't as popular in Python as they are in statically typed languages. 
The dynamic\nnature and duck-typing capabilities of Python often make them redundant.\n\nHowever, in larger applications, interfaces can help you avoid writing code that is poorly\nencapsulated or building classes that look awfully similar but provide completely unrelated\nfunctionalities. Moreover, interfaces implicitly spawn other powerful techniques like mixin\nclasses which can help you achieve DRY nirvana.\n\nOverview\n\nAt a high level, an interface acts as a blueprint for designing classes. In Python, an\ninterface is a specialized abstract class that defines one or more abstract methods.\nAbstract classes differ from concrete classes in the sense that they aren't intended to\nstand on their own and the methods they define shouldn't have any implementation.\n\nUsually, you inherit from an interface and implement the methods defined in the abstract\nclass in a concrete subclass. Interfaces provide the skeletons and concrete classes provide\nthe implementation of the methods based on those skeletons. Depending on the ways you can\narchitect interfaces, they can be segmented into two primary categories.\n\n- Informal Interfaces\n- Formal Interfaces\n\nInformal interfaces\n\nInformal interfaces are classes which define methods that can be overridden, but there's no\nstrict enforcement.\n\nLet's write an informal interface for a simple calculator class:\n\nNotice that the ICalc class has four different methods that don't give you any\nimplementation. It's an informal interface because you can still instantiate the class but\nthe methods will raise NotImplementedError if you try to apply them. You have to subclass\nthe interface to use it. Let's do it:\n\nNow, you might be wondering why you even need all of this boilerplate code and inheritance\nwhen you can directly define the concrete Calc class and call it a day.\n\nConsider the following scenario where you want to add additional functionalities to each of\nthe methods of the Calc class. Here, you have two options. 
Either you can mutate the original\nclass and add those extra functionalities to the methods or you can create another class\nwith a similar footprint and implement all the methods with the added functionalities.\n\nThe first option isn't always viable and can cause regressions in real-life scenarios. The\nsecond approach ensures modularity and is generally quicker to implement since you won't\nhave to worry about messing up the original concrete class. However, figuring out which\nmethods you'll need to implement in the extended class can be hard because the concrete\nclass might have additional methods that you don't want in the extended class.\n\nIn this case, instead of figuring out the methods from the concrete Calc class, it's\neasier to do so from an established structure defined in the ICalc interface. Interfaces\nmake the process of extending class functionalities more tractable. Let's make another class\nthat will add logging to all of the methods of the Calc class:\n\nIn the above snippet, I've defined another class called CalcLog that basically extends the\nfunctionalities of the previously defined Calc class. Here, I've inherited from the\ninformal interface ICalc and implemented all the methods with additional info logging\ncapability.\n\nAlthough writing informal interfaces is trivial, there are multiple issues that plague them.\nThe user of the interface class can still instantiate it like a normal class and won't be\nable to tell the difference between it and a concrete class until she tries to use any of\nthe methods defined inside the interface. Only then will the methods throw exceptions. This\ncan have unintended side effects.\n\nMoreover, informal interfaces won't compel you to implement all the methods in the\nsubclasses. You can easily get away without implementing a particular method defined in the\ninterface. It won't complain about the unimplemented methods in the subclasses. 
However, if\nyou try to use a method that hasn't been implemented in the subclass, you'll get an error.\nThis means even if issubclass(ConcreteSubClass, Interface) shows True, you can't rely on\nit since it doesn't give you the guarantee that the ConcreteSubClass has implemented all\nthe methods defined in the Interface.\n\nLet's create another class FakeCalc and only implement one method defined in the ICalc\nabstract class:\n\nDespite not implementing all the methods defined in the ICalc class, I was still able to\ninstantiate the FakeCalc concrete class. However, when I tried to apply a method sub\nthat wasn't implemented in the concrete class, it gave me an error. Also,\nissubclass(FakeCalc, ICalc) returns True which can mislead you into thinking that all\nthe methods of the subclass FakeCalc are usable. It can cause subtle bugs that can be difficult\nto detect. Formal interfaces try to overcome these issues.\n\nFormal interfaces\n\nFormal interfaces don't suffer from the problems that plague informal interfaces. So if you\nwant to implement an interface that the users can't instantiate independently and that forces\nthem to implement all the methods in the concrete subclasses, a formal interface is the way\nto go. In Python, the idiomatic way to define formal interfaces is via the abc module.\nLet's transform the previously mentioned ICalc interface into a formal one:\n\nHere, I've imported the ABC class and the abstractmethod decorator from the abc module of\nPython's standard library. The name ABC stands for _Abstract Base Class_. The interface\nclass needs to inherit from this ABC class and all the abstract methods need to be\ndecorated using the abstractmethod decorator. 
If your knowledge of decorators is fuzzy,\ncheck out this in-depth article on [Python decorators].\n\nAlthough it seems like ICalc has merely inherited from the ABC class, under the hood, a\n[metaclass] ABCMeta gets attached to the interface which essentially makes sure that you\ncan't instantiate this class independently. Let's try to do so and see what happens:\n\nThe error message clearly states that you can't instantiate the class ICalc directly at\nall. You have to make a subclass of ICalc and implement all the abstract methods and only\nthen you'll be able to make an instance of the subclass. The subclassing and implementation\npart is the same as before.\n\nIn the case of a formal interface, instantiating a subclass that fails to implement even\none abstract method will raise TypeError. So you can never write something like the FakeCalc\nwith a formal interface. This approach is more explicit and if there is an issue, it fails\nearly.\n\nInterfaces vs abstract base classes\n\nYou've probably seen the terms _Interface_ and _Abstract Base Class_ being used\ninterchangeably. However, conceptually they're different. Interfaces can be thought of as a\nspecial case of Abstract Base Classes.\n\nIt's imperative that all the methods of an interface are abstract methods and the classes\ndon't store any state (instance variables). However, in case of abstract base classes, the\nmethods are generally abstract but there can also be methods that provide implementation\n(concrete methods) and also, these classes can have instance variables. These generic\nabstract base classes can get very interesting and they can be used as _mixins_ but more on\nthat in the later sections.\n\nBoth interfaces and abstract base classes are similar in the sense that they can't stand on\ntheir own, that means these classes aren't meant to be instantiated independently. 
Pay\nattention to the following snippet to understand how interfaces and abstract base classes\ndiffer.\n\nInterface\n\nHere, all the methods have to be abstract.\n\nAbstract Base Class\n\nNotice how method_c in the above class is a concrete method and can have implementation.\n\nThe two examples above establish the fact that\n\n> All interfaces are abstract base classes but not all abstract base classes are interfaces.\n\nA complete example\n\nBefore moving on to the next section, let's see another contrived example to get the idea\nabout the cases where interfaces can come in handy. I'll define an interface called\nAutoMobile and create three concrete classes called Car, Truck and Bus from it. The\ninterface defines three abstract methods start, accelerate and stop that the concrete\nclasses will need to implement later.\n\n![UML diagram showing mixin class composition pattern in Python][image_1]\n\nThe above example delineates the use cases for interfaces. When you need to create multiple\nsimilar classes, interfaces can provide a basic foundation for the subclasses to build upon.\nIn the next section, I'll be using formal interfaces to create Mixin classes. So, before\nunderstanding mixin classes and how they can be used to inject additional plugins to your\nclasses, it's important that you understand interfaces and abstract base classes properly.\n\nMixins\n\nImagine you're baking chocolate brownies. Now, you can have them without any extra fluff\nwhich is fine or you can top them with cream cheese, caramel sauce, chocolate chips etc.\nUsually you don't make the extra toppings yourself, rather you prepare the brownies and use\noff the shelf toppings. This also gives you the ability to mix and match different\ncombinations of toppings to spruce up the flavors quickly. 
However, making the toppings from\nscratch would be a lengthy process and doing it over and over again can ruin the fun of\nbaking.\n\nWhile creating software, there's sometimes a limit to the depth we should go. When pieces of\nwhat we'd like to achieve have already been executed well by others, it makes a lot of sense\nto reuse them. One way to achieve modularity and reusability in object-oriented programming\nis through a concept called a mixin. Different languages implement the concept of mixin in\ndifferent ways. In Python, mixins are supported via multiple inheritance.\n\nOverview\n\nIn the context of Python especially, a mixin is a parent class that provides functionality\nto subclasses but isn't intended to be instantiated itself. This should already incite deja\nvu in you since classes that aren't intended to be instantiated and can have both\nconcrete and abstract methods are basically abstract base classes. Mixins can be\nregarded as a specific strain of abstract base classes where they can house both concrete\nand abstract methods but don't keep any internal states.\n\nThese can help you when -\n\n- You want to provide a lot of optional features for a class.\n- You want to provide a lot of not-optional features for a class, but you want the features\n in separate classes so that each of them is about one feature (behavior).\n- You want to use one particular feature in many different classes.\n\nLet's see a contrived example. Consider [Werkzeug]'s request and response system. Werkzeug\nis a small library that [Flask] depends on. I can make a plain old request object by saying:\n\nIf I want to add accept header support, I would make that:\n\nIf I wanted to make a request object that supports accept headers, etags, user agent and\nauthentication support, I could do this:\n\nThe above example might cause you to say, \"that's just multiple inheritance, not really a\nmixin\", which can be true in a special case. 
Indeed, the differences between plain old\nmultiple inheritance and mixin based inheritance collapse when the parent class can be\ninstantiated. Understanding the subtlety in the differences between a mixin class, an\nabstract base class, an interface and the scope of multiple inheritance is important, so\nI'll explore them in a dedicated section.\n\nDifferences between interfaces, abstract classes and mixins\n\nIn order to better understand mixins, it'd be useful to compare mixins with abstract classes\nand interfaces from a code/implementation perspective:\n\nInterfaces\n\nInterfaces can contain abstract methods only, no concrete methods and no internal states\n(instance variables).\n\nAbstract Classes\n\nAbstract classes can contain abstract methods, concrete methods and internal state.\n\nMixins\n\nLike interfaces, mixins don't contain any internal state. But like abstract classes, they\ncan contain one or more concrete methods. _So mixins are basically abstract classes\nwithout any internal states._\n\nIn Python, these are just conventions because all of the above are defined as classes.\nHowever, one trait that is common among _interfaces_, _abstract classes_ and _mixins_ is\nthat they shouldn't exist on their own, i.e. shouldn't be instantiated independently.\n\nA complete example\n\nBefore diving into the real-life examples and how mixins can be used to construct custom\ndata structures, let's have a look at a self-contained example of a mixin class at work:\n\nThe FactorMult class takes in a number as a factor and the multiply method simply\nmultiplies an argument with the factor. The mixin class DisplayFactorMult provides an\nadditional method multiply_show that enhances the multiply method of the concrete class.\nMethod multiply_show prints the value of the factor, arguments and the result before\nreturning the result. 
Here, DisplayFactorMult is a mixin since it houses an abstract\nmethod multiply, a concrete method multiply_show and doesn't store any instance\nvariables.\n\nIf you really want to dive deeper into mixins and their real-life use cases, check out the\ncodebase of the [requests library]. It defines and employs many powerful mixin classes to\nbestow superpowers upon different concrete classes.\n\nBuilding powerful custom data structures with mixins\n\nYou've reached the hall of fame where I'll be building custom data structures using the\nmixin classes from the collections.abc module.\n\nVerbose tuple\n\nThis is a tuple-like data structure that acts exactly like the built-in tuple but with one\nexception. It'll print out the special methods underneath when you perform any operation\nwith it.\n\nTo build the VerboseTuple data structure, first, I've inherited the Sequence mixin class\nfrom the collections.abc module. The docstring mentions all the abstract and mixin methods\nprovided by the Sequence class. To build the new data structure, you'll have to implement\nall the abstract methods defined in the Sequence class and you'll get all the mixin\nmethods implemented automatically. Notice that the print statement above also reveals the\nabstract and the mixin methods.\n\nIn the following snippet I've used some of the functionalities offered by tuple and printed\nthem in a way that will reveal the special methods when they perform any action.\n\nThe printed statements reveal the corresponding special methods used internally when a\nparticular tuple operation occurs.\n\nVerbose list\n\nThis is a list-like data structure that acts exactly like the built-in list but with one\nexception. Like VerboseTuple, it'll also print out the special methods underneath when you\nperform any operation on or with it.\n\nIn the above segment, I've inherited the MutableSequence mixin class from the\ncollections.abc module. This ensures that the VerboseList object will be mutable. 
All\nthe abstract methods mentioned in the docstring have been implemented and the output print\nstatements reveal the structure of the custom data structure as well as all the abstract and\nmixin methods.\n\nIn the following snippet, I've used some of the functionalities offered by list and printed\nthem in a way that will reveal the special methods when they perform any action.\n\nVerbose frozen dict\n\nHere, VerboseFrozenDict is an immutable data structure that is similar to the built-in\ndictionaries. Like the previous structures, this also reveals the internal special methods\nwhile performing different operations.\n\nIn the above segment, I've inherited the Mapping mixin class from the collections.abc\nmodule. This ensures that the resulting mapping will be immutable. Just like before, all the\nabstract methods mentioned in the docstring have been implemented and the output print\nstatements reveal the structure of the custom data structure as well as all the abstract and mixin\nmethods.\n\nBelow, the printed output will reveal the special methods used internally when the\nVerboseFrozenDict objects perform any operation.\n\nVerbose dict\n\nThe VerboseDict data structure is the mutable version of VerboseFrozenDict. It supports\nall the operations of VerboseFrozenDict with some additional features like adding and\ndeleting key-value pairs, updating values corresponding to different keys etc.\n\nThe output statements reveal the structure of the VerboseDict class and the abstract and\nmixin methods associated with it. 
The following snippet will print the special methods used\ninternally by the custom data structure (also in the built-in one) while performing\ndifferent operations.\n\nGoing ballistic with custom data structures\n\nThis section discusses two advanced data structures that I mentioned at the beginning of the\npost.\n\n- BitSet : Mutable set-like data structure that doesn't perform hashing.\n- SQLAlchemyDict: Mutable dict-like data structure that can store key-value pairs in any\n SQLAlchemy supported relational database.\n\nBitSet\n\nThis mutable set-like data structure doesn't perform hashing to store data. It can store\nintegers in a fixed range. While storing integers, BitSet objects use less memory compared\nto built-in sets.\n\nHowever, since no hashing happens, it's slower to perform addition and retrieval compared to\nbuilt-in sets. The following code snippet was taken directly from [Raymond Hettinger's 2019\nPyCon Russia talk] on advanced data structures.\n\nLet's inspect the above data structure to understand exactly how much memory we can save.\nI'll digress a little here. Normally, you'd use sys.getsizeof to measure the memory\nfootprint of an object where the function reveals the size in bytes.\n\nBut there's a problem. The function sys.getsizeof only reveals the size of the target\nobject, excluding the objects the target object might be referring to. To understand what I\nmean, consider the following situation:\n\nSuppose you have a nested list that looks like this:\n\nWhen you apply the sys.getsizeof function on the list, it shows 96 bytes. This means only the\noutermost list consumes 96 bytes of memory. Here, sys.getsizeof doesn't include the size\nof the nested lists.\n\nThe same is true for other data structures. In case of nested dictionaries, sys.getsizeof\nwill not include the size of nested data structures. It'll only reveal the size of the\noutermost dictionary object. 
The following snippet will traverse through the reference tree\nof a nested object and reveal the _true_ size of it.\n\nLet's use deep_getsizeof to inspect the size differences between built-in set and\nBitSet objects.\n\nThe output of the print statements reveals that the BitSet object uses less than half the\nmemory compared to its built-in counterpart!\n\nSQLAlchemyDict\n\nHere goes the second type of custom data structure that I mentioned in the introduction.\nIt's also a mutable dict-like structure that can automatically store key-value pairs to any\nSQLAlchemy supported relational database when initialized.\n\nI was inspired to write this one from the same Raymond Hettinger talk that I mentioned\nbefore. For demonstration purposes, I've chosen a SQLite database to store the key-value\npairs.\n\nThis structure gives you immense power since you can abstract away the entire process of\ndatabase communication inside the custom object. You'll perform get-set-delete operations\non the object just like you'd do so with built-in dictionary objects and the custom object\nwill take care of storing and updating the data to the target database.\n\nBefore running the code snippet below, you'll need to install SQLAlchemy as an external\ndependency.\n\nRunning the above code snippet will create a SQLite database named foo.db in your current\nworking directory. You can inspect the database with any database viewer and find your\nkey-value pairs there. Everything else is the same as a built-in dictionary object.\n\nFurther reading\n\n- [Implementing an interface in Python - Real Python]\n- [What is a mixin, and why are they useful? 
- Stackoverflow]\n- [Mixins for fun and profit - Dan Hillard]\n\n\n\n\n[python decorators]:\n /python/decorators\n\n[metaclass]:\n /python/metaclasses\n\n[werkzeug]:\n https://werkzeug.palletsprojects.com/en/latest/\n\n[flask]:\n https://flask.palletsprojects.com/\n\n[requests library]:\n https://github.com/psf/requests/blob/8149e9fe54c36951290f198e90d83c8a0498289c/requests/models.py#L60\n\n[raymond hettinger's 2019 pyCon russia talk]:\n https://www.youtube.com/watch?v=S_ipdVNSFlo\n\n[implementing an interface in python - real python]:\n https://realpython.com/python-interface/\n\n[what is a mixin, and why are they useful? - stackoverflow]:\n https://stackoverflow.com/questions/533631/what-is-a-mixin-and-why-are-they-useful\n\n[mixins for fun and profit - dan hillard]:\n https://easyaspython.com/mixins-for-fun-and-profit-cb9962760556\n\n[image_1]:\n https://blob.rednafi.com/static/images/mixins/img_1.png",
"title": "Interfaces, mixins and building powerful custom data structures in Python"
}