{
  "$type": "site.standard.document",
  "canonicalUrl": "https://rednafi.com/python/config-management-with-pydantic/",
  "description": "Build a scalable Python configuration system with Pydantic and dotenv. Manage dev, staging, and production configs with type validation.",
  "path": "/python/config-management-with-pydantic/",
  "publishedAt": "2020-07-13T00:00:00.000Z",
  "site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
  "tags": [
    "Python"
  ],
  "textContent": "Managing configurations in your Python applications isn't something you think about\noften, until complexity starts to seep in and forces you to re-architect your initial\napproach. Ideally, your config management flow shouldn't change across different\napplications or as your application begins to grow in size and complexity.\n\nEven if you're writing a library, there should be a consistent config management process\nthat scales up properly. Since I primarily spend my time writing data-analytics and\ndata-science applications and exposing them with the [Flask] or [FastAPI] framework, I'll be\ntackling config management from an application development perspective.\n\nFew ineffective approaches\n\nIn the past, while exposing APIs with Flask, I used the .env, .flaskenv and Config\nclass approach to manage configs, which is pretty much the standard in the Flask realm.\nHowever, it quickly became cumbersome to maintain and juggle configs across the\ndevelopment, staging and production environments.\n\nThere were additional application-specific global constants to deal with too. So I tried\n.json, .yaml and .toml based config management approaches, but those too\nquickly turned into a tangled mess. I was constantly accessing variables buried 3-4\nlevels deep in nested TOML data structures and it wasn't pretty.\n\nThen there are config management libraries like [Dynaconf] or [environ-config] that aim to\nameliorate the issue. While these are all amazing tools, they also introduce their own\ncustom workflows that can feel over-engineered when it comes to maintenance and extension.\n\nA pragmatic wishlist\n\nI wanted to take a declarative approach while designing a config management pipeline\nthat'll be modular, scalable and easy to maintain. 
To meet my requirements, the\nsystem should be able to:\n\n- Read configs from .env files and _shell environment_ at the same time.\n- Handle dependency injection for introducing _passwords_ or _secrets_.\n- Convert variable types automatically in the appropriate cases, e.g. string to integer\n  conversion.\n- Keep _development_, _staging_ and _production_ configs separate.\n- Switch between the different environments, e.g. development and staging, effortlessly.\n- Inspect the _active_ config values.\n- Create an arbitrarily nested config structure if required (not encouraged, though).\n\nBuilding the config management pipeline\n\nPreparation\n\nThe code block that appears in this section is self-contained. It should run without any\nmodifications. If you want to play along, then just spin up a Python virtual environment and\ninstall Pydantic and python-dotenv. The following commands work on any _*nix_-based\nsystem:\n\nMake sure you have a fairly recent version of Python 3 installed, preferably\nPython 3.10+. You might need to install python3.10 venv.\n\nIntroduction to Pydantic\n\nTo check off all the boxes of the wishlist above, I made a custom config management flow\nusing [pydantic], [python-dotenv] and the .env file. Pydantic is a fantastic data\nvalidation library that can be used for validating and implicitly converting data types\nusing Python's type hints. Type hinting is a formal solution to statically indicate the type\nof a value within your Python code. It was specified in [PEP-484] and introduced in Python\n3.5. Let's define and validate the attributes of a class named User:\n\nThis will give you:\n\nIn the above example, I defined a simple class named User and used Pydantic for data\nvalidation. Pydantic will make sure that the data you assign to the class attributes conforms\nto the types you've annotated. Notice how I've assigned a string value to the\npassword field and Pydantic converted it to an integer without complaining. 
That's\nbecause the corresponding type annotation suggests that the password attribute of the\nUser class should be an integer. When implicit conversion is not possible or the hinted\nvalue of an attribute doesn't conform to its assigned type, Pydantic will throw a\nValidationError.\n\nThe orchestration\n\nNow let's see how you can orchestrate your config management flow with the tools mentioned\nabove. For simplicity, let's say you have 3 sets of configurations:\n\n1. Configs of your app's internal logic\n2. Development environment configs\n3. Production environment configs\n\nIn this case, other than the first set of configs, all should go into the .env file.\n\nI'll be using this .env file for demonstration. If you're following along, then go ahead,\ncreate an empty .env file and copy the variables mentioned below:\n\nNotice how I've used the DEV_ and PROD_ prefixes before the environment-specific\nconfigs. These help you discern between the variables designated for different environments.\n\n> Configs related to your application's internal logic should either be explicitly mentioned\n> in the same configs.py or imported from a different app_configs.py file. You shouldn't\n> pollute your .env files with the internal global variables necessitated by your\n> application's core logic.\n\nNow let's dump the entire config orchestration and go through the building blocks one by one:\n\nThe print statement on the last line of the above code block is there to inspect the _active\nconfiguration_ class. You'll soon learn what I mean by the term _active configuration_. You\ncan comment out the last line while using the code in production. Let's explain what's going\non with each of the classes defined above.\n\nAppConfig\n\nThe AppConfig class defines the config variables required for your API's internal logic. In\nthis case, I'm not loading the variables from the .env file, rather defining them directly\nin the class. 
You can also define and import them from another app_configs.py file if\nnecessary but they shouldn't be placed in the .env file. For data validation to work, you\nhave to inherit from Pydantic's BaseModel and annotate the attributes using type hints\nwhile constructing the AppConfig class. Later, this class is called from the\nGlobalConfig class to build a nested data structure.\n\nGlobalConfig\n\nGlobalConfig defines the variables that propagate through the other environment classes, and\nthe attributes of this class are globally accessible from all other environments. In this\nclass, the variables are loaded from the .env file. In the .env file, global variables\ndon't have any environment-specific prefixes like DEV_ or PROD_ before them. The class\nGlobalConfig inherits from Pydantic's BaseSettings, which helps to load and read the\nvariables from the .env file. The .env file itself is loaded in the nested Config\nclass. Although the environment variables are loaded from the .env file, Pydantic also\nloads your actual shell environment variables at the same time. From the Pydantic [Settings\nmanagement documentation]:\n\n> Even when using a dotenv file, Pydantic will still read environment variables as well as\n> the dotenv file, environment variables will always take priority over values loaded from\n> a dotenv file.\n\nThis means you can keep the sensitive variables in your .bashrc or .zshrc and Pydantic\nwill inject them during runtime. It's a powerful feature, as it implies that you can easily\nkeep the non-sensitive variables in your .env file and commit it to the version control\nsystem. Meanwhile, the sensitive information should be injected as a shell environment\nvariable. For example, although I've defined an attribute called REDIS_PASS in the\nGlobalConfig class, there is no mention of any REDIS_PASS variable in the .env file.\nSo normally, it returns None but you can easily inject a _password_ into the REDIS_PASS\nvariable from the shell. 
Assuming that you've set up your venv and installed the\ndependencies, you can test it by copying the contents of the above code snippet into a file\ncalled configs.py and running the commands below:\n\nThis should print out:\n\nNotice how your injected REDIS_PASS has appeared in the printed config class instance.\nAlthough I injected DEV_REDIS_PASS into the shell environment, it appeared as\nREDIS_PASS inside the DevConfig instance. This is convenient because you won't need to\nchange the name of the variables in your codebase when you change the environment. To\nunderstand why it printed an instance of the DevConfig class, refer to the\nFactoryConfig section.\n\nDevConfig\n\nThe DevConfig class inherits from the GlobalConfig class and it can define additional\nvariables specific to the development environment. It inherits all the variables defined in\nthe GlobalConfig class. In this case, the DevConfig class doesn't define any new\nvariable.\n\nThe nested Config class inside DevConfig defines an attribute env_prefix and assigns\nthe DEV_ prefix to it. This helps Pydantic to read your prefixed variables like\nDEV_REDIS_HOST, DEV_REDIS_PORT etc. without you having to explicitly mention them.\n\nProdConfig\n\nThe ProdConfig class also inherits from the GlobalConfig class and it can define additional\nvariables specific to the production environment. It inherits all the variables defined in\nthe GlobalConfig class. In this case, like DevConfig, this class doesn't define any new\nvariable.\n\nThe nested Config class inside ProdConfig defines an attribute env_prefix and assigns\nthe PROD_ prefix to it. This helps Pydantic to read your prefixed variables like\nPROD_REDIS_HOST, PROD_REDIS_PORT etc. without you having to explicitly mention them.\n\nFactoryConfig\n\nFactoryConfig is the controller class that dictates which config class should be activated\nbased on the environment state defined as ENV_STATE in the .env file. 
If it finds\nENV_STATE=\"dev\" then the control flow statements in the FactoryConfig class will\nactivate the development configs _(DevConfig)_. Similarly, if ENV_STATE=\"prod\" is found\nthen the control flow will activate the production configs _(ProdConfig)_. Since the current\nenvironment state is ENV_STATE=\"dev\", when you run the code, it prints an instance of the\nactivated DevConfig class. This way, you can assign different values to the same variable\nbased on different _environment contexts_.\n\nYou can also dynamically change the environment by changing the value of ENV_STATE in your\nshell. Run:\n\nThis time the config instance should change and print the following:\n\nAccessing the configs\n\nUsing the config variables is easy. Suppose you want to use the variables in a file called\napp.py. You can easily do so as shown in the following code block:\n\nThis should print out:\n\nExtending the pipeline\n\nThe modular design demonstrated above is easy to maintain and extend in my opinion.\nPreviously, for simplicity, I've defined only two environment scopes: development and\nproduction. Let's say you want to add the configs for your _staging environment_.\n\n- First you'll need to add those _staging_ variables to the .env file.\n\n- Then you have to create a class named StageConfig that inherits from the GlobalConfig\n  class. The architecture of the class is similar to that of the DevConfig or ProdConfig\n  class.\n\n- Finally, you'll need to add the ENV_STATE logic for staging to the control flow of the\n  FactoryConfig class. See how I've appended another if-else block to the previous (prod)\n  block.\n\nTo see your new addition in action just change the ENV_STATE to \"stage\" in the .env file\nor export it to your shell environment.\n\nThis will print out an instance of the class StageConfig.\n\nRemarks\n\nThe above workflow works perfectly for my usage scenario. So subjectively, I feel like it's\nan elegant solution to a very icky problem. 
Your mileage will definitely vary.\n\nFurther reading\n\n- [Settings management with Pydantic]\n- [Flask config management]\n\n\n\n\n[flask]:\n    https://github.com/pallets/flask\n\n[fastapi]:\n    https://github.com/tiangolo/fastapi\n\n[dynaconf]:\n    https://github.com/rochacbruno/dynaconf\n\n[environ-config]:\n    https://github.com/hynek/environ-config\n\n[pydantic]:\n    https://github.com/samuelcolvin/pydantic\n\n[python-dotenv]:\n    https://github.com/theskumar/python-dotenv\n\n[pep-484]:\n    https://www.python.org/dev/peps/pep-0484/\n\n[settings management documentation]:\n    https://pydantic-docs.helpmanual.io/usage/settings/\n\n[settings management with pydantic]:\n    https://pydantic-docs.helpmanual.io/usage/settings/\n\n[flask config management]:\n    https://flask.palletsprojects.com/en/1.1.x/config/",
  "title": "Pedantic configuration management with Pydantic"
}