{
  "$type": "site.standard.document",
  "canonicalUrl": "https://justingarrison.com/blog/2023-08-29-terraform-gitops-system-initiative",
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreih7a7e7fu42znlluvx5uexwcm4ndyf6oakojhvvme4ujldtj42zj4"
    },
    "mimeType": "image/png",
    "size": 40109
  },
  "description": "What are key differences between infrastructure management options in 2023",
  "path": "/blog/2023-08-29-terraform-gitops-system-initiative",
  "publishedAt": "2023-08-30T05:01:58.000Z",
  "site": "at://did:plc:p7uix7mresfq4nfzxp3klgfa/site.standard.publication/3mmdn7mg2qm2d",
  "textContent": "Infrastructure has come a long way in the past 5 years, and yet infrastructure as code has been stuck with the same promises and tools for much longer.\nThe advent of infrastructure APIs was revolutionary and shifted the status quo to writing code to create and maintain infrastructure instead of the manual processes that were widely used just a few years before.\n\nThe democratization of containers made the industry forget about measuring sysadmin skills in server uptime and instead move to how quickly and frequently you replaced your infrastructure.\nWe took declarative configuration management and traded it for user-data bash scripts and pre-baked AMIs.\n\nServers and configuration management are still thriving, but there's confusion around what infrastructure automation tools you should be using.\nEach tool comes with different strengths and weaknesses depending on what you're trying to accomplish.\nSo let's look at some of the main contenders and determine the appropriate use for each.\n\nTerraform\n\nTerraform is one of the most commonly used infrastructure management tools.\nIt has a broad ecosystem of modules—libraries used to provision infrastructure—and a well-known, portable workflow with plan and apply commands.\n\nTerraform uses a domain-specific language (DSL) called Hashicorp Configuration Language (HCL), which is used across the suite of Hashicorp products.\nHCL mostly looks and feels like human-friendly JSON when you get started, but after you use it for a while you realize just how different it is.\n\nHCL has loops, conditional statements, and much more that separates it from the humble \"configuration\" representation.\nThe language has grown in complexity to cover a variety of use cases in an attempt to keep the user experience simpler with regard to the tools that consume it.\n\nGetting started with Terraform is a great experience.\nHowever, day two activities are difficult because as infrastructure complexity grows, you end up needing to wrap Terraform with tools like Terragrunt to help keep your code DRY.\nBut wrappers are another tool to keep updated, and as Terraform, HCL, and infrastructure evolves, you have to create a matrix of supported versions and modules just to ensure you're able to keep using Terraform.\n\nTemplating infrastructure code isn't enough.\nWhen you have mission-critical production systems, you need more than a plan.\nYou need real tests and more accurate state representation.\n\nThe state of Terraform state management is a thorn you'll have to deal with eventually.\nAs common as it is to throw the state in s3, you'll soon discover the limits of using a JSON file as a database.\nIt could be one of a dozen things, but whatever it is, it will stop all deployments until someone goes spelunking through the state file to fix it.\nIf you've never had to edit a Terraform state file and hope your next apply works during an outage, count your blessings.\n\nTerraform also lacks functionality, such as built in policy enforcement.\nIf you're a HashiCorp enterprise customer, you may pay for access to Sentinel.\nIt integrates into the HashiCorp products, but it's clearly an afterthought.\nWriting policy in HCL might feel familiar for seasoned Terraform developers, but it lacks the expressiveness of a policy-specific language like Rego or a general purpose programming language.\n\nTerraform is a wonderful Infrastructure-as-Code tool, until it's not.\nUltimately, the complexity of infrastructure will outgrow HCL, or the proliferation of state files will cause headaches as outputs and terraform_remote_state data retrieval make you wonder why you can't use a real database.\n\nTerraform is a single-player tool in a multiplayer world.\nIt's great for infrastructure management when teams are small and complexity is under control.\nBut once you try to hide complexity in wrappers, abstractions, and deploy pipelines, it's time to look for other options.\n\nAWS CloudFormation isn't much better.\nWhile it does have automatic state management, it's not worth the inferior workflow and slower provisioning performance.\nIf you manage any infrastructure outside AWS, the CloudFormation extensions are severely lacking.\n\nCloudFormation hides some of the bad syntax authoring with the Cloud Development Kit (CDK), but generating declarative configuration from a general-purpose programming language is not going to solve the complications of StackSets or multi-account, multi-region management.\n\nCloudFormation is wonderful for templates and variables.\nIf you need 100 independent things that look similar and don't want to think about state files or plan runs, CloudFormation is a great option.\n\nGitOps\n\nGitOps tries to eliminate one of the biggest pain points of Terraform by reducing resource drift.\nThe longer it has been since something was deployed, the more likely what exists in production is not what Terraform thinks exists.\n\nInstead of only updating state when infrastructure is created, GitOps takes the Kubernetes controller approach and does continuous reconciliation.\nAs the GitOps controller reconciles infrastructure resources with the desired resource declaration, it syncs state back to the Git repository, which now becomes the \"source of truth.\"\n\nI call this method of infrastructure management \"infrastructure as software.\"\nBut it shouldn't be used for everything and has many shortcomings of its own.\nKubernetes is designed for controllers to work with continuous polling and watching in mind.\nTry to scale that approach on any cloud provider's APIs, and you'll learn all about throttling and limits.\n\nThe first question with GitOps is, What should be synced to the Git repo?\nShould pod replica counts be synced back to the Git repo?\nIf you run any sort of automatic node or pod scaler, then the answer is certainly no.\nYou'd end up with hundreds or thousands of Git commits coming from the controller, causing endless rebasing and conflicts.\nIf not everything is synced back to Git, how much truth is there?\n\nAnother problem is GitOps is keenly focused on Kubernetes clusters and resources.\nWhile Kubernetes can create external resources such as load balancers and storage, the experience of provisioning generic infrastructure through tools like Crossplane is far from elegant.\nIt may be familiar for a Kubernetes administrator, but extending Kubernetes to provision infrastructure in multiple regions, accounts, or clouds will cause cluster proliferation and permissive permissions that no one cares to think about.\n\nGitOps works for Kubernetes, but outside of that ecosystem, you end up writing HCL to manage everything else.\nYou also have no consistency with outputs to share resource state for in-cluster vs. out-of-cluster infrastructure.\nYou turn everything into a Kubernetes problem, and everyone who needs infrastructure needs to be a Kubernetes user.\n\nGitOps is fundamentally a single-user experience.\nYou can separate your users into different clusters or subfolders, and each team can choose their own YAML generation adventure, but is a Git repo better than a JSON file?\nWhy is infrastructure stuck using tools we'd never use for basic applications with similar complexity and criticality?\n\nYou should be using a fully featured database with compatibility for all of your infrastructure.\nYou can't require Kubernetes to be available to create the rest of your infrastructure.\nYou will end up with two management systems: the one that creates bootstrap Kubernetes clusters and the one that deploys from Kubernetes.\n\nSystem Initiative\n\nSystem Initiative (SI) is a new take on infrastructure management.\nDon't be fooled by the screenshots—this isn't your typical diagram-to-infrastructure tool from the past.\nThe vast majority of those old tools export HCL or CloudFormation and leave all the messy maintenance stuff to you.\nThey might track some basic deployed state of your infrastructure or allow partial imports from a running system, but they all fundamentally miss the mark in solving ongoing infrastructure management problems.\n\nSI not only provides an understandable user interface for your resources—called assets—but it also stores a representation of those assets in a database.\nNot a JSON file nor a Git repo, it is a proper, up-to-date database that can track dependencies and allow safe experimentation across all of your infrastructure.\n\nTerraform plan can be great for an individual's workflow, but making too many changes at once is not the feedback loop it was at the small scale.\nAttempting to explain a large change to a coworker asynchronously via a pull request is not going to end with a thorough PR review.\n\n!Cartoon of stick figures playing while terraform plan runs\n\nSystem Initiative makes multiplayer infrastructure management safer and part of its core functionality.\nIf you need to make a change, you create a change set and send a link to a coworker.\nIt brings pair programming together with visual infrastructure management.\n\nBecause SI assets are built with typescript, if you need to extend an asset definition or change validation, it's right there.\nNeed to add a policy?\nYou don't need to switch tools, contexts, or even your browser tab.\nIt can be done while you create your change set.\nA typical workflow for updating a core Terraform module or Kubernetes CRD has many more layers and synchronized pull requests just to get back to where you started.\n\nBy using modern application architecture and technologies, SI opens the door to more people owning their infrastructure's fate.\nIt doesn't make application developers learn HCL or Kubernetes, and it improves the UX to not be a third-rate \"DevOps\" job.\n\nSystem Initiative has the potential to bring application and infrastructure developers to the same table, speaking the same language and collaborating.\nThis is what DevOps was meant to be—not automation tools for ops and platforms for devs, but fast, safe, and reliable infrastructure and applications for everyone.\n\nThe biggest drawback I see with SI right now is it's not ready.\nThe e",
  "title": "Terraform vs. GitOps vs. System Initiative"
}