Raw Record Source

{
  "$type": "site.standard.document",
  "canonicalUrl": "https://justingarrison.com/blog/2023-06-29-patterns-vs-platforms",
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreicx4bgey2n23i7tyewb7hxrerdts7srmkyfes7s6rfcpnvn7ylyiq"
    },
    "mimeType": "image/png",
    "size": 303813
  },
  "description": "You don't have to build a platform.",
  "path": "/blog/2023-06-29-patterns-vs-platforms",
  "publishedAt": "2023-06-30T04:17:25.000Z",
  "site": "at://did:plc:p7uix7mresfq4nfzxp3klgfa/site.standard.publication/3mmdn7mg2qm2d",
  "textContent": "The platform engineering hype wave is almost over—thank you AI, but too many companies are still full steam ahead building one.\nWhy do companies think they need a platform, and what's another option?\n\nWhat are platforms?\n\nCompanies believe the way to make more money is to ship code faster.\nThis means they need to either hire more developers—too expensive—or reduce the time it takes for code to be put into production.\n\nThey tried that DevOps thing and now CI/CD has been automated and they've achieved DevOps, and it's not good enough.\nSo they have to attack the problem from the other side of the Software Development Life Cycle (SDLC) and that means they need to build a platform.\n\nThey don't know what they want, but it has to be a platform.\nIn 2023 this means they usually want something that's declarative, containerized, and Kubernetesed (the past tense verb of Kubernetes).\n\nThe old platform they built in 2018 was garbage.\nThey built it themselves with RPM/DEB/JAR files and it had something to do with immutable.\n\nIt was all custom, never fully documented, and never achieved 50% adoption.\nBut this time it'll be different.\nThis time we have Kubernetes!\n\nThe main goal of platforms—then and now—is to reduce developer overhead.\nTo let them focus on writing code and not all the messy details about why and where that code runs.\nIt should hide the complexity to let SRE worry about availability.\n\nEven though we would never claim to \"throw code over the wall\" anymore, we've somehow recreated the silos from yesteryear.\n\nA \"platform\" is the production environment where stuff runs.\nIt's the API that CI/CD calls.\nIt's the tooling, monitoring, and observability that are combined together that allow things to work.\n\nAnd it's built by a team of platform engineers.\n\nThe platform team is like the database administer (DBA) team.\nIt's centralized to keep costs low with all the platform expertise in one place.\n\nCentralization is the 
easiest way to manage this type of function for most organizations because they can reduce the overhead to a handful of people, they can manage them with a single manager, and they can reduce the inputs and outputs of the team to a single queue.\nAnd just like the DBA team, they will be overrun with requests, they will have an endless backlog, and they'll never get ahead.\n\nUnlike the DBA team, the platform team will fade away after the next migration.\n\nWhat came before platforms?\n\nBefore we had platforms we had templates.\nAnd by \"templates\" I mean there was an internal repo with a Jenkinsfile, and every time you created a new app you would copy that Jenkinsfile into your new repo and modify it.\n\nThat was if you were lucky enough to have a version of Jenkins that used Jenkinsfiles.\nMost of the time you had to open the Jenkins web interface and copy a pipeline.\nFully non-declarative and no way to track changes.\n\nTemplates were just working patterns.\nSomeone got it working and everyone else started from that.\n\nIn our custom environment, with our requirements and limitations, there were only a handful of known-good patterns.\nEverything else had to start there.\n\nThen, someone made a CLI that automatically spit out the Jenkinsfile you wanted with something like\n\nLater the init command included other template files for things like a README, a .gitignore, and other boilerplate needed to get started.\nIt was probably written in bash or node because why not.\n\nAs more groups started to experiment with more types of services written in different languages the tool was re-written to accommodate more flags.\nThis was going to become the ultimate CLI, but the person writing it left, got bored, or got inundated with new features and it was never finished.\n\nAnd it was never documented.\n\nAnd then the company had to re-platform.\n\nIf there's been one thing that has been constant in my 20+ years in tech, it's that we're always in a state of migration.\nIt 
never stops.\n\nSo why will it stick this time?\n\nThe Kubernetes platform\n\nBecause Kubernetes is extensible it can be used for anything.\nSo we try to use the \"anything API\" for everything.\n\nThe building blocks are lighter weight because the patterns already exist.\n\nI can create brand new resources with YAML.\nI can make those resources do things with a reasonably small amount of code (dozens or hundreds of lines).\nI can even make those resources do things to external APIs with open source controllers like Crossplane.\n\nThen with another few dozen lines of code I can have a custom CLI that used to take 6+ months to build.\n\nThis ease of extending gives us the idea that this time the platform will stick.\nThis time we won't have to migrate.\nWe can keep extending the platform in new ways and it'll always scale to our needs.\n\nThe core Kubernetes resources are good enough and the platform bits we have to build are minimal overhead.\n\nBut the reality sinks in that you could end up with dozens or hundreds of Kubernetes clusters.\nYou won't have time to extend Kubernetes because you'll be upgrading clusters and plugins until the end of time.\n\nYou'll spend so much time checking compatibility of components that you'll build more tooling to manage Kubernetes clusters than to extend them.\nThis is why GitOps and Cluster API were created.\nAnd the app teams will hate you.\n\nYou will either become a bottleneck for functionality they can't have, or you'll be nagging them to migrate to a new cluster, API version, or some other supporting format.\n\nOther options\n\nIt's very refreshing in the serverless/lambda world that there are no platform teams.\nLambda functions and infrastructure are so closely intertwined that you cannot separate them.\n\nThey deploy with the same CLI from the same repo.\nApplication code and infrastructure code co-exist and it makes perfect sense why the AWS Cloud Development Kit is so popular with these developers.\n\nIn the Kubernetes world we 
have Pulumi, which is great, but because we have a platform team we tend to separate our application code from our infrastructure code.\nWhen both are managed in the same repo, by the same team, with the same deployment tool the platform fades away.\n\nTools build barriers and until platforms use the same tools as the applications there will always be silos.\n\nWith Lambda there's the Serverless Framework or AWS SAM and for AWS container services there's AWS Copilot, but I've never found a tool as flexible or complete for Kubernetes.\n\nIf Kubernetes is my platform API then all infrastructure I need to provision has to have some CRD representation in the cluster.\nYou can go the Terraform route and define all internal and external resources, but getting application developers to write or care about Terraform is not the solution either.\n\nEven if you can declare all of the infrastructure you're still going to be missing external dependencies like PagerDuty schedules, Datadog alerts, and any other resource you depend on.\n\nThese tools are not a platform.\nIn many ways we've tried them in the past and were unsuccessful.\nFor environments that treat applications and infrastructure the same they were wildly successful.\n\nCLI wrappers for application and infrastructure deployments are the sweet spot between patterns and platforms.\nYour platform team only has time to manage Kubernetes so let them vend minimal Kubernetes—just like the DBA team does for databases.\n\nIf you don't have a Kubernetes-based platform then you don't need a team to manage it for everyone.\nApplication teams are probably capable of managing it for their needs if they deem it necessary.\n\nWhat needs to be platformed?\n\nThe only things that really need to be centralized are the things you need to audit.\n\nSecurity and availability should be your only blockers.\n\nWhen the next CVE comes out you need to have a way to report on which applications are vulnerable.\nWhen the site goes down you have to 
make sure the problem doesn't repeat itself the same way.\n\nA security platform has multiple layers, but as long as common patterns are followed it can probably stay as a set of auditing tools.\nWhen new technologies are introduced teams can consult with security, and what they learn can be added to the patterns for other applications to use.\n\nReliability needs to be well defined based on business needs and should be visible to everyone.\nIf only my team ever sees my reliability metrics then dependent teams will have a very difficult job knowing what their SLOs should be.\n\nDon't build a platform because you think you need one.\nIf you can, let applications manage infrastructure with the same tooling as applications.\n\nIf your applications can use Kubernetes basics—compute, load balancer, disk—then by all means use Kubernetes.\nBut avoid making it the everything API just because you can.\n\nIt's a lot of fun, until you have to upgrade it and migrate all applications 3 times a year.",
  "title": "Patterns vs Platforms"
}