Self-hosting your PDS with Podman
Bluesky is great (as evidenced by my multiple posts on it), but I wanted to own my data rather than have it live on Bluesky's servers. I could have just used the standard deployment shown in the PDS repo, but I wanted something that better fits my deployment practices.
Why
A few years ago, I moved my webserver (which runs this site, my PDS, and lots of other services) to CentOS Stream back when that was still viable, then to Rocky Linux as the replacement when Stream sort of died off. Is it perfect? Is it the right way to do things? No, but it's what I prefer at this point. [^1]
The Problem: part one
The default Bluesky PDS installer is intended for Ubuntu 20.04/22.04. It will fail to deploy on anything other than those two releases (even on 24.04, where the PDS itself also works, but that's neither here nor there).
Solutioning a problem
While I was testing my PDS deployment, I used an Ubuntu server where the installer ran successfully. I already use Caddy, though, so a compose file that ships its own dockerized Caddy was not ideal. Cleaning that up was pretty easy, and the installer run also gave me a proper pds.env file, which matters because there's no canonical source for that file outside of the installer. Now I had a relatively standard PDS deployment using Docker, but fronted by the same Caddy service I use for everything else on my server.
Best Practices?
I have opinions on the way Bluesky PBLLC has written the systemd unit for the PDS on GitHub. It deploys as a oneshot, so journalctl is not super useful, systemctl will never show it as running, and it is generally hard to make it a dependency or a dependent of any other service.
Fixing this took me down a rabbit hole on how to do it "better".[^2]
Enter ~~Sandman~~ Podman
I had been watching my coworkers explore Podman a bit at work, and when I looked into it, it had the perfect thing for my needs: the ability to "systemd-ize" a container workload in a way that makes sense for both the container and systemd. I also took this opportunity to set up Litestream for backups.
Now that everything's mostly stable, I can finally document all my configuration for anyone else to use.
Under the hood it's using Quadlets, so all of the unit files we'll create live in /etc/containers/systemd.
First we create a pod to contain everything and publish the port:
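A minimal sketch of what such a pod unit could look like, assuming Podman 5.0 or newer (which added `.pod` Quadlets) and a PDS listening on its default port 3000. The file name, pod name, and loopback binding are illustrative assumptions rather than the exact configuration used on this server:

```ini
# /etc/containers/systemd/pds.pod (hypothetical file name)
[Pod]
# Name of the pod as Podman sees it (assumed)
PodName=pds
# Publish the PDS port on loopback only, so the host's Caddy is the
# only way in (binding and port are assumptions)
PublishPort=127.0.0.1:3000:3000
```

Quadlet turns a file named `pds.pod` into a `pds-pod.service` unit. Ports are published on the pod rather than on the individual containers inside it.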
Then we slot in a volume:
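As a sketch, a volume unit can be as small as the following. The file name and volume name are assumptions chosen to match a pod called `pds`:

```ini
# /etc/containers/systemd/pds.volume (hypothetical file name)
[Volume]
# Without an explicit name, Quadlet would call the volume systemd-pds
VolumeName=pds
```

Quadlet generates a `pds-volume.service` unit from this file, which creates the named volume if it does not already exist.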
Now that we've got everything organized, we can bring in the PDS container itself:
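A hedged sketch of a container unit along these lines follows. It assumes sibling Quadlet files named `pds.pod` and `pds.volume`, a host Caddy unit called `caddy.service`, an env file at `/etc/pds/pds.env` that keeps the default `/pds` data directory, and the image tag from the upstream compose file. All of those names and paths are illustrative:

```ini
# /etc/containers/systemd/pds.container (hypothetical file name)
[Unit]
Description=Bluesky PDS
# Order this unit ahead of the host's Caddy (unit name assumed)
Before=caddy.service

[Container]
# Fully qualified image name, which AutoUpdate=registry requires (tag assumed)
Image=ghcr.io/bluesky-social/pds:0.4
# Join the pod and mount the named volume from the sibling Quadlet files
Pod=pds.pod
Volume=pds.volume:/pds
# Settings recovered from the installer run (path assumed)
EnvironmentFile=/etc/pds/pds.env
# Let podman-auto-update pull newer images from the registry
AutoUpdate=registry

[Service]
Restart=always

[Install]
WantedBy=multi-user.target
```

After a `systemctl daemon-reload`, this shows up as `pds.service`. `AutoUpdate=registry` only takes effect if `podman-auto-update.timer` is enabled (or `podman auto-update` is run some other way), and `Before=` only sets ordering; it does not start Caddy by itself.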
I order this container to start before my Caddy service. Starting it also brings up all the associated resources (the pod and the volume), since Quadlet wires those in as dependencies of the container unit.
We then make the appropriate changes to the Caddyfile:
Now I get automatic updates for my PDS without needing the watchtower container, along with logs that feed right into systemd. Additionally, my PDS storage ends up abstracted away in a Podman volume, rather than being scattered in a random non-standard place on my VPS. In short, it does basically exactly what I want it to do, just the way I want it, with some learnings along the way.
[^1]: I've actually moved this server back over to an Ubuntu 24.04 deployment, due to RHEL-likes running a bit slow for me when it comes to package updates -- just kidding, it's back to using CentOS Stream now
[^2]: the oneshot method is a totally valid way of doing it to make it "easy", but it had too many quirks / compromises for my liking