{
  "$type": "site.standard.document",
  "canonicalUrl": "https://c0nr.ad/posts/building-a-pds-on-kubernetes-from-scratch",
  "description": "A field-tested guide to building a k3s, Argo CD, Traefik, cert-manager, SealedSecrets, Longhorn, and Cloudflare-backed AT Protocol PDS deployment.",
  "path": "/posts/building-a-pds-on-kubernetes-from-scratch",
  "publishedAt": "2026-05-22T00:00:00.000Z",
  "site": "at://did:plc:3u3pfnao6uue6uh3x65bpo7l/site.standard.publication/3miinpqssxs2k",
  "tags": [
    "kubernetes",
    "atproto",
    "bluesky",
    "self-hosting"
  ],
  "textContent": "This guide starts with empty Git repositories and ends with a production-minded AT Protocol PDS at pds.dr0p.info. The target deployment shape is k3s, Argo CD, Traefik, cert-manager, SealedSecrets, Cloudflare DNS, Longhorn storage, Cloudflare R2 blob storage and backups, and GitOps-managed manifests.\n\nThis is my pds. There are many like it, but this one is mine. My pds is my...\nyou get it. The path is intentionally narrow. Alternatives are called out only where they help adapt the guide without changing the main design.\n\nThis is not the shortest way to run a PDS. The official Docker installer is much shorter. This guide is for the case where you already run Kubernetes, want a GitOps-shaped deployment, and care about understanding the storage, ingress, DNS, TLS, backup, and identity pieces.\n\nBy the end, these checks should pass:\n\n- DNS: pds.dr0p.info and .dr0p.info resolve to the public edge node.\n- TLS: cert-manager has issued pds-tls for .dr0p.info.\n- PDS health: https://pds.dr0p.info/xrpc/_health returns the running PDS version.\n- Server description: describeServer reports .dr0p.info as an available user domain and invite codes as required.\n- Account creation: goat can create the first account on the PDS.\n- Handle resolution: https://conrad.dr0p.info/.well-known/atproto-did returns the account DID.\n- Public identity: the Bluesky public API resolves conrad.dr0p.info to the same DID.\n- Firehose: com.atproto.sync.subscribeRepos accepts a WebSocket connection.\n- Backups: Litestream logs show successful SQLite replication, blobs are written directly to R2, and at least one SQLite restore test has been run.\n\nBefore copying commands, choose your values.\n\n| Value | Example in this guide | Used for |\n| --- | --- | --- |\n| Base domain | dr0p.info | DNS zone, handle suffix, ACME DNS-01 |\n| PDS hostname | pds.dr0p.info | PDS API, Bluesky custom hosting provider |\n| Handle suffix | .dr0p.info | User handles like conrad.dr0p.info |\n| 
Public edge IP | 76.154.145.38 | pds and wildcard DNS records |\n| WireGuard subnet | 10.50.0.0/24 | k3s node identity and cluster transport |\n| Internal LoadBalancer pool | 192.168.1.240-192.168.1.250 | MetalLB for private ingress |\n| Core repo | https://tangled.org/c0nr.ad/k8s-core | Argo CD platform source |\n| Apps repo | https://tangled.org/c0nr.ad/k8s-apps | Argo CD workload source |\n| R2 bucket | pds-dr0p-info | SQLite replicas and PDS blobs |\n| SMTP sender | pds@dr0p.info | Email confirmation and account operations |\n\nThe guide assumes the machines already exist and can reach each other over WireGuard. It does not teach WireGuard peer setup, router port forwarding, or cloud VM provisioning. It also assumes Argo CD can read your Git repositories; if your repos are private, configure Argo repository credentials before applying the root apps.\n\nThe live validation for this post ended with pds.dr0p.info serving PDS 0.4.219, conrad.dr0p.info resolving through the public ATProto identity API, email confirmation delivered through Resend, and Litestream replicating SQLite state to R2. Treat the pinned chart and image versions as a known-good snapshot, not as permanent recommendations.\n\n1. What We Are Building\n\n- Public PDS hostname: pds.dr0p.info\n- User handles: user.dr0p.info, not user.pds.dr0p.info\n- Kubernetes distribution: k3s\n- Deployment model: GitOps with Argo CD\n- Core repo: k8s-core\n- App repo: k8s-apps\n- Public ingress: Traefik on an external/public node\n- Internal ingress: Traefik on internal nodes\n- TLS: cert-manager with Cloudflare DNS-01\n- Secrets: Bitnami SealedSecrets\n- PDS database: SQLite on block storage\n- PDS blobs: Cloudflare R2 from day one\n- Backups: Litestream to Cloudflare R2 from day one\n- Account posture: public-ish, with invites/approval required\n\n2. Architecture\n\nBefore touching YAML, get the shape of the system straight. 
The PDS itself is a small service, but it sits at the intersection of public ingress, TLS, persistent storage, email, DNS, identity, and backups. The boring parts around it matter more than the container spec.\n\nThe cluster has two kinds of nodes.\n\n- Internal nodes run the Kubernetes control plane, normal workloads, Longhorn, and the PDS.\n- External nodes sit on the public internet and terminate public HTTP and HTTPS through Traefik.\n\nThe minimum lab is one internal k3s server plus one external k3s agent. The better production shape is three internal storage-capable nodes plus one external edge node. I am going to write the guide for the production shape, but keep notes where a two-node lab needs different settings.\n\nNode labels are the simple trick that keeps this understandable.\n\n- Internal nodes get node.kubernetes.io/network-location=internal.\n- External nodes get node.kubernetes.io/network-location=external.\n- Longhorn storage nodes also get node.longhorn.io/create-default-disk=true.\n- Stateful applications, including the PDS, should prefer or require internal.\n- Public Traefik should require external.\n- Internal Traefik should require internal.\n\nWireGuard is the cluster transport between internal and external nodes. The public edge node joins the same k3s cluster, but Kubernetes should talk to it over WireGuard rather than over the raw public internet. In the k3s install section, we will make the node IPs line up with the WireGuard addresses so flannel, kubelet, and service routing all use the private tunnel.\n\nThere are two ingress planes.\n\n- traefik-internal serves private names like argocd.local.dr0p.info and is reachable only on the private network.\n- traefik-external-public serves public names like pds.dr0p.info from the external edge node on host ports 80 and 443.\n\nMetalLB belongs on the internal side. It gives private LoadBalancer IPs to internal services. 
The external edge should not participate in MetalLB L2 announcements for the home LAN.\n\nCloudflare is responsible for public DNS and ACME DNS-01 validation.\n\n- pds.dr0p.info points at the public edge node.\n- User handles will be shaped like alice.dr0p.info.\n- For public-ish account creation, use wildcard handle routing for *.dr0p.info to reach the PDS for ATProto handle verification.\n- Exact app hostnames must win over wildcard handle routing, so Traefik route priority matters.\n\nThere are two workable handle strategies.\n\n- Wildcard handle routing: send /.well-known/atproto-did requests for *.dr0p.info to the PDS and let the PDS answer handle resolution requests.\n- Per-user DNS TXT: create _atproto.user.dr0p.info records for each account and only route pds.dr0p.info to the PDS.\n\nFor a public-ish PDS, wildcard handle routing is the better default. It needs wildcard DNS and wildcard TLS for *.dr0p.info, not *.pds.dr0p.info, because the handles are alice.dr0p.info, not alice.pds.dr0p.info.\n\nStorage is deliberately not NFS for the PDS database. The official PDS uses SQLite. SQLite wants local/block-device semantics, especially once write-ahead logging and backup tooling enter the picture. Longhorn gives us a Kubernetes-native ReadWriteOnce block volume for /pds. NFS can still be useful for media libraries, shared caches, or backup staging, but it is not the blessed path for PDS SQLite.\n\nField note from my first live deploy: I temporarily used an existing NFS storage class to get the PDS online while fixing Longhorn host prerequisites. It worked well enough to validate DNS, TLS, account creation, email, and federation, but it is still not the storage shape I would recommend as the durable design for SQLite. If you make the same temporary tradeoff, keep one replica, keep Recreate, and do not treat the result as highly available.\n\nThe PDS keeps SQLite state on the /pds volume, but writes blobs directly to Cloudflare R2. 
For a near-single-user PDS, the R2 cost is small enough that this is the cleaner production default: the PVC does not grow with media, disaster restore does not need a separate blob sync step, and there is no hourly backup sidecar to monitor. A local disk blobstore is still useful for a lab, but it is not the default path in this guide. Backups are not deferred: Litestream replicates SQLite state to Cloudflare R2 from day one.\n\nThe GitOps split is simple.\n\n- k8s-core owns cluster foundations: Argo CD entrypoints, cert-manager, SealedSecrets, Traefik, MetalLB, Longhorn, issuers, and storage classes.\n- k8s-apps owns workloads: the PDS app, its PVC, config, sealed secrets, ingress, certificate, backup config, and post-deploy notes.\n\nThe bootstrap phase should be short. Install k3s, install Argo CD once, point Argo at k8s-core, then let Argo converge the rest. After that, normal changes are Git commits.\n\nThe verification pattern for the rest of the guide is: test repository operations locally with fake remotes when possible, test Kubernetes manifests in a temporary k3d cluster when possible, and call out what those tests do not prove. k3d is not a substitute for real nodes, WireGuard, disks, or public ingress, but it catches a lot of broken YAML and wrong assumptions before they reach the actual cluster.\n\nFor a local end-to-end k3d run, use deliberate substitutions rather than pretending the lab is production: sslip.io hostnames instead of public DNS, a self-signed pds-tls Secret instead of ACME, a local PV instead of Longhorn, and MinIO instead of Cloudflare R2. The PDS manifests in this guide are plain Kubernetes resources and work well with a small Kustomize overlay for those local-only replacements.\n\n3. Prerequisites\n\nThis guide assumes very little on the workstation. 
I use a small Nix flake to pin the CLI tools, because debugging a cluster is annoying enough without also wondering which kubectl, k3d, or kubeseal happened to be first on PATH.\n\n- Domain: dr0p.info in Cloudflare\n- Cloudflare API token with DNS edit permissions for dr0p.info\n- Machines or VMs for k3s\n- A public node or VPS that can receive TCP 80 and 443\n- A WireGuard network between internal and external nodes\n- Cloudflare R2 bucket for Litestream backups and PDS blobs\n- SMTP provider for PDS email, such as Resend\n\nCr",
  "title": "Building a Bluesky PDS on Kubernetes From Scratch"
}