{
"$type": "site.standard.document",
"canonicalUrl": "https://justingarrison.com/blog/petaflop-cluster",
"coverImage": {
"$type": "blob",
"ref": {
"$link": "bafkreihtdrmnyz3pu4kw6ojwsj3n3hqwupk5byavrhwwh737xnhq6cr5re"
},
"mimeType": "image/jpeg",
"size": 418690
},
"description": "AI is a pain in the back.",
"path": "/blog/petaflop-cluster",
"publishedAt": "2025-11-10T00:00:00.000Z",
"site": "at://did:plc:p7uix7mresfq4nfzxp3klgfa/site.standard.publication/3mmdn7mg2qm2d",
"textContent": "Instead of going to therapy I built another Kubernetes cluster.\n\nI got a NVIDIA DGX Spark and wanted to see what it's capable of.\nLots of people have done benchmarks and comparisons, but I needed to see what it practically felt like to build something.\n\nBecause it's a local, powerful AI computer I had to think of something AI related, and I wanted some way to show it's local.\nOf course this could all be faked, but it's a lot more fun to work with constraints.\n\n{{< youtube XudewmfourQ >}}\n\nIf you like this cluster you may also like my Cubernetes cluster built in an Apple G4 Cube.\n\nHardware list\n\n NVIDIA DGX Spark\n LattePanda IOTA (8GB) with m.2 expansion and UPS\n GL.iNet GL-A1300 travel router\n Takki 250W Portable Power Station\n GOTUS LED Sign\n 7 inch mini monitor\n Modular shelves 3D printed\n Henkelion Cat Backpack Carrier\n\nSoftware list\n\n Talos Linux with Omni\n ngrok\n ComfyUI\n FLUX.1-schnell model\n\nArchitucture\n\nIt's a pretty basic software stack.\n\n!A diagram with block components described in the below paragraph\n\nThe ngrok Kubernetes operator runs inside the cluster and provides an ingress to the workload.\nThe IOTA runs the Kubernetes control plane, ngrok operator, and application frontend.\nThe Spark _only_ runs ComfyUI to process the img2img jobs.\n\nI could have run everything on the Spark, but I kept needing to reinstall the Spark for a variety of reasons.\nI found it easier just to keep the frontend and Kubernetes on a dedicated system and route to the ComfyUI API on a separate machine.\n\nThe frontend app is 100% AI written.\nI knew roughly what I wanted with an img2img workflow, but I didn't know how to implement it locally.\n\nI spent the majority of my time learning ComfyUI, finding random models on Huggingface, finding broken links, and watching YouTube tutorials.\nThe AI ecosystem is a mess.\n\nI got comfortable enough to understand what I needed and then used Claude to help me figure out how to 
implement it.\n\nUsing the application\n\nI built the backpack so I could use it at KubeCon in Atlanta.\nThe idea was to just wear it all week and start conversations.\n\n!a screenshot of a webpage with a webcam with me with my thumb up\n\nI recharged it at the Sidero booth when the battery died, and wore it when it had power.\nThe Spark draws about 50 W at idle, so I estimated I would get about 3 hours of battery life.\nIn reality, I got more than 3 hours of usage, and recharging from 30% took almost 2 hours.\nDuring the recharge the Spark had to be turned off.\n\nPeople could scan the QR code, take a picture, and get a stylized image back.\nBecause the backpack was on my back I didn't require them to talk to me, but of course there were lots of questions.\n\nThe only major problem I ran into was keeping the backpack connected to the internet.\nAt first I was tethering to my phone's hotspot, but it was slow.\nI switched to the conference Wi-Fi, but it was spotty and frequently disconnected when I walked around.\nFinally I switched to USB tethering from my phone, and that was much more stable.\n\nThe build was a pretty simple 2-node Kubernetes cluster.\nThe hardest part was finding a backpack big enough to show it off.",
"title": "PETaflop cluster"
}