Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicz77mawcfxc2ylnibfdxky2n7sdu2jb2yxamxvv7k6tno4nse6qa",
    "uri": "at://did:plc:ysiyu76vdhdrm25dpbktdrzf/app.bsky.feed.post/3mn6prbesjrwb"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreiaij6rw635s4wbbtnq3sphdrxt5quxbqzcygzyo6eb3uo5uwhu52a"
    },
    "mimeType": "image/jpeg",
    "size": 574905
  },
  "description": "Last month, a Cloudflare engineer used AI to rebuild a major open-source project in under a week. Cost about $1,100. It shipped to production. Around the same time, a different Cloudflare team used AI to fork another project and silently stripped out the security protections the original maintainers had spent years adding. That also shipped....",
  "path": "/blog/ai-fills-the-gaps-just-not-at-depth",
  "publishedAt": "2026-05-24T19:43:59.000Z",
  "site": "at://did:plc:ysiyu76vdhdrm25dpbktdrzf/site.standard.publication/3mn6pmpzdzl6f",
  "tags": [
    "Design &amp; Dev"
  ],
  "textContent": "Last month, a Cloudflare engineer used AI to rebuild a major open-source project in under a week. Cost about $1,100. It shipped to production. Around the same time, a different Cloudflare team used AI to fork another project and silently stripped out the security protections the original maintainers had spent years adding. That also shipped. Both things are true. The gap between them is the part of the AI conversation nobody wants to look at. The trend “Everyone is a builder” is the operating thesis behind a real reorg wave. The SF Standard wrote it up in March. Meta PMs calling themselves AI builders. LinkedIn is rebranding its APM program to Associate Product Builder. Figma’s Dylan Field on Lenny’s Podcast saying roles are shifting and merging, with 56% of non-designers now doing design tasks regularly. AI genuinely lowers the cost of producing working software, and handoff overhead between roles is real. One builder with AI can sometimes do what used to take a small team. I’m not here to argue against any of that. It’s happening, and some version of it is going to stick. What I want to talk about is the assumption underneath it that nobody is naming. The assumption The builder model assumes AI closes the depth gap. When a designer expands into engineering, or an engineer into design, the implicit claim is that AI fills in what they don’t know deeply. The builder brings judgment. AI brings the domain. The whole reorg sits on this. If it’s true, the model works. If it isn’t, you get a specific kind of failure that’s already documented. Why it fails AI’s output quality is bounded by the reviewer’s ability to evaluate it. A senior engineer using AI in their own domain catches the 30% it got wrong. The same engineer using AI outside their domain ships the wrong 30% confidently. Nothing in the output signals indicates which part is broken. It reads as competent to anyone not deep enough to push back. The Cloudflare slop fork is what this looks like in public. Work shipped, looked production-grade, security gaps weren’t caught because catching them needed depth that wasn’t in the loop. Builder.io’s writeup landed on it: “the durable layer sits above the code.” The same pattern shows up in design systems work, just less visibly. An AI reviewer catches obvious token violations. It doesn’t catch that someone reinvented a button pattern that already exists, or that a new component contradicts an architectural principle the system was built on, or that an interaction pattern is technically valid but breaks how the rest of the system handles that case. Those require knowing the system, not just reading the diff. Same failure mode as the slop fork. Output that’s locally consistent and globally wrong. Pragmatic Engineer’s 2026 survey found the same pattern. Less adept engineers uplevel their output with AI, but they generate “a lot of AI slop while doing so.” Output volume up. Quality variance is way up. A Microsoft engineering post in January described how it shows up in production: “Latency was fine. The error rate was low. Dashboards were green. Then a single workflow started creating the wrong tickets, not failing or crashing. It was confidently doing the wrong thing at scale.” That’s the failure mode. Generalists can ship. The problem is when nobody in the review loop has depth in the relevant layer: the output looks fine, it ships, and when something breaks three weeks later, it takes time to trace because nothing flagged a problem at the time. What it actually takes The builder model can work. It doesn’t work on its own. Someone with depth has to own each system layer. That’s still a role, whether or not it shows up on the org chart. Design systems are the clearest case because the failure is visible: AI-generated UI that looks right, uses the right tokens, ships, and quietly drifts the product away from itself across a hundred small decisions nobody had depth to catch. Depth doesn’t distribute across a team that doesn’t have it. You can give everyone the builder title. The layer still needs someone who actually knows it. If that person isn’t in the loop, the review is theater. “Done” has to mean evaluated by someone who could tell if it’s wrong. If the review is just other builders without depth, you have velocity and no quality floor. The honest version Companies doing this well say it out loud: we’re moving to builders, we’re keeping deep ownership of each system layer, and we’re being explicit about what review looks like when builders work outside their depth. The ones doing it badly skip that conversation. Everyone’s a builder now, AI fills the gaps, here are your new titles. They’ll ship more for a while. Then the incidents start, they’ll be architectural, and the dashboards will be green right up until they aren’t. There’s more AI-assisted work flowing through fewer people who understand any given layer deeply. Depth doesn’t become less necessary. It becomes harder to find and more expensive when it’s missing. Design systems were always this layer. The people carrying that depth aren’t a holdover from the old model. They’re load-bearing whether the org chart shows it or not.",
  "title": "AI fills the gaps. Just not at depth",
  "updatedAt": "2026-05-24T21:42:47.000Z"
}