Raw Record Source

{
  "$type": "site.standard.document",
  "canonicalUrl": "https://afloat.boats/posts/code-accountability-2026",
  "path": "/posts/code-accountability-2026",
  "publishedAt": "2026-02-28T07:50:46.000Z",
  "site": "at://did:plc:pans3xjam4khj7y54dx7gtfg/site.standard.publication/3mdqevmg6w32c",
  "tags": [
    "engineering",
    "software"
  ],
  "textContent": "In the year of our Lord Claude Code\n\nI think it is important for reasons of fairness that if a human won’t review their own AI-generated code, we cannot expect other humans to do so either. The time required to conduct a thorough human review is much greater than the time it takes Claude to generate tens of thousands of lines of code. This wouldn't be a problem if it were trivial to identify the parts of the code that are good and the parts that need to be improved. If 5% of something is bad, and it's non-trivial to identify which 5%, then the whole is unusable.\nSeparating the wheat from the chaff is the whole point of a code-review.\n\nFalse starts\n\nOne might come up with elaborate schemes to keep up with the influx of slop. Here is an approach that, logistical blunders aside, misses the fundamental goal by a large margin.\n\nMy bot fights your bot\n\nA response might be that reviewers should incorporate AI to review code.\nThis is similar to the follies of Lord Dorwin in the book _Foundation_.\n\n> Hardin continued: \"...Lord Dorwin thought the way to be a good archaeologist was to read all the books on the subject—written by men who were dead for centuries. He thought that the way to solve archaeological puzzles was to weigh the opposing authorities...\"\n>\n> ― Isaac Asimov, _Foundation_\n\nIn the case of a code-review, the archaeological puzzles are the code-changes and their underlying motivation. 
An AI model is probabilistically operating on context that may or may not be effectively encoded in the code or the comments.\nLayered on top is the reviewers' own lack of understanding of the model entrusted with the task.\nThe lack of pre-existing context, combined with a model's own probabilistic nature, effectively renders an AI-generated code-review of AI-generated code insufficient.\n\nI don’t think it’s currently possible to conduct a rigorous code-review this way, as AI models aren’t able to evaluate business logic, architecture, and abstractions to the same level that a human with much more context on the same problem might. This is partly because not all of the context is, or potentially can be, encoded in the form of types/names/functions/documentation, etc.\n\nBut then again, why put a human in the loop at all?\n\n> A computer can never be held accountable.\n> Therefore a computer must never make a management decision.\n>\n> – _internal IBM training_\n\nWhich brings me to the thing that I've been wondering about for the last few ~~weeks~~ months.\n\nWhat is a code-review?\n\nCode-review, IMO, is an instrument for growing our shared understanding of the problem-space, building trust in the code we ship, and in each other. It's also a framework for establishing accountability. 
The primary person accountable is the author, but some share of accountability lies with the reviewers.\nAccountability is one of the cornerstones of civil society; we like to believe that the world is _more or less_ a just place because we have measures in place to hold each other accountable.\nCode-review is one such measure on a much smaller scale, i.e., a codebase.\nWith the advent of vibe-coding and vibe-reviewing, we've temporarily decided to ignore the fundamental reason why code-reviews exist.\n\nWith whom does the burden of proof lie?\n\n> “What can be asserted without evidence can also be dismissed without evidence.”\n>\n> – Christopher Hitchens\n\nWhen a PR gets submitted without evidence that it improves the state of the codebase, it can be rejected without evidence, especially when creating imagined dragons is incredibly cheap, and disproving their existence is materially difficult.\n\nIs the human who reviews the PR using AI responsible for approving the PR?\nIs the entire team now responsible for maintaining the tens of thousands of lines of code added to the repository?\nWhen the application goes down, who you gonna call, Ghostbusters? And can the losses due to downtime be billed to _frontier_ model companies?\n\nClosing thoughts\n\nMy opinions on this are in flux to the point that I started writing this at the end of January when the open source world started to reckon with the onset of AI-drive-bys [\\[1\\]](https://github.com/ghostty-org/ghostty/pull/10412) [\\[2\\]](https://github.com/tldraw/tldraw/issues/7695). Instead of this _art_ being finished, it's merely being abandoned on the 15th of May.\nArguments over LOCs and granularity of code-reviews are, IMO, futile devices. Instead, we must re-examine what code-reviews stand for, whether they still serve us, and how they must change to continue to be effective.",
  "title": "Code Accountability - 2026"
}