{
"$type": "site.standard.document",
"description": "Different LLMs think differently. When one gets stuck, it tends to bang its head against the wall — trying the same approach over and over. A second",
"path": "/using-a-second-llm-to-review-your-coding-agent-s-work/",
"publishedAt": "2026-02-02T04:39:00.000Z",
"site": "at://did:plc:bryys25pc2fnagnyxqgsglhd/site.standard.publication/3mn26bjkkmh23",
"tags": [
"AI",
"Tools"
],
"textContent": "Different LLMs think differently. When one gets stuck, it tends to bang its head against the wall, trying the same approach over and over. A second model often sees the problem from a different angle and breaks through.\n\nI use Droid (running Opus) as my primary coding agent and Codex as a reviewer. The idea is simple: after Droid makes changes, I have Codex review the dirty diff before I commit. I set this up using skills, which are reusable prompt snippets that all three of my agents understand.\n\nTHE \"REVIEW DIRTY\" SKILL\n\nThis skill takes all uncommitted changes and sends them to Codex for review:\n\n---\nname: review-dirty\ndescription: Review dirty code changes. When the user says \"review\"\n or \"review changes\" or \"review dirty code\"\n---\n\nAll dirty repo changes are likely made in this session,\nthough not always.\n\nIf you are Codex, just review the dirty code and ignore the\nrest of this skill. If you are not Codex, continue:\n\nDo not modify anything unless I tell you to. Run this CLI\ncommand (using codex as our reviewer) passing in the original\nprompt to review the changes: `codex exec \"Do not modify\nanything unless I tell you to. Review the dirty repo changes\nwhich are to implement: <prompt>\"`. $ARGUMENTS. Do it with\nthe Bash tool. If there is a timeout, make sure it is at\nleast 10 minutes.\n\nThe \"if you are Codex\" guard is there because when Droid calls codex exec, Codex picks up the same skill files. Without it, Codex would try to call itself recursively.\n\nI just say \"review\" or \"review dirty\" and Droid shells out to Codex, which reads the git diff and gives its assessment.\n\nTAKING IT FURTHER: REVIEW AND FIX IN A LOOP\n\nOnce you have a review skill, the obvious next step is to automate the fix-review cycle:\n\n---\nname: review-plus-fix-relentlessly\ndescription: Review dirty code and fix iteratively. When the user\n says \"loop to fix dirty\" or \"review+fix\"\n---\n\nAll dirty repo changes are likely made in this session,\nthough not always.\n\nUse the review dirty skill to review changes, then fix the\nissues to the best of your ability, matching repo preferences\nand style.\n\nAfter fixing, run review-plus-fix-relentlessly again, and\nbefore each cycle report how many cycles of review+fix we\nhave done.\n\nStop if the review skill does not produce any more things\nto fix.\n\nThis creates a loop: Droid makes changes, Codex reviews, Droid fixes what Codex flagged, Codex reviews again. It keeps going until Codex has nothing left to flag.\n\nWHY THIS WORKS\n\nThe value isn't just catching bugs; it's that the two models have different blind spots. Claude might over-engineer a solution while Codex points out a simpler approach. Or Droid might miss an edge case that Codex catches because it's looking at the code fresh.\n\nThe review loop usually converges in 2-3 cycles.\n\nSETUP\n\nI wrote about skills and my shared agent setup previously. Drop the skill files in your skills directory and they're available to all your agents.\n\nFor this to work, you need both Droid (or Claude Code) and Codex installed, with Codex accessible via codex exec from the command line.",
"title": "Using a Second LLM to Review Your Coding Agent's Work"
}