Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicnosbzyxvlujt4qgok2q7xibofamvuwgxcsaces4ukfbaxxfwqia",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mjzcy23ikek2"
  },
  "path": "/t/an-utterly-dismayed-developer-can-not-comprehend-codexs-failures/1379313#post_3",
  "publishedAt": "2026-04-21T15:14:41.000Z",
  "site": "https://community.openai.com",
  "tags": [
    "Codex Prompting Guide",
    "Best practices for prompt engineering with the OpenAI AP",
    "@Tom.Fleet"
  ],
  "textContent": "Hey @Tom.Fleet, that sounds just as frustrating on a second read. When you’re deep in logs and it keeps repeating the same bad patterns, it really does feel like it’s ignoring you.\n\nWhat you’re seeing is a mix of prompt overload and weak uncertainty handling. The model tries to satisfy everything it’s told, and when those instructions clash or pile up, it starts making confident guesses instead of stopping. That’s where the “lying” feeling comes from.\n\nA couple of lighter tweaks that tend to help in practice: keep the core instructions tight, avoid stacking too many MD files, and break work into smaller, fresh sessions so drift doesn’t build up. Not a full fix, but it usually reduces the chaos.\n\nYou might also find this useful, since it goes pretty directly into how Codex behaves and how to structure prompts better:\n\nCodex Prompting Guide\n\nThere’s also a general best-practices guide here:\n\nBest practices for prompt engineering with the OpenAI AP\n\nThis isn’t ideal, agreed. If you bring those logs back, that’ll make it easier to spot where it consistently breaks.\n\n-Mark G.",
  "title": "An utterly dismayed developer can not comprehend Codex's failures"
}