Framework/Thought Process of Diagnosing Prompt Issues

OpenAI Developer Community February 27, 2026

Often I can start writing a prompt (or generate one), but the result ends up with a mix of problems: some inaccurate tool calls, output that is a little too wordy, an ignored instruction to bold letters. Since I don’t know why this happens, I’m mostly guessing at which parts to change.

Has anyone come up with a structured method for diagnosing prompt issues and sorting them into broad categories?

I was wondering if there are telltale signs of certain mistakes. For example, not specifying a full “tool_name” may cause a model to consistently make incorrect choices.

Knowing what’s wrong would let us narrow down which prompting techniques to try. For example, for the category of instruction ignoring, one could either use a system prompt for higher priority or apply markdown to make the rules distinct.
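
To make the idea concrete, here is a rough Python sketch of the kind of lookup I have in mind. The category names and the `candidate_fixes` helper are made up for illustration, and the only fixes listed are the ones mentioned above, so treat it as a starting point and not a tested method.

```python
# Hypothetical symptom categories mapped to techniques worth trying.
# Only the examples from this post are filled in; the rest is open.
CANDIDATE_FIXES = {
    "instruction_ignoring": [
        "move the rule into the system prompt for higher priority",
        "apply markdown to make the rules distinct",
    ],
    "incorrect_tool_choice": [
        "specify the full tool_name",
    ],
    # No fix known yet for this one; that is part of the question.
    "too_wordy": [],
}


def candidate_fixes(category: str) -> list[str]:
    """Return the techniques to try for a diagnosed category (empty if unknown)."""
    return CANDIDATE_FIXES.get(category, [])


if __name__ == "__main__":
    for category in CANDIDATE_FIXES:
        print(category, "->", candidate_fixes(category))
```

The point is that each observed failure gets assigned to one category first, and only the techniques listed for that category are tried, one at a time.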

-A junior dev
