{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreihugatkugtncy2ovtzcpyaoctf27oqavqxrtoviufgwgawrk7zfle",
"uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mlxep5rhxcs2"
},
"path": "/t/why-do-gpt-5-1-and-gpt-5-4-mini-behave-so-differently-in-production-chatbot-use-cases/1380891#post_8",
"publishedAt": "2026-05-16T07:21:15.000Z",
"site": "https://community.openai.com",
"textContent": "Yeah, in experimental phases for new features on Production I do similar. Start with large model, fine tune the code and prompts until I’m satisfied, then later step down the model via settings and see if I can retain acceptable behaviour until I find unacceptable cases, if any, then step back up.\n\nYou could do this in some kind of staging environment too if your risk tolerance is less, of course.",
"title": "Why do gpt-5.1 and gpt-5.4-mini behave so differently in production chatbot use cases?"
}