
Why do gpt-5.1 and gpt-5.4-mini behave so differently in production chatbot use cases?

OpenAI Developer Community May 15, 2026

Yes, I would definitely recommend testing different reasoning levels and model combinations to find the right balance between cost, quality, and latency. Even GPT-5.5 with reasoning set to none or low could be an option. Ultimately, you will need to evaluate which model and reasoning combination works best for your use case, either through proper evals or by testing it directly.
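As a rough illustration of what a small eval could look like, here is a minimal sketch that runs the same test cases against every model and reasoning-level combination and ranks them by accuracy, then latency. Everything in it is an assumption for illustration: `run_grid`, `Result`, and `fake_call_model` are hypothetical names, the substring check is a deliberately crude quality score, and `fake_call_model` is a stand-in you would replace with your own API client code. Cost is not measured here; you would add it from the token usage your client reports.

```python
import time
from dataclasses import dataclass
from itertools import product
from statistics import mean


@dataclass
class Result:
    model: str
    effort: str
    accuracy: float
    mean_latency_s: float


def run_grid(models, efforts, cases, call_model):
    """Run every (model, effort) pair on the same cases and rank the results."""
    results = []
    for model, effort in product(models, efforts):
        scores, latencies = [], []
        for prompt, expected in cases:
            start = time.perf_counter()
            answer = call_model(model, effort, prompt)
            latencies.append(time.perf_counter() - start)
            # Crude quality check (assumption): expected text appears in the answer.
            scores.append(1.0 if expected.lower() in answer.lower() else 0.0)
        results.append(Result(model, effort, mean(scores), mean(latencies)))
    # Highest accuracy first; ties are broken by lower mean latency.
    return sorted(results, key=lambda r: (-r.accuracy, r.mean_latency_s))


def fake_call_model(model, effort, prompt):
    """Hypothetical stand-in for a real API call, so the sketch runs offline."""
    if "refund" in prompt:
        if effort == "none":
            return "I am not sure."
        return "Refunds are available within 30 days."
    return "We are open 9 to 5."


# Illustrative test cases: (prompt, text the answer should contain).
CASES = [
    ("What is the refund policy?", "30 days"),
    ("What are your opening hours?", "9 to 5"),
]

if __name__ == "__main__":
    ranked = run_grid(["gpt-5.1", "gpt-5.4-mini"], ["none", "low"], CASES, fake_call_model)
    for r in ranked:
        print(f"{r.model:14} {r.effort:5} acc={r.accuracy:.2f} latency={r.mean_latency_s:.6f}s")
```

With real calls swapped in, the same loop gives you a side-by-side table for your own prompts, which is usually more telling than general benchmarks for a chatbot use case.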
