“Background” requests latency
OpenAI Developer Community
March 30, 2026
Thanks for flagging this. The latency gap for background responses is still a known tradeoff, and we don’t have a timeline to share for closing it right now. I’m not aware of a model quality degradation specific to background mode; the main difference is request lifecycle and time-to-first-token, not output quality.
For long-running tasks, the most reliable pattern is still to use background mode with polling, or background streaming with reconnect via starting_after if your client can support it. Stream disconnects at around 5 minutes are typically caused by idle client/proxy timeouts rather than a hard response limit, so if you’re seeing that consistently, polling is the safer option today.
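For anyone wiring up the polling pattern described above, here is a minimal sketch rather than an official implementation. The helper name `poll_until_done`, its parameters, and the set of pending status strings are assumptions made for illustration; the status lookup is passed in as a callable so the loop does not depend on any particular SDK version.

```python
import time
from typing import Callable

# Assumption: statuses that mean "still running". Verify these values
# against the current API reference before relying on them.
PENDING_STATUSES = {"queued", "in_progress"}


def poll_until_done(
    fetch_status: Callable[[str], str],
    response_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status(response_id) until it returns a non-pending status.

    Returns the first non-pending status seen, or raises TimeoutError if
    the response is still pending once timeout_s has elapsed.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(response_id)
        if status not in PENDING_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"{response_id} still {status!r} after {timeout_s} seconds"
            )
        # Each poll is a short, independent request, so there is no
        # long-lived connection for an idle proxy timeout to cut.
        sleep(interval_s)
```

With the OpenAI Python SDK, the callable would presumably be something like `lambda rid: client.responses.retrieve(rid).status`, after creating the response with `background=True`; treat that wiring as an assumption and check it against the SDK version you have installed.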