“Background” requests latency
This message was added months ago at https://developers.openai.com/api/docs/guides/background:
Currently, the time to first token you receive from a background response is higher than what you receive from a synchronous one. We are working to reduce this latency gap in the coming weeks.
Hey @OpenAI_Support, Can you please confirm if you are still working to reduce this latency gap. If yes, can you please share the completion date, or if not, share the quality degradation that one may face that is recognized but without mitigation currently?
Then, you have “streaming” of the background response. If attempted, this was terminating the stream connection within five minutes. The same documentation page used to have “resuming the background streaming is coming”, but that future promise of a feature that could make this useful is now gone.
Discussion in the ATmosphere