Handling Overlapping Responses in Realtime API When Tools Take Too Long

OpenAI Developer Community May 28, 2026

In summary, I implemented the following flow:

1. Call the tool if it fires.
2. Send a model preamble (e.g. “Let me check that for you”).
3. Wait 3 seconds for the tool response.
4. If the tool times out, return a tool response like this:

{
  "error": "TIMEOUT",
  "retries_remaining": 2,
  "next_action": "retry"
}

   Then, send a prompt instructing the model to retry the same tool call with the same parameters (forcing the model to call the same tool again) and provide a preamble first.

5. Set a second timeout of 4 seconds.
6. Apply the same logic as in step 4.
7. Set a final timeout of 5 seconds.
8. Apply the same logic as in step 4, but since there are no retries left, return a tool response indicating the timeout error and send a prompt without tools, forcing the model to communicate the error to the user.
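To make the retry schedule concrete, here is a minimal Python sketch of the flow above, not my exact implementation. It assumes the tool is an async callable; `tool`, `args`, `run_attempt`, and `call_with_retries` are hypothetical names, and the `"inform_user"` value for `next_action` is made up for the last attempt, since only `"retry"` appears in the payload above. In the real flow each retry is a new tool call issued by the model after the retry prompt; the loop here only stands in for that round trip.

```python
import asyncio

# Per-attempt timeouts in seconds, taken from the steps above.
TIMEOUTS = (3.0, 4.0, 5.0)


def timeout_payload(attempt: int, total_attempts: int = len(TIMEOUTS)) -> dict:
    """Tool response for a timed-out attempt (attempt is 0-based)."""
    remaining = total_attempts - attempt - 1
    return {
        "error": "TIMEOUT",
        "retries_remaining": remaining,
        # "inform_user" is a hypothetical value for the final attempt;
        # the post itself only shows "retry".
        "next_action": "retry" if remaining > 0 else "inform_user",
    }


async def run_attempt(tool, args: dict, attempt: int, timeouts=TIMEOUTS) -> dict:
    """Run one tool attempt and return its result or a timeout payload."""
    try:
        # wait_for cancels the tool coroutine once the timeout expires.
        output = await asyncio.wait_for(tool(**args), timeout=timeouts[attempt])
        return {"timed_out": False, "output": output}
    except asyncio.TimeoutError:
        return {"timed_out": True, "output": timeout_payload(attempt, len(timeouts))}


async def call_with_retries(tool, args: dict, timeouts=TIMEOUTS) -> dict:
    """Try the tool up to len(timeouts) times with growing timeouts."""
    outcome = {}
    for attempt in range(len(timeouts)):
        outcome = await run_attempt(tool, args, attempt, timeouts)
        if not outcome["timed_out"]:
            break
        # In the real flow, this is where the timeout payload goes back to
        # the model along with a prompt to retry (or, on the last attempt,
        # a tool-free prompt to tell the user about the failure).
    return outcome
```

Calling `asyncio.run(call_with_retries(my_tool, {"city": "Paris"}))` with a hypothetical `my_tool` returns either the tool output or the final timeout payload. One thing to decide for yourself: `asyncio.wait_for` cancels the timed-out call, so if your tool has side effects you may prefer to let it run and discard the late result.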

In this flow, if the user speaks while a tool execution is still in progress, I let the current execution finish. If the tool response arrives successfully, I append it to the conversation, and the model responds using both the tool response and the user’s latest message as context.

If the tool execution times out, I append a timeout error for that execution, and the model responds to the user’s latest message instead.
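The interruption handling can be sketched the same way. This is a rough illustration under assumptions: `conversation` is a plain local list standing in for the server-side conversation, `settle_tool_call` is a hypothetical helper, and the two event shapes (`conversation.item.create` with a `function_call_output` item, followed by `response.create`) reflect my understanding of the Realtime API client events, so check them against the current API reference before relying on them.

```python
import asyncio
import json


async def settle_tool_call(conversation: list, call_id: str,
                           tool_task: asyncio.Task, timeout: float) -> list:
    """Let an in-flight tool call finish or time out, then build the events to send.

    Any user message that arrived while the tool was running is already in
    `conversation`, so the tool output is appended after it and the model
    sees both when it generates the next response.
    """
    try:
        output = await asyncio.wait_for(tool_task, timeout)
    except asyncio.TimeoutError:
        # Timed out: record the error so the model answers the user's
        # latest message instead of the tool result.
        output = {"error": "TIMEOUT"}

    item = {
        "type": "function_call_output",
        "call_id": call_id,
        "output": json.dumps(output),
    }
    conversation.append(item)

    # Events to send over the Realtime connection (assumed shapes).
    return [
        {"type": "conversation.item.create", "item": item},
        {"type": "response.create"},
    ]
```

The point of the design is that the user's speech never cancels the tool call; it only changes what the model has in context when the single follow-up response is requested.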