External Publication
Visit Post

Handling Overlapping Responses in Realtime API When Tools Take Too Long

OpenAI Developer Community May 28, 2026
Source

In summary, I implemented the following flow:

  1. Call the tool if it fires.

  2. Send a model preamble (e.g. “Let me check that for you”).

  3. Wait 3 seconds for the tool response.

  4. If the tool times out, return a tool response like this:

{
  "error": "TIMEOUT",
  "retries_remaining": 2,
  "next_action": "retry"
}

Then, send a prompt instructing the model to retry the same tool call with the same parameters (forcing the model to call the same tool again) and provide a preamble first.

  1. Set a second timeout of 4 seconds.

  2. Apply the same logic as in step 4.

  3. Set a final timeout of 5 seconds.

  4. Apply the same logic as in step 4, but since there are no retries left, return a tool response indicating the timeout error and send a prompt without tools, forcing the model to communicate the error to the user.

In this flow, if the user speaks while a tool execution is still in progress, I let the current execution finish. If the tool response arrives successfully, I append it to the conversation, and the model responds using both the tool response and the user’s latest message as context.

If the tool execution times out, I append a timeout error for that execution, and the model responds to the user’s latest message instead.

Discussion in the ATmosphere

Loading comments...