What are considered "requests" when using the real-time API?
OpenAI Developer Community
May 13, 2026
Although the model is called “realtime”, the realtime part is just a buffering front end.
The model generates from a fixed input context plus a trigger. The trigger is either the end of speech as detected by server voice activity detection (VAD), or your response.create event. That trigger is your request for generated output, and it is what the rate limiter rejects with an error when you are over quota.
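To make the counting concrete, here is a minimal sketch that tallies generation triggers in a log of session events. It only illustrates the explanation above and is not how the rate limiter is actually implemented. The event name response.create comes from the post; input_audio_buffer.speech_stopped is assumed to be the event that marks the end of server VAD, and count_generation_requests and session_log are hypothetical names made up for this example.

```python
# Event types treated as generation triggers in this sketch.
# "response.create" is named in the post; the VAD event name is an assumption.
TRIGGER_TYPES = {
    "response.create",                    # client explicitly asks for output
    "input_audio_buffer.speech_stopped",  # assumed: server VAD detected end of speech
}


def count_generation_requests(events):
    """Count the events that would start a model generation."""
    return sum(1 for event in events if event.get("type") in TRIGGER_TYPES)


# Hypothetical session log: streaming audio chunks is only buffering,
# so the append events do not count as requests.
session_log = [
    {"type": "input_audio_buffer.append", "audio": "<base64 chunk>"},
    {"type": "input_audio_buffer.append", "audio": "<base64 chunk>"},
    {"type": "input_audio_buffer.speech_stopped"},
    {"type": "conversation.item.create"},
    {"type": "response.create"},
]

requests_made = count_generation_requests(session_log)
print(requests_made)
```

In this log the two audio appends and the conversation item only add to the context; the two triggers are the only events counted as requests.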