{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiaqdb7icccosx6gqswnl7qtd5azbs3leqc6govjfpraf3hmnn3kwa",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mf533aablcb2"
  },
  "path": "/t/realtime-api-bug-websocket-stops-receiving-messages-mid-session-audioworklet-sending-audio-but-no-server-events-fired/1374520#post_1",
  "publishedAt": "2026-02-18T11:48:58.000Z",
  "site": "https://community.openai.com",
  "textContent": "We’re experiencing an intermittent but reproducible issue with the OpenAI Realtime API where the session freezes at a random point during a conversation. The assistant responds correctly several times, but at some point stops reacting to the user’s audio entirely.\n\n**Environment**\n\n  * OpenAI Realtime API via WebSocket\n\n  * Browser-based app using AudioWorklet for microphone capture\n\n  * Production environment\n\n\n\n\n**Steps to reproduce**\n\n  1. Start a Realtime API session\n\n  2. Have a normal back-and-forth conversation — the assistant responds correctly multiple times\n\n  3. At a random point, attempt to speak after the assistant finishes a response\n\n  4. The session freezes — more likely to happen when there is external noise present (e.g. keyboard typing)\n\n\n\n\n**Expected behavior** The assistant consistently detects speech and fires `input_audio_buffer.speech_started`, then responds.\n\n**Actual behavior** At a random point during the conversation, the WebSocket goes completely silent after an assistant response. No further server events are received — no `speech_started`, no `speech_stopped`, nothing. The AudioWorklet continues processing and sending audio chunks, but the server never acknowledges them. The session appears alive on the client side but is effectively frozen. The only recovery is a full session restart.\n\n**What we observe in the logs** After the freeze, every single audio chunk coming from the AudioWorklet is flagged as silent, even when the user is speaking loudly. This continues indefinitely with no server-side reaction.\n\n\n    [AudioWorklet] First audio chunk is silent (max: 41)\n    [AudioWorklet] First audio chunk is silent (max: 78)\n    [AudioWorklet] First audio chunk is silent (max: 96)\n    ... 
(hundreds of consecutive entries)\n\n\n**What we have ruled out**\n\n  * The WebSocket connection does not drop or throw an error\n\n  * The issue is not related to audio volume — speaking louder does not recover the session\n\n  * It is not consistently tied to switching applications, though that increases the likelihood\n\n  * The assistant does respond correctly multiple times before the freeze occurs\n\n\n\n\n**Production-specific code that may be relevant**\n\nWe noticed this issue only occurs in production. Our production build has three mechanisms that do not exist in our development build:\n\n**1. Client-side VAD gate** — audio chunks with RMS energy below a threshold are replaced with zeros before being sent to OpenAI:\n\n\n    const isSilent = this.lastMicrophoneEnergy < this.clientVADThreshold // 0.03\n    const dataToSend = isSilent ? new Int16Array(audioData.length) : audioData\n\n\n**2. Higher interruption energy threshold** — production requires significantly more energy to consider user speech as an interruption while the assistant is speaking:\n\n\n    // Production\n    private readonly INTERRUPTION_ENERGY_THRESHOLD = 0.1\n\n    // Development\n    private readonly INTERRUPTION_ENERGY_THRESHOLD = 0.02\n\n\n**3. Stuck VAD watchdog** — if the server remains in `speech_started` state for 30 seconds without transitioning, production sends an `input_audio_buffer.clear` to reset it:\n\n\n    this.stuckVADTimeout = setTimeout(() => {\n      this.gatewayClient.sendOpenAIMessage({ type: 'input_audio_buffer.clear' })\n    }, 30_000)\n\n\nOur theory is that external noise (e.g. 
keyboard) triggers `speech_started` on the server, the client VAD gate then starts sending zeros, and this combination leaves the server VAD in an inconsistent state from which it never recovers — resulting in the frozen session.\n\n**Questions for the community**\n\n  * Has anyone experienced the WebSocket entering a state where it appears connected but stops receiving server events?\n\n  * Is there a known issue with the server VAD getting stuck after receiving a mix of real audio and silence (zeros)?\n\n  * Are there recommended patterns for detecting and recovering from this frozen state without doing a full session restart?\n\n  * We are considering migrating from WebSocket to WebRTC for the Realtime API. Has anyone made this migration and found it more stable for browser-based use cases? Did it solve similar freezing or silent-session issues? Are there trade-offs or known limitations we should be aware of before committing to that approach?\n\n\n",
  "title": "REALTIME API BUG - WebSocket stops receiving messages mid-session — AudioWorklet sending audio but no server events fired"
}