Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibgg3fzcuii3hvume7lmxdymvu5wefotdmci2nhvm2ahi7ztu7p4m",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mjx7qq4joha2"
  },
  "path": "/t/gpt-realtime-1-5-leaks-audio-control-tokens-audio-text-caption-quality-n-into-text-stream-when-run-with-modalities-text/1379235#post_2",
  "publishedAt": "2026-04-20T18:01:56.000Z",
  "site": "https://community.openai.com",
  "tags": [
    "@zaidkaraymeh"
  ],
  "textContent": "Welcome to the dev community, @zaidkaraymeh, and thanks for the detailed repro; that’s really helpful.\n\nIf you’re able to share a request ID (and an approximate timestamp) for one of these calls, it would make it much easier for the team to trace what’s happening internally. A short raw snippet of the streamed output (like the one you included) is perfect as well.\n\n~Smith",
  "title": "gpt-realtime-1.5 leaks audio control tokens (<|audio_text|>, <|caption_quality_N|>) into text stream when run with modalities=[\"text\"]"
}