{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreief4fxiz2hqb7i52oqntzjfd3rg7j6k4avig6ibhwzphnkmasg5w4",
"uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mmqrpxocike2"
},
"path": "/t/new-audio-model-snapshots-in-the-realtime-api/1369374#post_20",
"publishedAt": "2026-05-26T10:22:16.000Z",
"site": "https://community.openai.com",
"textContent": "I’m honestly torn. The 35% WER improvement and multilingual gains in 2025-12-15 are exactly what we need: we’re building a language-learning product that covers a lot of low-resource languages, so those gains really matter.\n\nBut our testing this week hit the truncation bug on roughly 20% of multi-sentence inputs (3 of 16 fixtures across en/es/fr/de). The same input, retried, returned complete audio. One failure was 3.9s of speech followed by 1.6s of trailing silence: the stream is ending early and getting zero-padded, with HTTP 200, no error, and no retry signal. That is a silent 1-in-5 failure rate, and no amount of WER improvement compensates for it.\n\nThe original report on this is from June 2025, almost a year ago. With 2025-03-20 shutting down on July 23, those of us pinning the old snapshot have about 2 months before we’re forced onto a snapshot we can’t reliably ship on.\n\nHonest, friendly question: could one of you just open a Codex session and ask it to fix this?",
"title": "New Audio Model Snapshots in the Realtime-API"
}