{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiaypup2jme7qvumi6wdnfeaop3gfwexe6nxe54zw4i5lnnnl6rwci",
"uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mmxbss3suah2"
},
"path": "/t/realtime-api-poor-portuguese-call-quality-with-gpt-realtime-mini-gpt-realtime/1381375#post_3",
"publishedAt": "2026-05-28T23:52:40.000Z",
"site": "https://community.openai.com",
"tags": [
"@leandro-ligmee",
"@rafa3"
],
"textContent": "Thanks for laying this out so clearly @leandro-ligmee, and good suggestion from @rafa3 on filtering/noise reduction.\n\nThis sounds less like one single model issue and more like the usual phone-call stack problem: speakerphone + background noise + μ-law audio + VAD deciding “that was enough speech” too early.\n\nA few things I’d try before changing models:\n\n * Enable the Realtime input noise reduction option if you are not already using it.\n * Pre-process audio before SIP/Realtime if possible: band-pass for voice, noise suppression, AGC, echo cancellation.\n * Raise `silence_duration_ms` a bit more for Portuguese phone calls, since users may pause mid-address or mid-name.\n * Consider setting `create_response: false` and manually creating the response only after you’re confident the user finished. That can reduce “AI starts talking by itself” cases.\n * For addresses/names, don’t rely on one pass. Ask for confirmation: “Did you say Rua X?” or collect critical fields twice in a structured way.\n * Add domain hints in the transcription prompt, like expected city names, street formats, common Brazilian names, etc. The generic “maximum fidelity” instruction may not be enough.\n\nI agree with @rafa3 that noise is probably the first thing to attack. If the mic input is messy, the transcription and VAD will both behave worse, even with better models.\n\nWould be useful to know whether you’re seeing more false starts during silence/background noise, or mostly while the user is still speaking. Those usually need slightly different fixes.\n\n-Mark G.",
"title": "Realtime API: Poor Portuguese call quality with gpt-realtime-mini / gpt-realtime"
}