{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicec4hd3le4ykfuvq7em7iqswbnzdtagkaertcjvmxpfv56odyu2q",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mkrx3jgbib32"
  },
  "path": "/t/whisper-api-keeps-returning-empty-transcript-for-videos-longer-than-30-minutes-stuck-in-production/1380129#post_1",
  "publishedAt": "2026-05-01T09:28:03.000Z",
  "site": "https://community.openai.com",
  "textContent": "Hey everyone,\n\nRunning into a consistent issue with Whisper API and longer recordings and not sure what the right fix is.\n\nMy setup right now:\n\n  * Pull the Zoom recording → convert to MP3 with FFmpeg → compress under 25MB → send to Whisper API\n\n\n\nWorks fine for anything under 20 minutes. The moment I go past 30 minutes the transcript either comes back empty or just cuts off mid-sentence with no error message. Whisper just returns 200 with partial or no content.\n\nAlready tried a few things:\n\nSplitting into chunks — works but the speaker attribution gets completely lost between chunks and stitching the context back together is messy.\n\nLowering the bitrate more — quality drops so much that Whisper starts misidentifying words, especially with any background noise or non-native accents.\n\nSwitching to gpt-4o-transcribe — hit the 1500 second limit which is actually worse than Whisper for longer calls.\n\nThe real frustration is the entire pipeline assumes you have a small local file. For any real meeting or interview recording that is just not realistic without seriously degrading the audio.\n\nHas anyone figured out a solid approach for this? Ideally something that:\n\n  * Takes the recording URL directly without needing to download and re-encode\n  * Handles 60-90 minute recordings reliably\n  * Keeps speaker labels intact\n\n\n\nOpen to completely different approaches if Whisper just isn’t the right tool for this use case.",
  "title": "Whisper API keeps returning empty transcript for videos longer than 30 minutes — stuck in production"
}