Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidr72yg6revlm7zn6cupnibkmcytx2bxfsexvsbre4zlcvuumun2q",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mktgzpn3r7m2"
  },
  "path": "/t/how-do-i-make-stt-work-for-my-ai-vtuber-on-discord-vc-calls/175621#post_7",
  "publishedAt": "2026-05-02T00:41:13.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "In this case, the most critical issue is the short duration of the audio, so I don’t think adjusting the options passed to Whisper will solve the problem on its own.\nYou’ll likely need to **make the WAV file itself longer**.\n\nSpecifically, duration is the number of samples divided by the sampling rate. Given the number of samples in that WAV file, the audio lasts only about one second even at 16 kHz; at any higher sampling rate, it would be less than one second.\n\nGenerating text from audio that’s less than a second long is probably a bit outside the scope of the model’s design…",
  "title": "How Do i Make Stt Work for my ai Vtuber on Discord Vc calls?"
}
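
The duration reasoning in the post's `textContent` can be checked with a short script. This is only a sketch using Python's standard `wave` module. The function names, the 16,000-sample figure, and the 3-second target are illustrative assumptions, not values from the original thread. Padding with silence is just one possible way to lengthen a file; the post may equally mean capturing a longer stretch of speech before sending it to Whisper.

```python
import wave


def samples_to_seconds(num_samples, sample_rate):
    """Duration in seconds = number of samples per channel / sampling rate."""
    return num_samples / sample_rate


def wav_duration_seconds(path):
    """Read the frame count and sampling rate from a PCM WAV header."""
    with wave.open(path, "rb") as wf:
        return samples_to_seconds(wf.getnframes(), wf.getframerate())


def pad_wav_with_silence(src, dst, min_seconds):
    """Copy src to dst, appending silence until it lasts at least min_seconds.

    min_seconds is a hypothetical threshold chosen by the caller; it is not
    a documented Whisper requirement.
    """
    with wave.open(src, "rb") as wf:
        params = wf.getparams()
        frames = wf.readframes(params.nframes)

    target_frames = int(min_seconds * params.framerate)
    missing = max(0, target_frames - params.nframes)

    # 8-bit WAV PCM is unsigned (silence is 0x80); wider samples are signed (0x00).
    silent_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    silence = silent_byte * (missing * params.sampwidth * params.nchannels)

    with wave.open(dst, "wb") as out:
        out.setnchannels(params.nchannels)
        out.setsampwidth(params.sampwidth)
        out.setframerate(params.framerate)
        out.writeframes(frames + silence)

    return wav_duration_seconds(dst)


if __name__ == "__main__":
    # Illustrative numbers only: the same sample count at two sampling rates.
    print(samples_to_seconds(16000, 16000))
    print(samples_to_seconds(16000, 48000))
```

Calling `wav_duration_seconds` on the captured clip before transcription makes it easy to confirm whether the input really is around one second long, and to skip or buffer clips that are too short.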