{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreidr72yg6revlm7zn6cupnibkmcytx2bxfsexvsbre4zlcvuumun2q",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mktgzpn3r7m2"
},
"path": "/t/how-do-i-make-stt-work-for-my-ai-vtuber-on-discord-vc-calls/175621#post_7",
"publishedAt": "2026-05-02T00:41:13.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "In this case, the most critical issue is the “short duration of the audio,” so I don’t think adjusting the options passed to Whisper alone will solve the problem.\nYou’ll likely need to **make the WAV file itself longer**.\n\nSpecifically, given the number of samples in that WAV file, even if the sampling rate is 16 kHz, the audio duration is only about one second; if the sampling rate were higher, it would be less than one second.\n\n“Generating text from audio that’s less than a second long” is probably a bit outside the scope of the model’s design…",
"title": "How Do i Make Stt Work for my ai Vtuber on Discord Vc calls?"
}