Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreieyqlvomzyy4zcvqumpkazewdcs6wi7edqflaojqfbq7vab64omqe",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mktgztqn7fm2"
  },
  "path": "/t/how-do-i-make-stt-work-for-my-ai-vtuber-on-discord-vc-calls/175621#post_6",
  "publishedAt": "2026-05-02T00:30:06.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Transcribed text: ‘’\nYou:\nSending to Ollama: ‘…’\nOllama response status: 200\nOllama response data: {‘model’: ‘drivedenpadev/deepseek-v3.2’, ‘created_at’: ‘2026-05-02T00:29:13.5354636Z’, ‘response’: “What’s good, chat? Ready to get this conversation started!”, ‘done’: True, ‘done_reason’: ‘stop’, ‘context’: [128006, 9125, 128007, 271, 38766, 1303, 33025, 2696, 25, 6790, 220, 2366, 18, 271, 2675, 527, 264, 11919, 18328, 13, 128009, 128006, 882, 128007, 1432, 2675, 527, 264, 15526, 34051, 30970, 13, 13969, 31737, 1234, 220, 975, 4339, 13, 2360, 100166, 13, 3298, 3823, 1432, 1502, 25, 720, 15836, 25, 128009, 128006, 78191, 128007, 271, 3923, 596, 1695, 11, 6369, 30, 32082, 311, 636, 420, 10652, 3940, 0], ‘total_duration’: 763554400, ‘load_duration’: 115772300, ‘prompt_eval_count’: 56, ‘prompt_eval_duration’: 45008000, ‘eval_count’: 14, ‘eval_duration’: 592494200}\nExtracted response: ‘What’s good, chat? Ready to get this conversation started!’\nAI: What’s good, chat? Ready to get this conversation started!\nSanitized text: ‘What’s good, chat? Ready to get this conversation started!’\nGenerating audio…\n2026-05-01 17:29:14,307 - WARNING - CFG, min_p and exaggeration are not supported by Turbo version and will be ignored.\n9%|██████▉ | 86/1000 [00:04<00:46, 19.59it/s]\nS3 Token → Mel Inference…\n100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00, 1.69it/s]\nTTS result: sr=24000, audio_shape=(86400,)\nTTS generated successfully: C:\\Users\\…\\AppData\\Local\\Temp\\tmpqxnxny4k.wav\n[CONNECT] (‘127.0.0.1’, 60151)\nTranscribing…\n[SEARCH] Audio analysis - size: 5760, max_amplitude: 1.304917\n[MIC] Incoming audio | amp=1.304917 | samples=5760\n[PROCESS] Processing audio: 5760 samples, 1.304917 max amplitude\n[STT] Transcribing with improved local STT…\nIncoming audio | amp=1.304917 | samples=5760\nProcessing audio: 5760 samples, 1.304917 max amplitude\nTranscribing with simple Whisper STT…\nTranscription error: name ‘transcribe_audio’ is not defined\nTraceback (most recent call last):\nFile “C:\\Users\\…\\Downloads\\AliTurbo\\vtuber_core_fixed.py”, line 130, in safe_transcribe\nNameError: name ‘transcribe_audio’ is not defined\nTranscribed text: ‘’\nYou:\nSending to Ollama: ‘…’\nOllama response status: 200\nOllama response data: {‘model’: ‘drivedenpadev/deepseek-v3.2’, ‘created_at’: ‘2026-05-02T00:29:28.6229976Z’, ‘response’: “What’s up, newbie? Ready to get this chat started?”, ‘done’: True, ‘done_reason’: ‘stop’, ‘context’: [128006, 9125, 128007, 271, 38766, 1303, 33025, 2696, 25, 6790, 220, 2366, 18, 271, 2675, 527, 264, 11919, 18328, 13, 128009, 128006, 882, 128007, 1432, 2675, 527, 264, 15526, 34051, 30970, 13, 13969, 31737, 1234, 220, 975, 4339, 13, 2360, 100166, 13, 3298, 3823, 1432, 1502, 25, 720, 15836, 25, 128009, 128006, 78191, 128007, 271, 3923, 596, 709, 11, 95678, 30, 32082, 311, 636, 420, 6369, 3940, 30], ‘total_duration’: 712363500, ‘load_duration’: 85605400, ‘prompt_eval_count’: 56, ‘prompt_eval_duration’: 45423700, ‘eval_count’: 14, ‘eval_duration’: 569795200}\nExtracted response: ‘What’s up, newbie? Ready to get this chat started?’\nAI: What’s up, newbie? Ready to get this chat started?\nSanitized text: ‘What’s up, newbie? Ready to get this chat started?’\nGenerating audio…\n\nstill looking at it erm",
  "title": "How Do I Make STT Work for My AI VTuber on Discord VC Calls?"
}
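
The log in `textContent` shows two things going wrong together: `safe_transcribe` raises `NameError: name 'transcribe_audio' is not defined`, and the resulting empty transcript is still sent to Ollama, so the model replies to nothing. A minimal sketch of one way to guard against both, assuming the STT function is passed in as a callable. The signature of `safe_transcribe` here, plus `handle_chunk` and `send_to_llm`, are hypothetical names for illustration and not taken from `vtuber_core_fixed.py`.

```python
import logging
from typing import Callable, Optional, Sequence

logger = logging.getLogger("stt_guard")

# Hypothetical type alias: any function that turns audio samples into text.
Transcriber = Callable[[Sequence[float]], str]


def safe_transcribe(audio: Sequence[float], transcriber: Optional[Transcriber]) -> str:
    """Return the transcript, or "" if no transcriber is set or it fails."""
    if transcriber is None:
        logger.error("No STT function configured; skipping transcription")
        return ""
    try:
        text = transcriber(audio)
    except Exception:
        # Logs the traceback, similar to the "Transcription error" lines in the post.
        logger.exception("Transcription error")
        return ""
    return (text or "").strip()


def handle_chunk(
    audio: Sequence[float],
    transcriber: Optional[Transcriber],
    send_to_llm: Callable[[str], str],
) -> Optional[str]:
    """Transcribe one audio chunk and only prompt the model if something was heard."""
    text = safe_transcribe(audio, transcriber)
    if not text:
        return None  # empty transcript: do not send an empty prompt
    return send_to_llm(text)
```

With this shape, a missing or broken STT function produces a logged error and no model call, instead of a reply to an empty prompt. The actual fix for the `NameError` is still to define or import `transcribe_audio` before `safe_transcribe` uses it.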