Raw Record Source

{
  "$type": "site.standard.document",
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreifs3g7n3moiuuozulr75vkmf77shachbdfpxcdos4oneot7snw5ra"
    },
    "mimeType": "image/webp",
    "size": 6248
  },
  "description": "Local voice transcription.\n\nThe way that the Whisper model works seems to be that real-time transcriptions are inæffective, so I shouldn’t bother looking for it. To achieve transcriptions practical for real-time communication, voice activity detection should be used to identify speech from silence.\n\nWhisper tends to hallucinate phrases when given i...",
  "path": "/Whisper",
  "publishedAt": "2026-06-06T18:27:23.000Z",
  "site": "at://did:plc:rfescy2ghdk6ma2wwwhr3bu2/site.standard.publication/3mktkmfk37k2g",
  "textContent": "\nLocal voice transcription.\n\nThe way that the *Whisper* model works seems to be that real-time transcriptions are inæffective, so I shouldn’t bother looking for it. To achieve transcriptions practical for real-time communication, voice activity detection should be used to identify speech from silence.\n\n*Whisper* tends to hallucinate phrases when given insufficient data, silence or short phrases. This includes:\n\n- “Thank you” or “Thanks for watching”.\n- “Sorry”.\n- Subtitle attribution, usually includes a domain name.\n\nPost-processing tends to be necessary for short speech.\n\n## Stream avatar\n\nI have ideas of using it for a stream/live camera avatar of sorts “*[[Sheep Zhing]]*.”. Speech-to-text-to-speech, essentially. This had rather humorous results but is terrible for clear communication.\n\n- [huwprosser/web-whisper](https://github.com/huwprosser/web-whisper) - *Python* backend.\n  - Model runs hot and occasionally locks up.\n  - Long payloads get rejected by the server.\n- Considering creating a separate *Node.js*/*Express* implementation that invokes a *Whisper* CLI tool instead.\n  For [*whisper.cpp*](https://github.com/ggml-org/whisper.cpp), this command might be sufficient.\n  `whisper-cli.exe -m ./models/tiny.en.bin -np -nt speech.wav -sns`",
  "title": "Whisper"
}