{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreif2ju2h2nb43jlm3mlon5zfqvlmy6ipwygr7i2pvtsmdgsxvmzb3m",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3moepyl7zn5f2"
  },
  "path": "/t/a-10kb-page-that-sings-text-can-you-decode-it-back-pitch-only-baseline-already-hits-43/176840#post_1",
  "publishedAt": "2026-06-16T00:53:31.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "loom — the booth",
    "loom/bench at main · evengineer1ng/loom · GitHub"
  ],
  "textContent": "This is a self-contained 10 KB web page that turns text into sung audio. No model, no server, just deterministic logic. Each word maps to a fixed little motif, so the same word always sounds the same. Play with it: loom — the booth\n\nHere’s the part I can’t answer alone. It’s reversible. I wrote a benchmark that mints unlimited (text, audio) pairs. A dumb pitch-only baseline already recovers 43% of words from sound alone, and it does that while throwing the vowels away.\n\nSo, how high can you push it? If a model gets near 100%, the meaning survives all the way into sound and back, which makes this a lossless code, no weights involved. And then the question I actually care about: where’s the line between a code you can decode and a language? I don’t know. I’d rather find out than guess.\n\nBeat 43%: loom/bench at main · evengineer1ng/loom · GitHub\n\nOr just listen and tell me what you hear in it. It’s been only me on this for a while.",
  "title": "A ~10KB page that sings text — can you decode it back? (pitch-only baseline already hits 43%)"
}