{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreif2ed62c254pdcxfb257du6xkbsioksuoz2tuxvx2vaz3yrczmwva",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkplb3g6eor2"
},
"path": "/t/module-torchaudio-has-no-attribute-audiometadata/175647#post_7",
"publishedAt": "2026-04-30T11:58:13.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "This technically solved my problem: rewriting the script around the starting point you provided worked.\nFor posterity, the changes I made were:\n\nAdding a small block of code at the start to bypass a couple of errors:\n\n\n    def patched_forward(self, sequences, weights=None):\n        # Unweighted mean/std pooling; weights is accepted but ignored\n        mean = sequences.mean(dim=-1)\n        if sequences.size(-1) > 1:\n            std = sequences.std(dim=-1, correction=1)\n        else:\n            # A single frame has no sample std, so use zeros\n            std = torch.zeros_like(mean)\n        return torch.cat([mean, std], dim=-1)\n\n    StatsPool.forward = patched_forward\n    torch.backends.cuda.matmul.allow_tf32 = False\n    torch.backends.cudnn.allow_tf32 = False\n\n\nA small change to the assign_speaker_to_segment function to account for multiple segments from the same speaker:\n\n\n    def assign_speaker_to_segment(segment_start, segment_end, diarization_turns):\n        best_speaker = None\n        best_overlap = 0.0\n        # Total overlap per speaker, summed over all of that speaker's turns\n        speakerdict = {}\n        for _, _, speaker in diarization_turns:\n            speakerdict[speaker] = 0.0\n        for turn_start, turn_end, speaker in diarization_turns:\n            speakerdict[speaker] += overlap_seconds(segment_start, segment_end, turn_start, turn_end)\n            overlap = speakerdict[speaker]\n            if overlap > best_overlap:\n                best_overlap = overlap\n                best_speaker = speaker\n\n        return best_speaker or \"UNKNOWN\"\n\n\nAnd a small change to the token function.\n\nUnfortunately, this script is just a cleaner version of the previous iteration of my script, and the current iteration was meant to solve a problem with the diarization errors themselves. For now, thank you. I will eventually open a topic for the next step once I figure out how to formulate the problem.",
"title": "Module 'torchaudio' has no attribute 'AudioMetaData'"
}