{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreihuo6dqssvvhtkrj2y4aioec2jbovcx75exzhgv2lmes42tkomkl4",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkni3jmfkhs2"
},
"path": "/t/module-torchaudio-has-no-attribute-audiometadata/175647#post_4",
"publishedAt": "2026-04-29T15:33:52.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"pyannote.audio 3.3.0 docs on PyPI",
"pyannote/speaker-diarization-3.1 model card",
"pyannote/speaker-diarization-community-1 model card",
"pyannote.audio releases",
"TorchAudio 2.8 list_audio_backends deprecation docs",
"TorchAudio 2.8 deprecation overview",
"TorchCodec compatibility table",
"uv script locking docs",
"pyannote.audio 3.3.0 docs",
"pyannote/speaker-diarization-3.1",
"pyannote/speaker-diarization-community-1",
"pyannote.audio releases: Community-1 and exclusive diarization",
"PyTorch Audio issue: TorchAudio future / TorchCodec transition",
"TorchAudio installation compatibility notes",
"uv scripts guide",
"uv locking script dependencies",
"Community-1 launch post",
"TorchCodec README",
"pyannote/segmentation-3.0",
"Hugging Face access tokens docs"
],
"textContent": "Maybe that new issue is likely a compatibility problem on the `pyannote` side.\nI don’t have much personal experience with `pyannote` myself, but I have used it while investigating migration issues. It’s a library very version sensitive where the usage itself tends to change significantly with each version update.\n\nThis isn’t limited to `pyannote`, but **when updating libraries that are close to the backend, it’s best to proceed on the assumption that you’ll need to rewrite whole the model configurations and related execution code** slightly within your scripts:\n\n* * *\n\n# New errors after pinning pyannote/TorchAudio: causes and fixes\n\n## Short version\n\nYou made progress.\n\nThe original problem was:\n\n\n AttributeError: module 'torchaudio' has no attribute 'AudioMetaData'\n\n\nThat was the **TorchAudio 2.9+ compatibility problem**. Pinning back to the Torch 2.8 / TorchAudio 2.8 generation gets you past that layer.\n\nNow you have a different problem:\n\n\n TypeError: Pipeline.from_pretrained() got an unexpected keyword argument 'token'\n\n\nThis is not the same error. This one is a **pyannote API mismatch**.\n\nYour dependency resolver says:\n\n\n brouhaha==0.9.0 depends on pyannote-audio==3.3.0\n\n\nSo your environment is now effectively pinned to:\n\n\n pyannote.audio==3.3.0\n\n\nBut your code is calling pyannote like this:\n\n\n pipeline = Pipeline.from_pretrained(MODEL_ID, token=tokens[\"diarization\"])\n\n\nand it is loading:\n\n\n pyannote/speaker-diarization-community-1\n\n\nThat is the newer pyannote 4.x / Community-1 style. It does not match the `pyannote.audio==3.3.0` API that `brouhaha` forces.\n\nThe immediate fix is:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-3.1\"\n\n pipeline = Pipeline.from_pretrained(\n MODEL_ID,\n use_auth_token=tokens[\"diarization\"],\n )\n\n\nDo **not** use `token=` with `pyannote.audio==3.3.0`.\n\nDo **not** use `speaker-diarization-community-1` while you are on the `brouhaha` / pyannote 3.3 recovery path.\n\nUseful references:\n\n * pyannote.audio 3.3.0 docs on PyPI\n * pyannote/speaker-diarization-3.1 model card\n * pyannote/speaker-diarization-community-1 model card\n * pyannote.audio releases\n * TorchAudio 2.8 list_audio_backends deprecation docs\n * TorchAudio 2.8 deprecation overview\n * TorchCodec compatibility table\n * uv script locking docs\n\n\n\n* * *\n\n# What caused the first new error?\n\nYou got this resolver error:\n\n\n × No solution found when resolving script dependencies:\n ╰─▶ Because only brouhaha==0.9.0 is available and brouhaha==0.9.0 depends on pyannote-audio==3.3.0,\n we can conclude that all versions of brouhaha depend on pyannote-audio==3.3.0.\n And because you require pyannote-audio==3.4.0 and brouhaha, we can conclude that your\n requirements are unsatisfiable.\n\n\nThis means uv is doing the correct thing.\n\nYou asked for:\n\n\n pyannote-audio==3.4.0\n\n\nbut your local `brouhaha` package requires:\n\n\n pyannote-audio==3.3.0\n\n\nThose two cannot both be true.\n\nSo changing:\n\n\n pyannote-audio==3.4.0\n\n\nto:\n\n\n pyannote-audio==3.3.0\n\n\nwas a reasonable fix.\n\nBut that change has an important consequence:\n\n\n You are now on the pyannote 3.3 API.\n\n\nThat means the rest of the code must also use the pyannote 3.3 call style.\n\n* * *\n\n# What caused the second new error?\n\nYou then got:\n\n\n Loading diarization pipeline pyannote/speaker-diarization-community-1...\n Traceback (most recent call last):\n File \"/home/user/diarization/repos/scripts/diaritranscribe3.py\", line 621, in <module>\n main()\n File \"/home/user/diarization/repos/scripts/diaritranscribe3.py\", line 589, in main\n diarization = diarize_audio(\n ^^^^^^^^^^^^^^\n File \"/home/user/diarization/repos/scripts/diaritranscribe3.py\", line 208, in diarize_audio\n pipeline = Pipeline.from_pretrained(MODEL_ID, token=tokens[\"diarization\"])\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n TypeError: Pipeline.from_pretrained() got an unexpected keyword argument 'token'\n\n\nThe key line is:\n\n\n Pipeline.from_pretrained(MODEL_ID, token=tokens[\"diarization\"])\n\n\nThe `token=` keyword is the newer call style. It appears in current Community-1 examples.\n\nBut `pyannote.audio==3.3.0` expects the older keyword:\n\n\n use_auth_token=\n\n\nSo this:\n\n\n pipeline = Pipeline.from_pretrained(\n MODEL_ID,\n token=tokens[\"diarization\"],\n )\n\n\nshould become this:\n\n\n pipeline = Pipeline.from_pretrained(\n MODEL_ID,\n use_auth_token=tokens[\"diarization\"],\n )\n\n\nThat is the direct fix for the `unexpected keyword argument 'token'` error.\n\n* * *\n\n# The model ID is probably wrong for this recovery path too\n\nYour log says:\n\n\n Loading diarization pipeline pyannote/speaker-diarization-community-1...\n\n\nThat is another mismatch.\n\nFor `pyannote.audio==3.3.0`, use:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-3.1\"\n\n\nnot:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-community-1\"\n\n\nThe `speaker-diarization-community-1` pipeline belongs to the newer pyannote 4.x era. It is documented with `token=...`, `output.speaker_diarization`, and `output.exclusive_speaker_diarization`.\n\nThe pyannote 3.3 path is different. It uses `speaker-diarization-3.1`, `use_auth_token=...`, and the returned object is usually iterated with:\n\n\n for turn, _, speaker in diarization.itertracks(yield_label=True):\n ...\n\n\nReferences:\n\n * pyannote.audio 3.3.0 docs\n * pyannote/speaker-diarization-3.1\n * pyannote/speaker-diarization-community-1\n * pyannote.audio releases: Community-1 and exclusive diarization\n\n\n\n* * *\n\n# The TorchAudio warning is expected\n\nThis warning:\n\n\n /home/rodrigo/.cache/uv/environments-v2/diaritranscribe3-3f9949c47f20e532/lib/python3.12/site-packages/pyannote/audio/core/io.py:212: UserWarning: torchaudio._backend.list_audio_backends has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release.\n torchaudio.list_audio_backends()\n\n\nis not the current crash.\n\nIt means:\n\n\n pyannote.audio 3.3.0 is calling an old TorchAudio API.\n TorchAudio 2.8 still has that API, but warns that it will disappear in 2.9.\n\n\nThat warning is exactly why you should **not** upgrade TorchAudio to 2.9 in this recovery path.\n\nKeep:\n\n\n torch==2.8.0\n torchaudio==2.8.0\n\n\nTorchAudio 2.8 warns. TorchAudio 2.9 removes. For old pyannote code, a warning is better than a missing attribute crash.\n\nRelevant references:\n\n * TorchAudio 2.8 list_audio_backends deprecation docs\n * TorchAudio 2.8 deprecation overview\n * PyTorch Audio issue: TorchAudio future / TorchCodec transition\n\n\n\n* * *\n\n# Recommended current fix\n\n## Use the pyannote 3.3-compatible dependency set\n\nGiven your `brouhaha` constraint, use this dependency block:\n\n\n #!/usr/bin/env -S uv run --script\n # /// script\n # requires-python = \">=3.10,<3.14\"\n # dependencies = [\n # \"faster-whisper\",\n # \"numpy\",\n # \"pyannote.audio==3.3.0\",\n # \"scikit-learn\",\n # \"torch==2.8.0\",\n # \"torchaudio==2.8.0\",\n # \"torchcodec==0.7.*\",\n # \"omegaconf\",\n # \"brouhaha @ file:///home/user/diarization/repos/.venv/brouhaha-vad\",\n # ]\n # ///\n\n\nWhy:\n\nPackage | Reason\n---|---\n`pyannote.audio==3.3.0` | Required by your local `brouhaha==0.9.0` package.\n`torch==2.8.0` | Coherent with TorchAudio 2.8 and TorchCodec 0.7.\n`torchaudio==2.8.0` | Keeps deprecated APIs available instead of removed.\n`torchcodec==0.7.*` | TorchCodec’s compatibility table maps `0.7` to Torch `2.8`.\n`faster-whisper` | Keep it for transcription, but debug it separately from pyannote.\nNo manual `nvidia-*` packages | Avoid mixing CUDA generations while fixing pyannote import and model loading.\n\nUseful references:\n\n * TorchAudio installation compatibility notes\n * TorchCodec compatibility table\n * uv scripts guide\n\n\n\n* * *\n\n# Recommended code patch\n\nFind your current code around line 208:\n\n\n pipeline = Pipeline.from_pretrained(MODEL_ID, token=tokens[\"diarization\"])\n\n\nChange it to:\n\n\n pipeline = Pipeline.from_pretrained(\n MODEL_ID,\n use_auth_token=tokens[\"diarization\"],\n )\n\n\nAlso change the model ID.\n\nIf you currently have:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-community-1\"\n\n\nchange it to:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-3.1\"\n\n\nA compact pyannote 3.3-compatible function would look like:\n\n\n from pyannote.audio import Pipeline\n import torch\n\n MODEL_ID = \"pyannote/speaker-diarization-3.1\"\n\n def diarize_audio(audio_path, tokens):\n print(f\"Loading diarization pipeline {MODEL_ID}...\")\n\n pipeline = Pipeline.from_pretrained(\n MODEL_ID,\n use_auth_token=tokens[\"diarization\"],\n )\n\n if torch.cuda.is_available():\n pipeline.to(torch.device(\"cuda\"))\n\n diarization = pipeline(audio_path)\n\n return diarization\n\n\nThen, when reading the result:\n\n\n for turn, _, speaker in diarization.itertracks(yield_label=True):\n print(f\"{turn.start:.2f} {turn.end:.2f} {speaker}\")\n\n\nThis matches the pyannote 3.x style.\n\n* * *\n\n# Why it still happens after “reverting” the script\n\nThere are a few likely reasons.\n\n## 1. You changed the environment, not just the file\n\nEven if you revert part of `diaritranscribe3.py`, your dependency environment still contains:\n\n\n pyannote.audio==3.3.0\n\n\nbecause `brouhaha` requires it.\n\nSo `token=` will keep failing until the code matches pyannote 3.3.\n\nCheck the actual runtime version:\n\n\n from importlib.metadata import version\n\n print(\"pyannote.audio:\", version(\"pyannote.audio\"))\n\n\nExpected now:\n\n\n pyannote.audio: 3.3.0\n\n\nIf that is the version, use:\n\n\n use_auth_token=\n\n\nnot:\n\n\n token=\n\n\n* * *\n\n## 2. Your `MODEL_ID` may still point to Community-1\n\nSearch your script:\n\n\n grep -n \"speaker-diarization\" diaritranscribe3.py\n\n\nFor the recovery path, it should show:\n\n\n pyannote/speaker-diarization-3.1\n\n\nnot:\n\n\n pyannote/speaker-diarization-community-1\n\n\n* * *\n\n## 3. Your script may still contain `token=`\n\nSearch:\n\n\n grep -n \"token=\" diaritranscribe3.py\n\n\nFor the pyannote call, change:\n\n\n token=tokens[\"diarization\"]\n\n\nto:\n\n\n use_auth_token=tokens[\"diarization\"]\n\n\nDo not necessarily change every `token=` in the whole script. Other libraries may still use a `token` keyword. The specific problem is the pyannote 3.3 call to `Pipeline.from_pretrained`.\n\n* * *\n\n## 4. uv may be reusing a cached script environment\n\nUse refresh while testing:\n\n\n uv run --refresh --script diaritranscribe3.py\n\n\nThen inspect the dependency tree:\n\n\n uv tree --script diaritranscribe3.py\n\n\nYou want to see something close to:\n\n\n pyannote.audio==3.3.0\n torch==2.8.0\n torchaudio==2.8.0\n torchcodec==0.7.x\n\n\nOnce it works, lock it:\n\n\n uv lock --script diaritranscribe3.py\n\n\nReference:\n\n * uv locking script dependencies\n\n\n\n* * *\n\n# Two coherent paths from here\n\n## Path A — recommended now: stay with `brouhaha` and pyannote 3.3\n\nChoose this if your priority is to get the current script working.\n\nUse:\n\n\n pyannote.audio==3.3.0\n torch==2.8.0\n torchaudio==2.8.0\n torchcodec==0.7.*\n\n\nUse model:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-3.1\"\n\n\nUse auth keyword:\n\n\n use_auth_token=tokens[\"diarization\"]\n\n\nUse output iteration:\n\n\n for turn, _, speaker in diarization.itertracks(yield_label=True):\n ...\n\n\nThis is the low-risk recovery path because it respects the `brouhaha` dependency pin.\n\n* * *\n\n## Path B — later migration: use Community-1 and pyannote 4.x\n\nChoose this if you want the newer pyannote stack and are willing to deal with migration work.\n\nYou would need to remove or modify the `brouhaha` constraint first. Options:\n\n 1. Remove `brouhaha`.\n 2. Replace `brouhaha` with another VAD path.\n 3. Fork/edit your local `brouhaha` package so it does not require `pyannote-audio==3.3.0`.\n 4. Update `brouhaha`, if a newer compatible version exists in your local project.\n 5. Split the environment so `brouhaha` and modern pyannote are not forced into the same dependency graph.\n\n\n\nThen you can move toward:\n\n\n pipeline = Pipeline.from_pretrained(\n \"pyannote/speaker-diarization-community-1\",\n token=tokens[\"diarization\"],\n )\n\n\nand newer output handling:\n\n\n output = pipeline(audio_path)\n\n for turn, speaker in output.speaker_diarization:\n print(turn.start, turn.end, speaker)\n\n # If available and useful for transcript alignment:\n for turn, speaker in output.exclusive_speaker_diarization:\n print(turn.start, turn.end, speaker)\n\n\nBut treat this as a real migration. It may involve:\n\n * TorchCodec;\n * FFmpeg;\n * newer pyannote output objects;\n * new model access requirements;\n * possibly higher VRAM use;\n * different diarization output behavior;\n * changes to transcript/speaker alignment code.\n\n\n\nUseful references:\n\n * pyannote/speaker-diarization-community-1\n * Community-1 launch post\n * pyannote.audio releases\n * TorchCodec README\n\n\n\n* * *\n\n# Immediate diagnostic checklist\n\nRun these in order.\n\n## 1. Confirm versions\n\nAdd this temporarily near the top of the script:\n\n\n from importlib.metadata import version\n import torch\n import torchaudio\n\n print(\"pyannote.audio:\", version(\"pyannote.audio\"))\n print(\"torch:\", torch.__version__)\n print(\"torchaudio:\", torchaudio.__version__)\n print(\"torchcodec:\", version(\"torchcodec\"))\n print(\"AudioMetaData exists:\", hasattr(torchaudio, \"AudioMetaData\"))\n\n\nExpected for the recovery path:\n\n\n pyannote.audio: 3.3.0\n torch: 2.8.0...\n torchaudio: 2.8.0...\n torchcodec: 0.7...\n AudioMetaData exists: True\n\n\nIf `torchaudio` is `2.9.x`, you are back in the danger zone.\n\n* * *\n\n## 2. Confirm model ID\n\nFor Path A, use:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-3.1\"\n\n\nnot:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-community-1\"\n\n\n* * *\n\n## 3. Confirm auth keyword\n\nFor Path A, use:\n\n\n pipeline = Pipeline.from_pretrained(\n MODEL_ID,\n use_auth_token=tokens[\"diarization\"],\n )\n\n\nnot:\n\n\n pipeline = Pipeline.from_pretrained(\n MODEL_ID,\n token=tokens[\"diarization\"],\n )\n\n\n* * *\n\n## 4. Confirm access to gated models\n\nFor `speaker-diarization-3.1`, make sure the Hugging Face account behind your token has accepted the relevant model conditions.\n\nCommon symptoms of missing access are different from your current error. They look more like:\n\n\n 401 Unauthorized\n 403 Forbidden\n Repository not found\n Could not download pipeline\n\n\nUseful links:\n\n * pyannote/speaker-diarization-3.1\n * pyannote/segmentation-3.0\n * Hugging Face access tokens docs\n\n\n\n* * *\n\n## 5. Refresh uv while testing\n\n\n uv run --refresh --script diaritranscribe3.py\n\n\nThen inspect:\n\n\n uv tree --script diaritranscribe3.py\n\n\nThen lock after success:\n\n\n uv lock --script diaritranscribe3.py\n\n\n* * *\n\n# What not to do right now\n\nDo not upgrade TorchAudio to silence the warning.\n\nThis warning:\n\n\n torchaudio._backend.list_audio_backends has been deprecated\n\n\ndoes not mean:\n\n\n upgrade torchaudio\n\n\nIn this case it means:\n\n\n you are using legacy pyannote code that still works on TorchAudio 2.8, but will break on TorchAudio 2.9\n\n\nSo for the recovery path, keep:\n\n\n torchaudio==2.8.0\n\n\nDo not switch back to:\n\n\n token=tokens[\"diarization\"]\n\n\nunless you migrate to a pyannote version that supports it.\n\nDo not use:\n\n\n pyannote/speaker-diarization-community-1\n\n\nunless you deliberately move to the newer pyannote 4.x path.\n\nDo not reintroduce mixed CUDA packages while debugging this pyannote problem. CUDA can be debugged after pyannote loads.\n\n* * *\n\n# Final recommended state for your current script\n\nUse this dependency block:\n\n\n #!/usr/bin/env -S uv run --script\n # /// script\n # requires-python = \">=3.10,<3.14\"\n # dependencies = [\n # \"faster-whisper\",\n # \"numpy\",\n # \"pyannote.audio==3.3.0\",\n # \"scikit-learn\",\n # \"torch==2.8.0\",\n # \"torchaudio==2.8.0\",\n # \"torchcodec==0.7.*\",\n # \"omegaconf\",\n # \"brouhaha @ file:///home/user/diarization/repos/.venv/brouhaha-vad\",\n # ]\n # ///\n\n\nUse this model ID:\n\n\n MODEL_ID = \"pyannote/speaker-diarization-3.1\"\n\n\nUse this load call:\n\n\n pipeline = Pipeline.from_pretrained(\n MODEL_ID,\n use_auth_token=tokens[\"diarization\"],\n )\n\n\nUse this output style:\n\n\n for turn, _, speaker in diarization.itertracks(yield_label=True):\n print(f\"{turn.start:.2f} {turn.end:.2f} {speaker}\")\n\n\nThen run:\n\n\n uv run --refresh --script diaritranscribe3.py\n\n\nInspect:\n\n\n uv tree --script diaritranscribe3.py\n\n\nLock:\n\n\n uv lock --script diaritranscribe3.py\n\n\n* * *\n\n# Bottom line\n\nYou now have two separate compatibility boundaries:\n\n 1. **TorchAudio boundary**\nOld pyannote code needs TorchAudio 2.8, not 2.9+.\nThe deprecation warning is expected. Keep `torchaudio==2.8.0`.\n\n 2. **pyannote API boundary**\n`brouhaha` forces `pyannote.audio==3.3.0`.\nThat means use `speaker-diarization-3.1` and `use_auth_token=...`, not `speaker-diarization-community-1` and `token=...`.\n\n\n\n\nSo the practical fix is:\n\n\n pyannote.audio==3.3.0\n torch==2.8.0\n torchaudio==2.8.0\n torchcodec==0.7.*\n MODEL_ID = \"pyannote/speaker-diarization-3.1\"\n Pipeline.from_pretrained(..., use_auth_token=...)\n\n\nTreat `community-1` + `token=...` as a later migration path after you remove, update, or fork the `brouhaha` dependency that pins pyannote to 3.3.0.",
"title": "Module 'torchaudio' has no attribute 'AudioMetaData'"
}