
Module 'torchaudio' has no attribute 'AudioMetaData'

Hugging Face Forums [Unofficial] April 29, 2026

That new issue is most likely a compatibility problem on the pyannote side. I don’t have much hands-on experience with pyannote, but I have used it while investigating migration issues. It is a very version-sensitive library, and its usage tends to change significantly with each release.

This isn’t limited to pyannote: when updating libraries that sit close to the backend, it’s best to assume you’ll need to rewrite the model configuration and the related execution code in your scripts, at least slightly:

New errors after pinning pyannote/TorchAudio: causes and fixes

Short version

You made progress.

The original problem was:

AttributeError: module 'torchaudio' has no attribute 'AudioMetaData'

That was the TorchAudio 2.9+ compatibility problem. Pinning back to the Torch 2.8 / TorchAudio 2.8 generation gets you past that layer.

Now you have a different problem:

TypeError: Pipeline.from_pretrained() got an unexpected keyword argument 'token'

This is not the same error. This one is a pyannote API mismatch.

Your dependency resolver says:

brouhaha==0.9.0 depends on pyannote-audio==3.3.0

So your environment is now effectively pinned to:

pyannote.audio==3.3.0

But your code is calling pyannote like this:

pipeline = Pipeline.from_pretrained(MODEL_ID, token=tokens["diarization"])

and it is loading:

pyannote/speaker-diarization-community-1

That is the newer pyannote 4.x / Community-1 style. It does not match the pyannote.audio==3.3.0 API that brouhaha forces.

The immediate fix is:

MODEL_ID = "pyannote/speaker-diarization-3.1"

pipeline = Pipeline.from_pretrained(
    MODEL_ID,
    use_auth_token=tokens["diarization"],
)

Do not use token= with pyannote.audio==3.3.0.

Do not use speaker-diarization-community-1 while you are on the brouhaha / pyannote 3.3 recovery path.

Useful references:

pyannote.audio 3.3.0 docs on PyPI
pyannote/speaker-diarization-3.1 model card
pyannote/speaker-diarization-community-1 model card
pyannote.audio releases
TorchAudio 2.8 list_audio_backends deprecation docs
TorchAudio 2.8 deprecation overview
TorchCodec compatibility table
uv script locking docs

What caused the first new error?

You got this resolver error:

× No solution found when resolving script dependencies:
╰─▶ Because only brouhaha==0.9.0 is available and brouhaha==0.9.0 depends on pyannote-audio==3.3.0,
    we can conclude that all versions of brouhaha depend on pyannote-audio==3.3.0.
    And because you require pyannote-audio==3.4.0 and brouhaha, we can conclude that your
    requirements are unsatisfiable.

This means uv is doing the correct thing.

You asked for:

pyannote-audio==3.4.0

but your local brouhaha package requires:

pyannote-audio==3.3.0

Those two cannot both be true.

So changing:

pyannote-audio==3.4.0

to:

pyannote-audio==3.3.0

was a reasonable fix.

But that change has an important consequence:

You are now on the pyannote 3.3 API.

That means the rest of the code must also use the pyannote 3.3 call style.
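
If you want to confirm the pin from inside Python rather than from the resolver message, a minimal sketch like the following may help. It reads the requirements that an installed distribution declares, using only the standard library. The helper names `matching_requirements` and `declared_pins` are hypothetical, and the sketch assumes brouhaha is installed under the distribution name `brouhaha`.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def _normalize(name):
    # Treat runs of "-", "_" and "." as equivalent, so that
    # "pyannote.audio" and "pyannote-audio" compare equal.
    return re.sub(r"[-_.]+", "-", name).lower()


def _requirement_name(requirement):
    # Keep only the project name at the start of a requirement string.
    return re.split(r"[\s;<>=!~(\[@]", requirement.strip(), maxsplit=1)[0]


def matching_requirements(declared, dependency):
    """Return the requirement strings in `declared` that name `dependency`."""
    wanted = _normalize(dependency)
    return [r for r in declared if _normalize(_requirement_name(r)) == wanted]


def declared_pins(distribution, dependency):
    """List what an installed `distribution` declares about `dependency`."""
    try:
        declared = requires(distribution) or []
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    return matching_requirements(declared, dependency)


if __name__ == "__main__":
    # Assumption: the local package is installed as "brouhaha".
    print(declared_pins("brouhaha", "pyannote.audio"))
```

An empty list means either that the distribution is not installed in the environment you ran this in, or that it declares nothing about that dependency.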

What caused the second new error?

You then got:

Loading diarization pipeline pyannote/speaker-diarization-community-1...
Traceback (most recent call last):
  File "/home/user/diarization/repos/scripts/diaritranscribe3.py", line 621, in <module>
    main()
  File "/home/user/diarization/repos/scripts/diaritranscribe3.py", line 589, in main
    diarization = diarize_audio(
                  ^^^^^^^^^^^^^^
  File "/home/user/diarization/repos/scripts/diaritranscribe3.py", line 208, in diarize_audio
    pipeline = Pipeline.from_pretrained(MODEL_ID, token=tokens["diarization"])
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Pipeline.from_pretrained() got an unexpected keyword argument 'token'

The key line is:

Pipeline.from_pretrained(MODEL_ID, token=tokens["diarization"])

The token= keyword is the newer call style. It appears in current Community-1 examples.

But pyannote.audio==3.3.0 expects the older keyword:

use_auth_token=

So this:

pipeline = Pipeline.from_pretrained(
    MODEL_ID,
    token=tokens["diarization"],
)

should become this:

pipeline = Pipeline.from_pretrained(
    MODEL_ID,
    use_auth_token=tokens["diarization"],
)

That is the direct fix for the unexpected keyword argument 'token' error.
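
If the script has to survive both call styles for a while, one option is to look at the loader's signature instead of hard-coding the keyword. This is only a sketch: `pick_auth_keyword` and the two stand-in loaders are hypothetical, and it assumes the real `Pipeline.from_pretrained` names the keyword explicitly in its signature rather than hiding it behind `**kwargs`.

```python
import inspect


def pick_auth_keyword(loader):
    """Return the auth keyword `loader` accepts: 'token' or 'use_auth_token'."""
    params = inspect.signature(loader).parameters
    for name in ("token", "use_auth_token"):
        if name in params:
            return name
    raise TypeError("loader accepts neither 'token' nor 'use_auth_token'")


# Stand-ins for the two call styles, for illustration only.
def old_style_loader(checkpoint, use_auth_token=None):
    return ("old", checkpoint, use_auth_token)


def new_style_loader(checkpoint, token=None):
    return ("new", checkpoint, token)


if __name__ == "__main__":
    for loader in (old_style_loader, new_style_loader):
        keyword = pick_auth_keyword(loader)
        # Build the keyword argument dynamically and call the loader with it.
        result = loader("some/model-id", **{keyword: "hf_xxx"})
        print(loader.__name__, "->", keyword, result)
```

In the real script the call would then be `Pipeline.from_pretrained(MODEL_ID, **{pick_auth_keyword(Pipeline.from_pretrained): tokens["diarization"]})`. For the recovery path the plain `use_auth_token=` edit is simpler and is still the recommended change.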

The model ID is probably wrong for this recovery path too

Your log says:

Loading diarization pipeline pyannote/speaker-diarization-community-1...

That is another mismatch.

For pyannote.audio==3.3.0, use:

MODEL_ID = "pyannote/speaker-diarization-3.1"

not:

MODEL_ID = "pyannote/speaker-diarization-community-1"

The speaker-diarization-community-1 pipeline belongs to the newer pyannote 4.x era. It is documented with token=..., output.speaker_diarization, and output.exclusive_speaker_diarization.

The pyannote 3.3 path is different. It uses speaker-diarization-3.1, use_auth_token=..., and the returned object is usually iterated with:

for turn, _, speaker in diarization.itertracks(yield_label=True):
    ...

References:

pyannote.audio 3.3.0 docs
pyannote/speaker-diarization-3.1
pyannote/speaker-diarization-community-1
pyannote.audio releases: Community-1 and exclusive diarization

The TorchAudio warning is expected

This warning:

/home/rodrigo/.cache/uv/environments-v2/diaritranscribe3-3f9949c47f20e532/lib/python3.12/site-packages/pyannote/audio/core/io.py:212: UserWarning: torchaudio._backend.list_audio_backends has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release.
  torchaudio.list_audio_backends()

is not the current crash.

It means:

pyannote.audio 3.3.0 is calling an old TorchAudio API.
TorchAudio 2.8 still has that API, but warns that it will disappear in 2.9.

That warning is exactly why you should not upgrade TorchAudio to 2.9 in this recovery path.

Keep:

torch==2.8.0
torchaudio==2.8.0

TorchAudio 2.8 warns. TorchAudio 2.9 removes. For old pyannote code, a warning is better than a missing attribute crash.
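
If the warning clutters your logs, you can hide that one message without touching the TorchAudio version. The sketch below uses the standard `warnings` module and matches only the start of the message quoted above; the function name is hypothetical. It assumes the filter is installed before the pyannote code that triggers the warning runs, for example near the top of the script before importing `pyannote.audio`.

```python
import re
import warnings

# Start of the warning text quoted above.
DEPRECATION_PREFIX = "torchaudio._backend.list_audio_backends has been deprecated"


def silence_list_audio_backends_warning():
    """Ignore only the list_audio_backends deprecation UserWarning."""
    warnings.filterwarnings(
        "ignore",
        # `message` is a regex matched against the start of the warning text.
        message=re.escape(DEPRECATION_PREFIX),
        category=UserWarning,
    )


if __name__ == "__main__":
    silence_list_audio_backends_warning()
    # Other warnings are unaffected and will still be shown.
    warnings.warn("an unrelated warning", UserWarning)
```

This hides the symptom only. The underlying constraint (old pyannote code needs TorchAudio 2.8) stays the same.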

Relevant references:

TorchAudio 2.8 list_audio_backends deprecation docs
TorchAudio 2.8 deprecation overview
PyTorch Audio issue: TorchAudio future / TorchCodec transition

Recommended current fix

Use the pyannote 3.3-compatible dependency set

Given your brouhaha constraint, use this dependency block:

#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.10,<3.14"
# dependencies = [
#   "faster-whisper",
#   "numpy",
#   "pyannote.audio==3.3.0",
#   "scikit-learn",
#   "torch==2.8.0",
#   "torchaudio==2.8.0",
#   "torchcodec==0.7.*",
#   "omegaconf",
#   "brouhaha @ file:///home/user/diarization/repos/.venv/brouhaha-vad",
# ]
# ///

Why:

| Package | Reason |
| --- | --- |
| `pyannote.audio==3.3.0` | Required by your local `brouhaha==0.9.0` package. |
| `torch==2.8.0` | Coherent with TorchAudio 2.8 and TorchCodec 0.7. |
| `torchaudio==2.8.0` | Keeps deprecated APIs available instead of removed. |
| `torchcodec==0.7.*` | TorchCodec’s compatibility table maps `0.7` to Torch `2.8`. |
| `faster-whisper` | Keep it for transcription, but debug it separately from pyannote. |
| No manual `nvidia-*` packages | Avoid mixing CUDA generations while fixing pyannote import and model loading. |

Useful references:

TorchAudio installation compatibility notes
TorchCodec compatibility table
uv scripts guide

Recommended code patch

Find your current code around line 208:

pipeline = Pipeline.from_pretrained(MODEL_ID, token=tokens["diarization"])

Change it to:

pipeline = Pipeline.from_pretrained(
    MODEL_ID,
    use_auth_token=tokens["diarization"],
)

Also change the model ID.

If you currently have:

MODEL_ID = "pyannote/speaker-diarization-community-1"

change it to:

MODEL_ID = "pyannote/speaker-diarization-3.1"

A compact pyannote 3.3-compatible function would look like:

from pyannote.audio import Pipeline
import torch

MODEL_ID = "pyannote/speaker-diarization-3.1"

def diarize_audio(audio_path, tokens):
    print(f"Loading diarization pipeline {MODEL_ID}...")

    pipeline = Pipeline.from_pretrained(
        MODEL_ID,
        use_auth_token=tokens["diarization"],
    )

    if torch.cuda.is_available():
        pipeline.to(torch.device("cuda"))

    diarization = pipeline(audio_path)

    return diarization

Then, when reading the result:

for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.2f} {turn.end:.2f} {speaker}")

This matches the pyannote 3.x style.
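
For transcript alignment it is often easier to work with plain records than with the pyannote object itself. A minimal sketch, assuming only that the result exposes `itertracks(yield_label=True)` as in the loop above; `SpeakerTurn` and `turns_from_annotation` are hypothetical names, not part of pyannote:

```python
from typing import NamedTuple


class SpeakerTurn(NamedTuple):
    start: float
    end: float
    speaker: str


def turns_from_annotation(diarization):
    """Copy a 3.x-style diarization result into a list of plain tuples."""
    return [
        SpeakerTurn(float(turn.start), float(turn.end), str(speaker))
        for turn, _, speaker in diarization.itertracks(yield_label=True)
    ]


def format_turn(turn):
    # Same layout as the print statement above.
    return f"{turn.start:.2f} {turn.end:.2f} {turn.speaker}"
```

The rest of the script can then sort, merge, or filter `SpeakerTurn` values without importing pyannote, which also makes that code easier to keep unchanged if you migrate later.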

Why it still happens after “reverting” the script

There are a few likely reasons.

1. You changed the environment, not just the file

Even if you revert part of diaritranscribe3.py, your dependency environment still contains:

pyannote.audio==3.3.0

because brouhaha requires it.

So token= will keep failing until the code matches pyannote 3.3.

Check the actual runtime version:

from importlib.metadata import version

print("pyannote.audio:", version("pyannote.audio"))

Expected now:

pyannote.audio: 3.3.0

If that is the version, use:

use_auth_token=

not:

token=

2. Your `MODEL_ID` may still point to Community-1

Search your script:

grep -n "speaker-diarization" diaritranscribe3.py

For the recovery path, it should show:

pyannote/speaker-diarization-3.1

not:

pyannote/speaker-diarization-community-1

3. Your script may still contain `token=`

Search:

grep -n "token=" diaritranscribe3.py

For the pyannote call, change:

token=tokens["diarization"]

to:

use_auth_token=tokens["diarization"]

Do not blindly change every token= in the whole script. Other libraries may still use a token keyword. The specific problem is the pyannote 3.3 call to Pipeline.from_pretrained.

4. uv may be reusing a cached script environment

Use refresh while testing:

uv run --refresh --script diaritranscribe3.py

Then inspect the dependency tree:

uv tree --script diaritranscribe3.py

You want to see something close to:

pyannote.audio==3.3.0
torch==2.8.0
torchaudio==2.8.0
torchcodec==0.7.x

Once it works, lock it:

uv lock --script diaritranscribe3.py

Reference:

uv locking script dependencies

Two coherent paths from here

Path A — recommended now: stay with `brouhaha` and pyannote 3.3

Choose this if your priority is to get the current script working.

Use:

pyannote.audio==3.3.0
torch==2.8.0
torchaudio==2.8.0
torchcodec==0.7.*

Use model:

MODEL_ID = "pyannote/speaker-diarization-3.1"

Use auth keyword:

use_auth_token=tokens["diarization"]

Use output iteration:

for turn, _, speaker in diarization.itertracks(yield_label=True):
    ...

This is the low-risk recovery path because it respects the brouhaha dependency pin.

Path B — later migration: use Community-1 and pyannote 4.x

Choose this if you want the newer pyannote stack and are willing to deal with migration work.

You would need to remove or modify the brouhaha constraint first. Options:

Remove brouhaha.
Replace brouhaha with another VAD path.
Fork/edit your local brouhaha package so it does not require pyannote-audio==3.3.0.
Update brouhaha, if a newer compatible version exists in your local project.
Split the environment so brouhaha and modern pyannote are not forced into the same dependency graph.

Then you can move toward:

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-community-1",
    token=tokens["diarization"],
)

and newer output handling:

output = pipeline(audio_path)

for turn, speaker in output.speaker_diarization:
    print(turn.start, turn.end, speaker)

# If available and useful for transcript alignment:
for turn, speaker in output.exclusive_speaker_diarization:
    print(turn.start, turn.end, speaker)

But treat this as a real migration. It may involve:

TorchCodec;
FFmpeg;
newer pyannote output objects;
new model access requirements;
possibly higher VRAM use;
different diarization output behavior;
changes to transcript/speaker alignment code.

Useful references:

pyannote/speaker-diarization-community-1
Community-1 launch post
pyannote.audio releases
TorchCodec README
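
If you expect to migrate later, it may help to hide the output difference behind one small function now, so the alignment code does not care which path produced the result. This is a sketch under the assumptions stated in this post: the newer result has a `speaker_diarization` attribute that yields `(turn, speaker)` pairs, and the older result has `itertracks(yield_label=True)`. The name `iter_turns` is hypothetical.

```python
def iter_turns(result):
    """Yield (start, end, speaker) from either result style."""
    annotation = getattr(result, "speaker_diarization", None)
    if annotation is not None:
        # Newer style: iterate (turn, speaker) pairs.
        for turn, speaker in annotation:
            yield turn.start, turn.end, speaker
    else:
        # Older 3.x style: iterate (turn, track, speaker) triples.
        for turn, _, speaker in result.itertracks(yield_label=True):
            yield turn.start, turn.end, speaker
```

Usage is the same on both paths: `for start, end, speaker in iter_turns(pipeline(audio_path)): ...`.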

Immediate diagnostic checklist

Run these in order.

1. Confirm versions

Add this temporarily near the top of the script:

from importlib.metadata import version
import torch
import torchaudio

print("pyannote.audio:", version("pyannote.audio"))
print("torch:", torch.__version__)
print("torchaudio:", torchaudio.__version__)
print("torchcodec:", version("torchcodec"))
print("AudioMetaData exists:", hasattr(torchaudio, "AudioMetaData"))

Expected for the recovery path:

pyannote.audio: 3.3.0
torch: 2.8.0...
torchaudio: 2.8.0...
torchcodec: 0.7...
AudioMetaData exists: True

If torchaudio is 2.9.x, you are back in the danger zone.
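
To turn that comparison into a hard check rather than something you read by eye, a sketch like this could run at startup. The expected prefixes are taken from the recovery-path pins above; `EXPECTED_PREFIXES` and `find_mismatches` are hypothetical names. Prefix matching is used so that local build suffixes after the version number still pass.

```python
from importlib.metadata import PackageNotFoundError, version

# Version prefixes for the recovery path described above.
EXPECTED_PREFIXES = {
    "pyannote.audio": "3.3.0",
    "torch": "2.8.0",
    "torchaudio": "2.8.0",
    "torchcodec": "0.7.",
}


def find_mismatches(expected, lookup=version):
    """Return {package: installed_version_or_None} for every pin that fails."""
    problems = {}
    for name, prefix in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # not installed at all
        if installed is None or not installed.startswith(prefix):
            problems[name] = installed
    return problems


if __name__ == "__main__":
    mismatches = find_mismatches(EXPECTED_PREFIXES)
    if mismatches:
        raise SystemExit(f"Unexpected versions for the recovery path: {mismatches}")
    print("All pinned versions match the recovery path.")
```

The `lookup` parameter exists only so the check can be exercised with a fake version table; in the script you would call it with the default.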

2. Confirm model ID

For Path A, use:

MODEL_ID = "pyannote/speaker-diarization-3.1"

not:

MODEL_ID = "pyannote/speaker-diarization-community-1"

3. Confirm auth keyword

For Path A, use:

pipeline = Pipeline.from_pretrained(
    MODEL_ID,
    use_auth_token=tokens["diarization"],
)

not:

pipeline = Pipeline.from_pretrained(
    MODEL_ID,
    token=tokens["diarization"],
)

4. Confirm access to gated models

For speaker-diarization-3.1, make sure the Hugging Face account behind your token has accepted the relevant model conditions.

Common symptoms of missing access are different from your current error. They look more like:

401 Unauthorized
403 Forbidden
Repository not found
Could not download pipeline

Useful links:

pyannote/speaker-diarization-3.1
pyannote/segmentation-3.0
Hugging Face access tokens docs
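
One thing that can make access problems confusing: as far as I know, pyannote 3.x may print the "Could not download pipeline" message and return None instead of raising, so the crash shows up later as an unrelated AttributeError. Treat that behavior as an assumption and check it against your installed version. A small guard like this (the name `require_pipeline` is hypothetical) makes the failure explicit at the load call:

```python
def require_pipeline(pipeline, model_id):
    """Fail early, with a readable message, if loading produced no pipeline."""
    if pipeline is None:
        raise RuntimeError(
            f"Could not load {model_id!r}. Check that the Hugging Face account "
            "behind your token has accepted the model conditions, and that "
            "the token itself is valid."
        )
    return pipeline
```

In the script this would wrap the load call: `pipeline = require_pipeline(Pipeline.from_pretrained(MODEL_ID, use_auth_token=tokens["diarization"]), MODEL_ID)`.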

5. Refresh uv while testing

uv run --refresh --script diaritranscribe3.py

Then inspect:

uv tree --script diaritranscribe3.py

Then lock after success:

uv lock --script diaritranscribe3.py

What not to do right now

Do not upgrade TorchAudio to silence the warning.

This warning:

torchaudio._backend.list_audio_backends has been deprecated

does not mean:

upgrade torchaudio

In this case it means:

you are using legacy pyannote code that still works on TorchAudio 2.8, but will break on TorchAudio 2.9

So for the recovery path, keep:

torchaudio==2.8.0

Do not switch back to:

token=tokens["diarization"]

unless you migrate to a pyannote version that supports it.

Do not use:

pyannote/speaker-diarization-community-1

unless you deliberately move to the newer pyannote 4.x path.

Do not reintroduce mixed CUDA packages while debugging this pyannote problem. CUDA can be debugged after pyannote loads.

Final recommended state for your current script

Use this dependency block:

#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.10,<3.14"
# dependencies = [
#   "faster-whisper",
#   "numpy",
#   "pyannote.audio==3.3.0",
#   "scikit-learn",
#   "torch==2.8.0",
#   "torchaudio==2.8.0",
#   "torchcodec==0.7.*",
#   "omegaconf",
#   "brouhaha @ file:///home/user/diarization/repos/.venv/brouhaha-vad",
# ]
# ///

Use this model ID:

MODEL_ID = "pyannote/speaker-diarization-3.1"

Use this load call:

pipeline = Pipeline.from_pretrained(
    MODEL_ID,
    use_auth_token=tokens["diarization"],
)

Use this output style:

for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.2f} {turn.end:.2f} {speaker}")

Then run:

uv run --refresh --script diaritranscribe3.py

Inspect:

uv tree --script diaritranscribe3.py

Lock:

uv lock --script diaritranscribe3.py

Bottom line

You now have two separate compatibility boundaries:

TorchAudio boundary: Old pyannote code needs TorchAudio 2.8, not 2.9+. The deprecation warning is expected. Keep torchaudio==2.8.0.
pyannote API boundary: brouhaha forces pyannote.audio==3.3.0. That means use speaker-diarization-3.1 and use_auth_token=..., not speaker-diarization-community-1 and token=....

So the practical fix is:

pyannote.audio==3.3.0
torch==2.8.0
torchaudio==2.8.0
torchcodec==0.7.*
MODEL_ID = "pyannote/speaker-diarization-3.1"
Pipeline.from_pretrained(..., use_auth_token=...)

Treat community-1 + token=... as a later migration path after you remove, update, or fork the brouhaha dependency that pins pyannote to 3.3.0.
