Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicrxxeub5mn3mk5noukv7xatulah5carcjiulyyu4idigjxgjwtn4",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mgsjjgy6qok2"
  },
  "path": "/t/valueerror-loading-helsinki-nlp-tokenizers/174192#post_1",
  "publishedAt": "2026-03-11T15:43:55.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Python 3.12.3\ntorch==2.10.0\ntransformers=5.3.0\n\nMy aim is to load the tokenizer(s) for Helsinki-NLP/opus-mt-ru-en and Helsinki-NLP/opus-mt-zh-en, which I’ve been able to do for a couple years until I upgraded to Ubuntu 24, hence Python 3.12.\n\nDo I need to downgrade to an older version of transformers?\n\nRunning with the example code from the Model Card I get the exception that is frustrating me:\n\n> # Load model directly\n>\n> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM\n>  tokenizer = AutoTokenizer.from_pretrained(“Helsinki-NLP/opus-mt-ru-en”)\n>  model = AutoModelForSeq2SeqLM.from_pretrained(“Helsinki-NLP/opus-mt-ru-en”)\n>  Traceback (most recent call last):\n>  File “/usr/local/lib/python3.12/dist-packages/IPython/core/interactiveshell.py”, line 3747, in run_code\n>  exec(code_obj, self.user_global_ns, self.user_ns)\n>  File “”, line 4, in\n>  tokenizer = AutoTokenizer.from_pretrained(“Helsinki-NLP/opus-mt-ru-en”)\n>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n>  File “/usr/local/lib/python3.12/dist-packages/transformers/models/auto/tokenization_auto.py”, line 789, in from_pretrained\n>  raise ValueError(\n>  ValueError: Unrecognized configuration class <class ‘transformers.models.marian.configuration_marian.MarianConfig’> to build an AutoTokenizer.\n>  Model type should be one of Aimv2Config, AlbertConfig, AlignConfig, AudioFlamingo3Config, AyaVisionConfig, BarkConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, BridgeTowerConfig, BrosConfig, CamembertConfig, CanineConfig, ChineseCLIPConfig, ClapConfig, CLIPConfig, CLIPSegConfig, ClvpConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, ColQwen2Config, ConvBertConfig, CpmAntConfig, CTRLConfig, Data2VecAudioConfig, Data2VecTextConfig, DbrxConfig, DebertaConfig, DebertaV2Config, DeepseekVLConfig, DeepseekVLHybridConfig, DiaConfig, DistilBertConfig, DPRConfig, ElectraConfig, Emu3Config, ErnieConfig, EsmConfig, FalconMambaConfig, FastSpeech2ConformerConfig, FlaubertConfig, FlavaConfig, FlexOlmoConfig, Florence2Config, FNetConfig, FSMTConfig, FunnelConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, Gemma3nConfig, Gemma3nTextConfig, GitConfig, GlmConfig, Glm4Config, Glm4MoeConfig, Glm4MoeLiteConfig, Glm4vConfig, Glm4vMoeConfig, GlmImageConfig, GlmAsrConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeHybridConfig, GraniteMoeSharedConfig, GroundingDinoConfig, GroupViTConfig, HubertConfig, IBertConfig, IdeficsConfig, Idefics2Config, InstructBlipConfig, InstructBlipVideoConfig, InternVLConfig, Jais2Config, JambaConfig, JanusConfig, Kosmos2Config, LasrCTCConfig, LasrEncoderConfig, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LayoutXLMConfig, LEDConfig, LightOnOcrConfig, LiltConfig, LlavaConfig, LlavaNextConfig, LongformerConfig, LukeConfig, LxmertConfig, M2M100Config, MambaConfig, Mamba2Config, MarianConfig, MarkupLMConfig, MBartConfig, MegatronBertConfig, MetaClip2Config, MgpstrConfig, MinistralConfig, Ministral3Config, MistralConfig, Mistral3Config, MixtralConfig, MMGroundingDinoConfig, MobileBertConfig, MPNetConfig, MptConfig, MraConfig, MT5Config, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NllbMoeConfig, VisionEncoderDecoderConfig, NystromformerConfig, OlmoConfig, Olmo2Config, Olmo3Config, OlmoHybridConfig, OlmoeConfig, OmDetTurboConfig, OneFormerConfig, OpenAIGPTConfig, OPTConfig, Ovis2Config, Owlv2Config, OwlViTConfig, PegasusConfig, PegasusXConfig, PerceiverConfig, PhiConfig, Phi3Config, Pix2StructConfig, PixtralVisionConfig, PLBartConfig, ProphetNetConfig, Qwen2Config, Qwen2_5OmniConfig, Qwen2_5_VLConfig, Qwen2AudioConfig, Qwen2MoeConfig, Qwen2VLConfig, Qwen3Config, Qwen3_5Config, Qwen3_5MoeConfig, Qwen3MoeConfig, Qwen3NextConfig, Qwen3OmniMoeConfig, Qwen3VLConfig, Qwen3VLMoeConfig, RagConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Sam3Config, Sam3VideoConfig, SeamlessM4TConfig, SeamlessM4Tv2Config, ShieldGemma2Config, SiglipConfig, Siglip2Config, Speech2TextConfig, SpeechT5Config, SplinterConfig, SqueezeBertConfig, StableLmConfig, Starcoder2Config, SwitchTransformersConfig, T5Config, T5GemmaConfig, TapasConfig, TrOCRConfig, TvpConfig, UdopConfig, UMT5Config, UniSpeechConfig, UniSpeechSatConfig, ViltConfig, VipLlavaConfig, VisualBertConfig, VitsConfig, VoxtralConfig, VoxtralRealtimeConfig, Wav2Vec2Config, Wav2Vec2BertConfig, Wav2Vec2ConformerConfig, WhisperConfig, XCLIPConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, xLSTMConfig, XmodConfig, YosoConfig.",
  "title": "ValueError loading Helsinki-NLP tokenizers"
}