External Publication

The BPE pre-tokenizer was not recognized!

Hugging Face Forums [Unofficial] May 2, 2026

I’d first check the tokenizer files tbh. I don’t think upgrading transformers is the main thing here.

From the traceback, the converter already reaches the vocab/tokenizer part, but llama.cpp does not recognize the pre-tokenizer config from your tokenizer.json.
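If it helps, here is a rough sketch for looking at that part of tokenizer.json directly. It assumes the file is in the usual Hugging Face tokenizers layout, with top-level "pre_tokenizer" and "normalizer" keys. The function names are made up for illustration. It does not reproduce the converter's own check; it only shows whether the two folders declare the same pre-tokenizer.

```python
import json
from pathlib import Path


def read_pre_tokenizer(model_dir):
    """Return the "pre_tokenizer" and "normalizer" sections of tokenizer.json.

    Assumes the Hugging Face tokenizers JSON layout; a missing key comes back as None.
    """
    path = Path(model_dir) / "tokenizer.json"
    data = json.loads(path.read_text(encoding="utf-8"))
    return {key: data.get(key) for key in ("pre_tokenizer", "normalizer")}


def same_pre_tokenizer(base_dir, merged_dir):
    """True if both folders declare identical pre-tokenizer and normalizer sections."""
    return read_pre_tokenizer(base_dir) == read_pre_tokenizer(merged_dir)
```

Call `same_pre_tokenizer("path/to/base", "path/to/merged")` with your own folders (both paths here are placeholders). If it returns False, print both results of `read_pre_tokenizer` and compare them by eye.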

Can you try converting the original base model with the same llama.cpp commit? If the base model works but your fine-tuned/merged folder fails, then probably something changed in the tokenizer files.

I’d compare tokenizer.json, tokenizer_config.json, special_tokens_map.json, and added tokens. If you didn’t add/change tokens during fine-tuning, try copying the tokenizer files from the base model into the merged folder and run the conversion again.
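A quick way to do that comparison is to hash the files in both folders. This is only a sketch: the function names and the folder paths are placeholders, and it covers just the three files named above. A byte-level difference can also come from formatting alone, so treat "different" as a hint to open the file, not as proof that the tokenizer changed.

```python
import hashlib
from pathlib import Path

# The tokenizer files mentioned above; extend the tuple if your model ships others.
TOKENIZER_FILES = (
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
)


def file_digest(path):
    """SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_tokenizer_files(base_dir, merged_dir, names=TOKENIZER_FILES):
    """Map each file name to "same", "different", or "missing"."""
    base_dir, merged_dir = Path(base_dir), Path(merged_dir)
    report = {}
    for name in names:
        base, merged = base_dir / name, merged_dir / name
        if not base.exists() or not merged.exists():
            # Absent from at least one of the two folders.
            report[name] = "missing"
        elif file_digest(base) == file_digest(merged):
            report[name] = "same"
        else:
            report[name] = "different"
    return report
```

For example, `compare_tokenizer_files("path/to/base", "path/to/merged")` returns a dict you can print. Anything marked "different" or "missing" is a candidate for copying over from the base model, provided you did not add or change tokens.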

Also, please share the exact base model name, the llama.cpp commit, and whether you added any tokens. Without those, it is hard to do more than guess from the chkhsh.
