{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreieectdmrzvkkvaawm37ailgs3nzb6as5jkzm76o6vay2wfvbmq4y4",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkvdhfqbmkr2"
},
"path": "/t/the-bpe-pre-tokenizer-was-not-recognized/175714#post_1",
"publishedAt": "2026-05-02T17:27:43.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"https://github.com/ggml-org/llama.cpp/pull/6920"
],
"textContent": "Hi. After fine-tuning a Qwen3.5-4B, I tried to convert it to gguf using llama.cpp/convert_hf_to_gguf.py but I’m getting the following error. I upgraded transformers to the latest and ran convert_hf_to_gguf_update.py but without success. I would appreciate any guidance. Thanks!\n\nWARNING:hf-to-gguf:**************************************************************************************\n\nWARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!\n\nWARNING:hf-to-gguf:** There are 2 possible reasons for this:\n\nWARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet\n\nWARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream\n\nWARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.\n\nWARNING:hf-to-gguf:** ref: https://github.com/ggml-org/llama.cpp/pull/6920\n\nWARNING:hf-to-gguf:**\n\nWARNING:hf-to-gguf:** chkhsh: 1444df51289cfa8063b96f0e62b1125440111bc79a52003ea14b6eac7016fd5f\n\nWARNING:hf-to-gguf:**************************************************************************************\n\nWARNING:hf-to-gguf:\n\nTraceback (most recent call last):\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 13593, in\n\n\n main**()**\n\n \\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 13587, in main\n\n\n model_instance.write**()**\n\n \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 934, in write\n\n\n self.prepare_metadata**(vocab_only=False)**\n\n \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^^^^^^^^^^^^^^^^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1078, in prepare_metadata\n\n\n self.set_vocab**()**\n\n \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1050, in set_vocab\n\n\n self.\\_set_vocab_gpt2**()**\n\n \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1567, in _set_vocab_gpt2\n\n\n tokens, toktypes, tokpre = self.get_vocab_base**()**\n\n \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1234, in get_vocab_base\n\n\n tokpre = self.get_vocab_base_pre(tokenizer)\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1555, in get_vocab_base_pre\n\n\n raise NotImplementedError(\"BPE pre-tokenizer was not recognized - update get_vocab_base_pre()\")\n\n\n**NotImplementedError** : BPE pre-tokenizer was not recognized - update get_vocab_base_pre()",
"title": "The BPE pre-tokenizer was not recognized!"
}