Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreieectdmrzvkkvaawm37ailgs3nzb6as5jkzm76o6vay2wfvbmq4y4",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkvdhfqbmkr2"
  },
  "path": "/t/the-bpe-pre-tokenizer-was-not-recognized/175714#post_1",
  "publishedAt": "2026-05-02T17:27:43.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "https://github.com/ggml-org/llama.cpp/pull/6920"
  ],
  "textContent": "Hi. After fine-tuning a Qwen3.5-4B, I tried to convert it to gguf using llama.cpp/convert_hf_to_gguf.py but I’m getting the following error. I upgraded transformers to the latest and ran convert_hf_to_gguf_update.py but without success. I would appreciate any guidance. Thanks!\n\nWARNING:hf-to-gguf:**************************************************************************************\n\nWARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!\n\nWARNING:hf-to-gguf:** There are 2 possible reasons for this:\n\nWARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet\n\nWARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream\n\nWARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.\n\nWARNING:hf-to-gguf:** ref: https://github.com/ggml-org/llama.cpp/pull/6920\n\nWARNING:hf-to-gguf:**\n\nWARNING:hf-to-gguf:** chkhsh: 1444df51289cfa8063b96f0e62b1125440111bc79a52003ea14b6eac7016fd5f\n\nWARNING:hf-to-gguf:**************************************************************************************\n\nWARNING:hf-to-gguf:\n\nTraceback (most recent call last):\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 13593, in\n\n\n    main**()**\n\n    \\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 13587, in main\n\n\n    model_instance.write**()**\n\n    \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 934, in write\n\n\n    self.prepare_metadata**(vocab_only=False)**\n\n    \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^^^^^^^^^^^^^^^^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1078, in prepare_metadata\n\n\n    self.set_vocab**()**\n\n    \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1050, in set_vocab\n\n\n    self.\\_set_vocab_gpt2**()**\n\n    \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1567, in _set_vocab_gpt2\n\n\n    tokens, toktypes, tokpre = self.get_vocab_base**()**\n\n                               \\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~\\~**^^**\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1234, in get_vocab_base\n\n\n    tokpre = self.get_vocab_base_pre(tokenizer)\n\n\nFile “/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py”, line 1555, in get_vocab_base_pre\n\n\n    raise NotImplementedError(\"BPE pre-tokenizer was not recognized - update get_vocab_base_pre()\")\n\n\n**NotImplementedError** : BPE pre-tokenizer was not recognized - update get_vocab_base_pre()",
  "title": "The BPE pre-tokenizer was not recognized!"
}