The BPE pre-tokenizer was not recognized!
Hi. After fine-tuning Qwen3.5-4B, I tried to convert it to GGUF using llama.cpp/convert_hf_to_gguf.py, but I’m getting the following error. I upgraded transformers to the latest version and ran convert_hf_to_gguf_update.py, but without success. I would appreciate any guidance. Thanks!
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:** There are 2 possible reasons for this:
WARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref: https://github.com/ggml-org/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh: 1444df51289cfa8063b96f0e62b1125440111bc79a52003ea14b6eac7016fd5f
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:
Traceback (most recent call last):
  File "/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py", line 13593, in <module>
    main()
    ~~~~^^
  File "/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py", line 13587, in main
    model_instance.write()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py", line 934, in write
    self.prepare_metadata(vocab_only=False)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py", line 1078, in prepare_metadata
    self.set_vocab()
    ~~~~~~~~~~~~~~^^
  File "/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py", line 1050, in set_vocab
    self._set_vocab_gpt2()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py", line 1567, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
                               ~~~~~~~~~~~~~~~~~~~^^
  File "/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py", line 1234, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/Users/admin/ai_tools/./llama.cpp/convert_hf_to_gguf.py", line 1555, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
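
For context on the chkhsh line in the warning: as far as I understand it, the converter encodes a fixed test string with the model's tokenizer, hashes the resulting token IDs with SHA-256, and looks that hash up in a list of known pre-tokenizers. The sketch below only illustrates that idea under those assumptions. It is not the actual llama.cpp code; the function names, the token ID lists, and the "example-pre" label are all hypothetical stand-ins.

```python
from hashlib import sha256


def compute_chkhsh(token_ids):
    """Hash the string form of a token ID list (assumed to mirror the converter's check)."""
    return sha256(str(token_ids).encode()).hexdigest()


def lookup_pre_tokenizer(chkhsh, known):
    """Return the pre-tokenizer name registered for a hash, or raise if it is unknown."""
    try:
        return known[chkhsh]
    except KeyError:
        raise NotImplementedError(
            "BPE pre-tokenizer was not recognized - update get_vocab_base_pre()"
        ) from None


# Hypothetical token IDs; in practice they would come from tokenizer.encode(test_text).
ids_known = [1, 2, 3]
ids_changed = [1, 2, 4]

# Hypothetical registry with a single known hash.
known = {compute_chkhsh(ids_known): "example-pre"}

print(lookup_pre_tokenizer(compute_chkhsh(ids_known), known))

try:
    lookup_pre_tokenizer(compute_chkhsh(ids_changed), known)
except NotImplementedError as err:
    # Any change in the token IDs yields a different hash, so the lookup fails.
    print("unrecognized:", err)
```

If that picture is right, a tokenizer that splits the test string even slightly differently (for example after a fine-tune that modified the tokenizer files) would produce a hash that is not in the list, which matches the two reasons given in the warning.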