{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreibim4unn3j5w6uyhomqh4qlhiomrd7rliwujfxyx5rgt67kjyja2q",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmd2wpb2pjf2"
},
"path": "/t/ollama-model-registry-provides-wrong-chat-template/176139#post_1",
"publishedAt": "2026-05-20T21:50:42.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"bartowski/google_gemma-4-26B-A4B-it-GGUF · Hugging Face",
"@bartowski"
],
"textContent": "I use Ollama for running models locally.\nI noticed that many models behave oddly.\n\nToday I went down the rabbit hole to find out why, using bartowski/google_gemma-4-26B-A4B-it-GGUF as an example.\n\nOn the HF website, if you open the model details, it shows the correct (complex and lengthy) chat template. Locally I only get a dumbed-down version of it:\n\n\n {{ if .System }}<|turn>system\n {{ .System }}<turn|>\n {{ end }}{{ if .Prompt }}<|turn>user\n {{ .Prompt }}<turn|>\n {{ end }}<|turn>model\n {{ .Response }}<turn|>\n\n\nTurns out, this is what HF serves via the Ollama model registry:\n\n\n $ curl -sSf -L -H \"Accept: application/vnd.docker.distribution.manifest.v2+json\" https://hf.co/v2/bartowski/google_gemma-4-26B-A4B-it-GGUF/manifests/IQ2_XXS | jq\n {\n \"schemaVersion\": 2,\n \"mediaType\": \"application/vnd.docker.distribution.manifest.v2+json\",\n \"config\": {\n \"digest\": \"sha256:61b27106ee697324d453c4fdcc4be2e002f1cea930191141d20db1726150ab59\",\n \"mediaType\": \"application/vnd.docker.container.image.v1+json\",\n \"size\": 629\n },\n \"layers\": [\n {\n \"digest\": \"sha256:d516a0bca35cbb83081074bbf58ec2877911111192fbc2c353bf81cd0667b452\",\n \"mediaType\": \"application/vnd.ollama.image.model\",\n \"size\": 9656494368\n },\n {\n \"digest\": \"sha256:f56e8459650d8354cf701fa5b0ddaea9a7986a271d7f55677152d1355ab5afb6\",\n \"mediaType\": \"application/vnd.ollama.image.template\",\n \"size\": 159\n },\n {\n \"digest\": \"sha256:41cdabd1e8066e983ee6c288eb0117777376223ee0279cadcd67b2295e4d975f\",\n \"mediaType\": \"application/vnd.ollama.image.projector\",\n \"size\": 1193058528\n },\n {\n \"digest\": \"sha256:f5107f3ab6b0815958755af9391fb4149e62d2cd3535f3a4ecbd3c3938d47d3e\",\n \"mediaType\": \"application/vnd.ollama.image.params\",\n \"size\": 52\n }\n ]\n }\n $ curl -sSf -L -H \"Accept: application/vnd.docker.distribution.manifest.v2+json\" https://hf.co/v2/bartowski/google_gemma-4-26B-A4B-it-GGUF/blobs/sha256:f56e8459650d8354cf701fa5b0ddaea9a7986a271d7f55677152d1355ab5afb6\n {{ if .System }}<bos><|turn>system\n {{ .System }}<turn|>\n {{ end }}{{ if .Prompt }}<|turn>user\n {{ .Prompt }}<turn|>\n {{ end }}<|turn>model\n {{ .Response }}<turn|>\n\n\nWhen I inspect the GGUF file myself, the correct `tokenizer.chat_template` is still there.\n\nThis happens with models from multiple large quantizers, so the question is:\nIs this a configuration error made by the quantizers, e.g. @bartowski, or a general HF issue?\nThe “official” version hosted by Ollama themselves does not seem to have this problem.\n\nThis is my first time here, please be gentle. I researched this topic and didn’t find an answer.",
"title": "Ollama model registry provides wrong chat template"
}