Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibim4unn3j5w6uyhomqh4qlhiomrd7rliwujfxyx5rgt67kjyja2q",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmd2wpb2pjf2"
  },
  "path": "/t/ollama-model-registry-provides-wrong-chat-template/176139#post_1",
  "publishedAt": "2026-05-20T21:50:42.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "bartowski/google_gemma-4-26B-A4B-it-GGUF · Hugging Face",
    "@bartowski"
  ],
  "textContent": "I use ollama for running models locally.\nI noticed that many models behave odd.\n\nToday I went down the rabbit hole to find out why, using bartowski/google_gemma-4-26B-A4B-it-GGUF · Hugging Face as example.\n\nOn the HF website, if you open the model details, it will show the correct (complex and lengthy) chat template. Locally I only get a dumbed down version of it:\n\n\n    {{ if .System }}<|turn>system\n    {{ .System }}<turn|>\n    {{ end }}{{ if .Prompt }}<|turn>user\n    {{ .Prompt }}<turn|>\n    {{ end }}<|turn>model\n    {{ .Response }}<turn|>\n\n\nTurns out, this is what HF serves via the ollama model registry:\n\n\n    $ curl -sSf -L -H \"Accept: application/vnd.docker.distribution.manifest.v2+json\" https://hf.co/v2/bartowski/google_gemma-4-26B-A4B-it-GGUF/manifests/IQ2_XXS | jq\n    {\n      \"schemaVersion\": 2,\n      \"mediaType\": \"application/vnd.docker.distribution.manifest.v2+json\",\n      \"config\": {\n        \"digest\": \"sha256:61b27106ee697324d453c4fdcc4be2e002f1cea930191141d20db1726150ab59\",\n        \"mediaType\": \"application/vnd.docker.container.image.v1+json\",\n        \"size\": 629\n      },\n      \"layers\": [\n        {\n          \"digest\": \"sha256:d516a0bca35cbb83081074bbf58ec2877911111192fbc2c353bf81cd0667b452\",\n          \"mediaType\": \"application/vnd.ollama.image.model\",\n          \"size\": 9656494368\n        },\n        {\n          \"digest\": \"sha256:f56e8459650d8354cf701fa5b0ddaea9a7986a271d7f55677152d1355ab5afb6\",\n          \"mediaType\": \"application/vnd.ollama.image.template\",\n          \"size\": 159\n        },\n        {\n          \"digest\": \"sha256:41cdabd1e8066e983ee6c288eb0117777376223ee0279cadcd67b2295e4d975f\",\n          \"mediaType\": \"application/vnd.ollama.image.projector\",\n          \"size\": 1193058528\n        },\n        {\n          \"digest\": \"sha256:f5107f3ab6b0815958755af9391fb4149e62d2cd3535f3a4ecbd3c3938d47d3e\",\n          \"mediaType\": \"application/vnd.ollama.image.params\",\n          \"size\": 52\n        }\n      ]\n    }\n    $ curl -sSf -L -H \"Accept: application/vnd.docker.distribution.manifest.v2+json\" https://hf.co/v2/bartowski/google_gemma-4-26B-A4B-it-GGUF/blobs/sha256:f56e8459650d8354cf701fa5b0ddaea9a7986a271d7f55677152d1355ab5afb6\n    {{ if .System }}<bos><|turn>system\n    {{ .System }}<turn|>\n    {{ end }}{{ if .Prompt }}<|turn>user\n    {{ .Prompt }}<turn|>\n    {{ end }}<|turn>model\n    {{ .Response }}<turn|>\n\n\nWhen I look into the gguf myself, the correct `tokenizer.chat_template` is still there.\n\nThis happens for multiple large quantizers, so the question is:\nIs this a configuration error made by the quantizers, e.g. @bartowski, or a general HF issue?\nThe “official” version hosted by Ollama themselves does not seem to have this problem.\n\nThis is my first time here, please be gentle. I did research on this topic and didn’t find an answer.",
  "title": "Ollama model registry provides wrong chat template"
}