Ollama model registry provides wrong chat template

Hugging Face Forums [Unofficial] May 20, 2026

I use Ollama to run models locally, and I noticed that many models behave oddly.

Today I went down the rabbit hole to find out why, using bartowski/google_gemma-4-26B-A4B-it-GGUF on Hugging Face as an example.

On the HF website, the model details show the correct (complex and lengthy) chat template. Locally, I only get a dumbed-down version of it:

{{ if .System }}<|turn>system
{{ .System }}<turn|>
{{ end }}{{ if .Prompt }}<|turn>user
{{ .Prompt }}<turn|>
{{ end }}<|turn>model
{{ .Response }}<turn|>

It turns out that this is what HF serves via the Ollama model registry:

$ curl -sSf -L -H "Accept: application/vnd.docker.distribution.manifest.v2+json" https://hf.co/v2/bartowski/google_gemma-4-26B-A4B-it-GGUF/manifests/IQ2_XXS | jq
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "digest": "sha256:61b27106ee697324d453c4fdcc4be2e002f1cea930191141d20db1726150ab59",
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 629
  },
  "layers": [
    {
      "digest": "sha256:d516a0bca35cbb83081074bbf58ec2877911111192fbc2c353bf81cd0667b452",
      "mediaType": "application/vnd.ollama.image.model",
      "size": 9656494368
    },
    {
      "digest": "sha256:f56e8459650d8354cf701fa5b0ddaea9a7986a271d7f55677152d1355ab5afb6",
      "mediaType": "application/vnd.ollama.image.template",
      "size": 159
    },
    {
      "digest": "sha256:41cdabd1e8066e983ee6c288eb0117777376223ee0279cadcd67b2295e4d975f",
      "mediaType": "application/vnd.ollama.image.projector",
      "size": 1193058528
    },
    {
      "digest": "sha256:f5107f3ab6b0815958755af9391fb4149e62d2cd3535f3a4ecbd3c3938d47d3e",
      "mediaType": "application/vnd.ollama.image.params",
      "size": 52
    }
  ]
}
$ curl -sSf -L -H "Accept: application/vnd.docker.distribution.manifest.v2+json" https://hf.co/v2/bartowski/google_gemma-4-26B-A4B-it-GGUF/blobs/sha256:f56e8459650d8354cf701fa5b0ddaea9a7986a271d7f55677152d1355ab5afb6
{{ if .System }}<bos><|turn>system
{{ .System }}<turn|>
{{ end }}{{ if .Prompt }}<|turn>user
{{ .Prompt }}<turn|>
{{ end }}<|turn>model
{{ .Response }}<turn|>
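
The two curl calls above can be sketched in Python for anyone who wants to check several repositories at once. This is only a rough sketch: the function names (template_digest, blob_url, manifest_url, fetch, served_template) are made up for illustration, and it assumes the endpoint layout and media types are exactly what the manifest above shows.

```python
import json
import urllib.request
from typing import Optional

# Values copied from the curl commands and manifest in this post.
MANIFEST_TYPE = "application/vnd.docker.distribution.manifest.v2+json"
TEMPLATE_TYPE = "application/vnd.ollama.image.template"


def template_digest(manifest: dict) -> Optional[str]:
    """Return the digest of the template layer, or None if there is none."""
    for layer in manifest.get("layers", []):
        if layer.get("mediaType") == TEMPLATE_TYPE:
            return layer.get("digest")
    return None


def manifest_url(repo: str, tag: str, host: str = "https://hf.co") -> str:
    """Build the manifest URL for a repo and quant tag (e.g. IQ2_XXS)."""
    return f"{host}/v2/{repo}/manifests/{tag}"


def blob_url(repo: str, digest: str, host: str = "https://hf.co") -> str:
    """Build the blob URL for a layer digest."""
    return f"{host}/v2/{repo}/blobs/{digest}"


def fetch(url: str) -> bytes:
    """GET a URL with the same Accept header as the curl calls.

    urllib follows redirects and raises on HTTP errors by default.
    """
    req = urllib.request.Request(url, headers={"Accept": MANIFEST_TYPE})
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def served_template(repo: str, tag: str) -> Optional[str]:
    """Download the manifest, then the template blob it points to."""
    manifest = json.loads(fetch(manifest_url(repo, tag)))
    digest = template_digest(manifest)
    if digest is None:
        return None
    return fetch(blob_url(repo, digest)).decode("utf-8")
```

Calling print(served_template("bartowski/google_gemma-4-26B-A4B-it-GGUF", "IQ2_XXS")) should then print the same short template as the second curl command, assuming the registry still answers the way it did for me.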

When I inspect the GGUF file myself, the correct tokenizer.chat_template is still there.

This happens with repositories from multiple large quantizers, so the question is: is this a configuration error made by the quantizers (e.g. @bartowski), or a general HF issue? The "official" version hosted by Ollama themselves does not seem to have this problem.

This is my first time here, so please be gentle. I researched this topic and didn't find an answer.
