{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreib5dgrqwmwoyjo53kzmi66yh47zv7g5eadakiknlsgoxuqoco5dwy",
    "uri": "at://did:plc:fuaxi56ej27ymlesklypt3ar/app.bsky.feed.post/3meaqgnmgaik2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreibpaysdj44qzjl3zxmnu4pukby4i4cohc7zbvjrts4xwtszmjkemy"
    },
    "mimeType": "image/png",
    "size": 20142
  },
  "description": "OLLAMA_CONTEXT_LENGTH environment variable didn’t have an effect, but there’s another way",
  "path": "/increase-ollama-context-length-num-ctx/",
  "publishedAt": "2026-02-05T23:59:00.000Z",
  "site": "https://www.autodidacts.io",
  "tags": [
    "100DaysToOffload",
    "View more posts in this series.",
    "here",
    "Ollama’s docs",
    "in a comment",
    "Ollama recommends 64000 for agents etc"
  ],
  "textContent": "****Note:**** this post is part of #100DaysToOffload, a challenge to publish 100 posts in 365 days. These posts are generally shorter and less polished than our normal posts; expect typos and unfiltered thoughts! View more posts in this series.\n\n\n\n\nI was trying out `glm-ocr`, and discovered, that though it has performance close to Qwen3-VL or deepseek-ocr, while requiring less resources, it produces empty output with Ollama’s (tiny, 4096) default model context size.\n\nDiscussion here pointed me in the right direction.\n\nAccording to Ollama’s docs, you can set the context length with the `OLLAMA_CONTEXT_LENGTH` environment variable.\n\nI tried it, both by exporting the variable and restarting the Ollama service (`sudo service ollama restart`), and by passing it directly to the Ollama run command. No luck!\n\nRather than debug what was going wrong, I found a workaround.\n\nIt was simple to set the context length from the REPL that starts when you start a session with `ollama run glm-ocr`, with no prompt:\n\n\n    /set parameter num_ctx 10240\n\n\nBut I wasn’t running `glm-ocr` from the REPL, I was running it from the CLI. And `/set` doesn’t persist once you exit the REPL.\n\nI found the answer I needed in a comment on the r/LocalLLaMA subreddit.\n\nSet the context, as above, with:\n\n\n    /set parameter num_ctx 10240\n\n\n_Then_ , save a copy of the model with the current parameters as the default settings:\n\n\n    /save glm-ocr-10k\n\n\nNow, I can use it on the CLI by using the new model name:\n\n\n    ollama run glm-ocr-10k \"Text Recognition: ./image.jpg\"\n\n\n**What values work well?**\n\nSince Ollama silently truncates context, it’s hard to know what’s the right value to use. Set it too high, and it will max out your resources. Ollama recommends 64000 for agents etc, but this won’t run on an older laptop.\n\n  * The default (4096) produces no output with `glm-ocr`, just empty markdown and text code fences.\n  * 10240 produces output (with errors)\n  * I’m currently trying 20480. It has the same errors as 10240, but is pretty good; I don’t know whether the errors relate to the context size or not.\n  * 64000 requires > 16gb RAM.\n\n\n\n**Why not just downscale the images?**\n\nThat’s what I’m going to try next.",
  "title": "How to increase Ollama context length",
  "updatedAt": "2026-02-05T23:59:00.000Z"
}