{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicddflb4n7ly6yaieuctayp6gemxfae7roragpzpk567f34nr6idm",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mhw2os6tgri2"
  },
  "path": "/t/using-a-hugging-face-model-offline-to-support-code-generation-in-vscode/174627#post_1",
  "publishedAt": "2026-03-25T20:32:30.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "http://localhost:11434/api/generate"
  ],
  "textContent": "I am trying to use a Hugging Face offline http://localhost:11434/api/generate (spoofing ollama) access model with VSCode. After I get success there, I might try it with openclaw.\n\nI am unable to get VSCode to access the model.\n\nI have tried Continue, LM Studio, CodeGPT and AI Tools in VSCode. I run into a wall of non-functionality, or a demand that I log in via Google or something, I want no logins. With AI Tools, it tried once to access /api/tags, so I looked up what ollama returns for tags, and wrote code to spoof that in my api. I just want VSCode to send a prompt to (spoof ollama interface) wait for the dictionary with a “response” variable, and use it when it comes.\n\nI am a beginner in AI/Python/VSCode, not a beginner in a lot of “old” languages. I have:\n\n  * Downloaded example code for use of Qwen-2.5 Coder 3B offline on my GPU (6GB)\n  * Used the LLM model to help me learn Python to expand my code into a chatbot (copy&paste from chatbot to VSCode, tinker, debug, expand again…)\n  * Used model to learn more Python in two other expansions.\n  * Developed code to run the model as a (spoof ollama) interface, using uvicorn, tested it with curl, and created a /api/chatbot interface too, used it there. I can tell from connections printed to that terminal window for my uvicorn server when there is an attempt to contact the local LLM.\n\n\n\nI want to use Hugging Face, not Ollama. I want completely private sessions, no tokens, no tracking, no telemetry, no logins. I have achieved that with Hugging Face. The model I chose, it just works with my card, I will try others later.\n\nIf VSCode (OpenClaw) is just intentionally incompatible with Hugging Face, fine; a link to an explanation why would be appreciated.\n\nIf this can be made to work, please provide a link to the clearest explanation of how.\n\nThank you",
  "title": "Using a Hugging Face Model offline to support code generation in VSCode"
}