{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreianiuke7gaqduy3z5e5srzribcj4h2irqdbyakpobxebc33pkejkq",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mi3jbqifpy62"
  },
  "path": "/t/using-a-hugging-face-model-offline-to-support-code-generation-in-vscode/174627#post_5",
  "publishedAt": "2026-03-27T21:59:31.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "So I got it working, sort of, starting with modifications like those above. Continue / VSCode stuffs the prompt with so many general rules, even in chat mode, that my small model, running on a GPU card with limited VRAM, just seized up. Yes, it was already taking 2 minutes to respond to a short prompt in a curl proof-of-interface test, but given the entire prompt provided by VSCode, it just sat there. I inserted print statements to confirm that it was reaching various parts of the code, to show the entire request and the inputs to the model, and to mark the start and end of generation, and it still hung with no result for some time. I went away, did other things, and came back to find that VSCode had timed out and my uvicorn server said it had finished. I did not print the output on the server side; I guess I should, just to make sure it did not completely fail to generate.\n\nI guess I will continue copying and pasting from my offline chatbot to generate code.\n\nFYI, this is what Continue or VSCode inserted in front of my test input in chat:\n\n\n    Prompt :<important_rules>\n      You are in chat mode.\n\n      If the user asks to make changes to files offer that they can use the Apply Button on the code block, or switch to Agent Mode to make the suggested updates automatically.\n      If needed concisely explain to the user they can switch to agent mode using the Mode Selector dropdown and provide no other details.\n\n      Always include the language and file name in the info string when you write code blocks.\n      If you are editing \"src/main.py\" for example, your code block should start with '```python src/main.py'\n\n      When addressing code modification requests, present a concise code snippet that\n      emphasizes only the necessary changes and uses abbreviated placeholders for\n      unmodified sections. For example:\n\n      ```language /path/to/file\n      // ... existing code ...\n\n      {{ modified code here }}\n\n      // ... 
existing code ...\n\n      {{ another modification }}\n\n      // ... rest of code ...\n      ```\n\n      In existing files, you should always restate the function or class that the snippet belongs to:\n\n      ```language /path/to/file\n      // ... existing code ...\n\n      function exampleFunction() {\n        // ... existing code ...\n\n        {{ modified code here }}\n\n        // ... rest of function ...\n      }\n\n      // ... rest of code ...\n      ```\n\n      Since users have access to their complete file, they prefer reading only the\n      relevant modifications. It's perfectly acceptable to omit unmodified portions\n      at the beginning, middle, or end of files using these \"lazy\" comments. Only\n      provide the complete file when explicitly requested. Include a concise explanation\n      of changes unless the user specifically asks for code only.\n\n    </important_rules>\n    ```\n\n\n\nThe matter is closed.",
  "title": "Using a Hugging Face Model offline to support code generation in VSCode"
}