Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreihxcraqgdax6xpmqswrrckzsbdvsc32mqdhtcz5j3s4wiprl37sca",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mgc7uqargtf2"
  },
  "path": "/t/need-help-getting-started-with-image-generation/174012#post_2",
  "publishedAt": "2026-03-05T05:19:11.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "While things have improved significantly on Linux and Windows 11 + WSL2 environments today, options remain quite limited on Windows 10",
    "GitHub",
    "AMD",
    "Hugging Face",
    "ONNX Runtime",
    "Ollama Official Document"
  ],
  "textContent": "When using open-source generative AI models, there are still some limitations with AMD GPUs. While things have improved significantly on Linux and Windows 11 + WSL2 environments today, options remain quite limited on Windows 10…\n\n* * *\n\n## What you’re setting up (two separate local apps)\n\n  * **Image generation:** Stable Diffusion 1.5 “weights” + a **GUI** that runs locally (you open it in your browser at `127.0.0.1`).\n  * **Prompt enhancement:** a small local text model that turns “an idea” into **POSITIVE / NEGATIVE / SETTINGS** you copy/paste into the image GUI.\n\n\n\nKeeping them separate is the simplest “offline + no-coding” workflow.\n\n* * *\n\n## The most realistic Windows 10 + AMD path (no WSL2)\n\n### Best first-success route\n\n**SD.Next + ONNX Runtime + DirectML (DmlExecutionProvider)**\nSD.Next explicitly supports ONNX Runtime and notes you can select **DmlExecutionProvider** by installing `onnxruntime-directml`, and that **DirectX 12 is required**. (GitHub)\n\n### Alternatives (only if you want them later)\n\n  * **AUTOMATIC1111 + Microsoft DirectML extension:** uses ONNX Runtime + DirectML, but requires models optimized via **Olive** (more moving parts). (GitHub)\nAMD’s own guide for that extension calls it a “preview” and states that **only SD 1.5 is supported**. (AMD)\n  * **A1111 main repo on Windows+AMD:** not officially supported; their wiki points to DirectML-focused forks/approaches instead. (GitHub)\n  * **SD.Next + ZLUDA:** can be a speed/compatibility upgrade on some AMD cards, but it’s an “after everything already works” option. SD.Next documents launching it with `--use-zluda` and notes HIP SDK version constraints. (GitHub)\n\n\n\n* * *\n\n## Step-by-step: SD 1.5 image generation with SD.Next (Windows 10 + AMD)\n\n### 0) Put it in an easy folder\n\nUse something like:\n\n  * `C:\\AI\\sdnext\\`\n\n\n\nAvoid OneDrive/Desktop/Program Files. (This prevents many permissions/path problems.)\n\n### 1) Install the basics (one-time)\n\n  * Latest AMD GPU driver + reboot\n  * Git for Windows\n  * Python (many SD Windows setups are happiest on Python 3.10.x)\n\n\n\n### 2) Install + start SD.Next (use **cmd.exe**, not PowerShell)\n\nOpen **Command Prompt** and run:\n\n\n    if not exist C:\\AI mkdir C:\\AI\n    cd C:\\AI\n    git clone https://github.com/vladmandic/sdnext.git\n    cd sdnext\n    webui.bat --debug\n\n\nSD.Next documents launching on Windows with `webui.bat --debug`. (GitHub)\n\nWhen it finishes starting, it prints a local URL (often `http://127.0.0.1:7860`). Open that in your browser.\n\n### 3) Add an SD 1.5 model file (the “weights”)\n\nA common starter SD 1.5 checkpoint is:\n\n  * `v1-5-pruned-emaonly.safetensors` (license shown as **creativeml-openrail-m**) (Hugging Face)\n\n\n\nPlace the `.safetensors` file into SD.Next’s model folder (SD.Next “Getting Started” covers the basic “generate with a few clicks” workflow and model handling). (GitHub)\n\n### 4) Turn on AMD GPU acceleration (ONNX Runtime + DirectML)\n\nIn SD.Next, switch to the ONNX Runtime pipeline and choose **DmlExecutionProvider** (DirectML). SD.Next notes:\n\n  * DML EP becomes available by installing `onnxruntime-directml`\n  * DirectX 12 is required (GitHub)\n\n\n\nWhy this matters: ONNX Runtime’s DirectML EP has specific constraints (for example, it does **not** support memory-pattern optimizations or parallel execution in ORT sessions). (ONNX Runtime)\n\n### 5) First “known-stable” test settings (prove it works)\n\nStart conservative:\n\n  * **512×512**\n  * **Steps:** 20\n  * **CFG:** ~7\n  * **Batch size:** 1\n\n\n\nTest prompts:\n\n  * Positive: `portrait photo, soft studio lighting, sharp focus`\n  * Negative: `lowres, blurry, watermark, text, bad anatomy, extra fingers`\n\n\n\nOnce you can generate one image reliably, raise resolution/complexity.\n\n* * *\n\n## Quick troubleshooting (the fastest fixes)\n\n### A) Start in “safe mode” to remove extension problems\n\n\n    webui.bat --debug --safe\n\n\n`--safe` disables user extensions and is recommended for troubleshooting. (GitHub)\n\n### B) UI acts broken / buttons don’t work\n\nSD.Next recommends deleting `ui-config.json` if it’s bloated (old settings can override new defaults and break the UI). (GitHub)\n\n### C) DirectML crashes / weird ORT errors\n\nDirectML EP requires certain ORT options (mem-pattern + parallel execution) to be disabled; enabling them can cause errors. (ONNX Runtime)\nIf you see errors like `80070057`, they’re commonly associated with those constraints; ONNX Runtime has issue reports in this area. (GitHub)\n\n* * *\n\n## Prompt enhancement (offline, GUI-first)\n\n### Pick one “local chat” app\n\n#### Option 1: Jan (desktop GUI, open source, offline)\n\nJan is presented as an open-source ChatGPT-like app for running models locally. (GitHub)\n\n#### Option 2: KoboldCpp (single EXE + browser UI; good AMD hint)\n\nKoboldCpp releases explicitly recommend **the Vulkan option in the nocuda build** for AMD. (GitHub)\n\n#### Option 3: Ollama (simple installer)\n\nOllama’s Windows docs state it **does not require Administrator** and installs in your home directory by default. (Ollama Official Document)\n\n### Good beginner prompt-enhancer models (small + practical)\n\n**Specialized prompt optimizers (often best for SD prompting):**\n\n  * **TIPO-200M** (prompt optimization for text-to-image workflows). (Hugging Face)\n  * **DART v2** (generates Danbooru-style tags; useful if you like tag prompts). (Hugging Face)\n\n\n\n**General small instruct model (good at structured output):**\n\n  * **SmolLM2-1.7B-Instruct** (compact “run on-device” class model). (Hugging Face)\n\n\n\n### Copy/paste template for your prompt enhancer\n\nUse this once as your “system prompt” (or first message).\n\n\n    You write prompts for Stable Diffusion 1.5.\n\n    Return exactly these sections:\n\n    POSITIVE:\n    NEGATIVE:\n    SETTINGS:\n    VARIATIONS:\n\n    Rules:\n    - POSITIVE: 1–2 lines. Include subject, environment, lighting, camera/framing, style/medium.\n    - NEGATIVE: comma-separated. Include common artifacts: lowres, blurry, watermark, text, deformed hands, extra fingers.\n    - SETTINGS: suggest resolution (start 512x512), steps (20–30), CFG (6–8).\n    - VARIATIONS: 5 short alternate POSITIVE prompts that keep the same idea but change lighting/camera/mood.\n\n    User idea: <paste your idea here>\n\n\nWorkflow:\n\n  1. Write your idea.\n  2. Copy POSITIVE/NEGATIVE/SETTINGS from the output.\n  3. Paste them into SD.Next.\n  4. Generate.\n\n",
  "title": "Need help getting started with image generation"
}