Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreihx6r6s27mu3g2mc6vlegkozy6jbsgjqpuulqyg6a2gzdfml6x5km",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mliqhupwcvj2"
  },
  "path": "/t/wan2-2-i2v-clarifications-needed-regarding-settings-on-low-vram-system/175884#post_6",
  "publishedAt": "2026-05-10T12:05:09.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "WanMoeKSampler README",
    "Wan2.2 I2V-A14B model card",
    "LightX2V Wan2.2 I2V model card",
    "LightX2V Wan2.2 I2V card",
    "ComfyUI Wan2.2 docs",
    "ComfyUI Wan2.2 examples",
    "QuantStack Wan2.2 I2V-A14B GGUF page",
    "city96 UMT5 XXL encoder GGUF page",
    "Wan2.2 official GitHub",
    "ComfyUI official Wan2.2 workflow guide",
    "ComfyUI-GGUF",
    "QuantStack Wan2.2 I2V-A14B GGUF",
    "city96 UMT5 XXL encoder GGUF",
    "LightX2V Wan2.2 I2V A14B MoE Distill",
    "Wan2.2-Lightning GitHub"
  ],
  "textContent": "To put it simply, there’s a suspicion that settings intended for a different model have gotten mixed in?:\n\n* * *\n\n# Wan2.2 I2V-A14B High/Low UNets: blur, crosshatching, yellow flash — likely causes and clean baseline\n\nLooking at the workflow screenshot, the problem is probably **not mainly the prompt**. It looks more like a **sampling schedule / High-Low boundary / step count / VAE / distilled-vs-normal workflow mismatch** problem.\n\nThe suspicious settings in the screenshot are:\n\n\n    HighNoise GGUF -> ModelSamplingSD3 shift 5.00 -> WanMoeKSampler model_high_noise\n    LowNoise GGUF  -> ModelSamplingSD3 shift 5.00 -> WanMoeKSampler model_low_noise\n\n    WanMoeKSampler:\n      boundary: 0.750\n      add_noise: enable\n      steps: 6\n      cfg_high_noise: 1.5\n      cfg_low_noise: 2.0\n      sampler_name: euler\n      scheduler: simple\n      sigma_shift: 4.00\n      return_with_leftover_noise: disable\n\n\nThe short version:\n\n> The screenshot looks like a hybrid between a normal Wan2.2 High/Low UNet workflow and a 4-step Lightning/LightX2V-style workflow. That hybrid zone can easily cause heavy blur, side blur, crosshatching texture, and yellow lighting flashes.\n\n* * *\n\n## 1. Biggest issue: `boundary = 0.750`\n\nFor Wan2.2 I2V, `boundary = 0.750` is the first thing I would change.\n\nThe WanMoeKSampler README says the Wan2.2 boundary is around:\n\n\n    Wan2.2 T2V: 0.875\n    Wan2.2 I2V: 0.900\n\n\nIt also explains that this boundary is a **diffusion timestep** , not a denoising step. 
The actual switch step depends on total steps, sampler, scheduler, and sigma shift.\n\nSo for Wan2.2 **I2V**, reset this:\n\n\n    boundary: 0.750\n\n\nto this:\n\n\n    boundary: 0.900\n\n\n### Why this matters\n\nWan2.2 A14B uses separate denoising experts:\n\nExpert | Main job\n---|---\n**High-noise expert** | early structure, broad layout, motion, pose, composition\n**Low-noise expert** | later detail, face, eyes, mouth, skin, color, texture, final sharpness\n\nThe Wan2.2 I2V-A14B model card describes this High-noise / Low-noise MoE design and the idea that the experts specialize in different denoising stages.\n\nThe High-noise model runs while the diffusion timestep is above the boundary, so if the boundary is too low, the High-noise model can stay active too long and the Low-noise model may not get enough useful refinement time.\n\nThat can look like:\n\n\n    first frames look okay\n    then the clip turns blurry\n    center improves but sides remain mushy\n    fine texture looks scratchy/crosshatched\n    lighting becomes unstable\n    faces fail to refine\n\n\nSo the first clean correction is:\n\n\n    boundary: 0.900\n\n\n* * *\n\n## 2. Second issue: `steps = 6` is too low for judging normal High/Low UNets\n\nSix steps is very low for the **normal** Wan2.2 I2V-A14B High/Low model pair.\n\nIt can be useful as a quick smoke test, but it is not a fair quality test unless you are using a proper distilled / Lightning / LightX2V setup.\n\nFor the normal High/Low UNets, I would test:\n\n\n    steps: 12\n\n\nIf that is too slow on 8GB VRAM, use this only as a compromise:\n\n\n    steps: 8\n\n\nBut I would not judge the normal High/Low pair from 6 steps. At 6 steps, the Low-noise expert may simply not have enough time to resolve detail.\n\nSymptoms of too few steps:\n\n\n    crosshatching texture\n    unfinished skin/detail\n    soft edges\n    side blur\n    poor face detail\n    color flicker\n    lighting pulses\n\n\n* * *\n\n## 3. 
Third issue: you may be applying shift twice\n\nThe screenshot shows:\n\n\n    ModelSamplingSD3 shift: 5.00\n\n\nbefore both models, plus:\n\n\n    WanMoeKSampler sigma_shift: 4.00\n\n\ninside the WanMoeKSampler.\n\nWhile debugging, that is too ambiguous. Use **one source of shift only**.\n\n### Recommended cleanup\n\nFor the first stable baseline, remove the two `ModelSamplingSD3` nodes:\n\n\n    HighNoise GGUF -> WanMoeKSampler model_high_noise\n    LowNoise GGUF  -> WanMoeKSampler model_low_noise\n\n\nThen set this inside WanMoeKSampler:\n\n\n    sigma_shift: 5.0\n\n\nThis gives you one clear place controlling the shift.\n\nWhy `5.0`? The LightX2V Wan2.2 I2V model card recommends Euler with:\n\n\n    shift: 5.0\n    guidance_scale: 1.0\n\n\nfor its distilled branch. More importantly, `5.0` is also a sane first test value when cleaning up the graph.\n\nThe key point is:\n\n> Do not run `ModelSamplingSD3 shift 5` plus `WanMoeKSampler sigma_shift 4` while trying to diagnose artifacts.\n\nAfter you get a stable baseline, you can test whether the external `ModelSamplingSD3` nodes help. But they should not be part of the first diagnosis pass.\n\n* * *\n\n## 4. Fourth issue: CFG values are in the wrong middle zone\n\nThe screenshot uses:\n\n\n    cfg_high_noise: 1.5\n    cfg_low_noise: 2.0\n\n\nThat is neither a strict Lightning/LightX2V recipe nor a normal High/Low baseline.\n\nYou need to decide which branch you are testing.\n\n* * *\n\n# Branch A — normal Wan2.2 I2V-A14B High/Low UNets\n\nUse this branch if you are loading the normal HighNoise and LowNoise GGUFs without Lightning/LightX2V LoRAs.\n\nIn this branch, CFG 1.0 is usually too weak. 
CFG 1.0 is mostly a Rapid/Lightning/distilled habit, not a universal Wan2.2 setting.\n\nRecommended baseline:\n\n\n    High model:\n      Wan2.2 I2V-A14B HighNoise GGUF\n\n    Low model:\n      Wan2.2 I2V-A14B LowNoise GGUF\n\n    Remove:\n      ModelSamplingSD3 nodes before WanMoeKSampler\n\n    WanMoeKSampler:\n      boundary: 0.900\n      add_noise: enable\n      steps: 12\n      cfg_high_noise: 3.0\n      cfg_low_noise: 3.0\n      sampler_name: euler\n      scheduler: simple\n      sigma_shift: 5.0\n      start_at_step: 0\n      end_at_step: 10000\n      return_with_leftover_noise: disable\n\n    VAE:\n      wan_2.1_vae.safetensors\n\n    Test size:\n      33 frames\n      512-640px long side\n      fixed seed\n\n    Disable during baseline:\n      LoRAs\n      upscalers\n      interpolation\n      face restore\n      post-sharpening\n      color correction\n\n\nIf 12 steps is too slow:\n\n\n    boundary: 0.900\n    steps: 8\n    cfg_high_noise: 3.0\n    cfg_low_noise: 3.0\n    sampler_name: euler\n    scheduler: simple\n    sigma_shift: 5.0\n\n\nBut treat 8 steps as a sanity test, not a final quality test.\n\n* * *\n\n# Branch B — Lightning / LightX2V / distilled 4-step branch\n\nUse this branch only if you are using matching Lightning/LightX2V I2V LoRAs or a proper distilled LightX2V setup.\n\nThe LightX2V Wan2.2 I2V card recommends:\n\n\n    Euler scheduler\n    shift: 5.0\n    guidance_scale: 1.0\n\n\nIt describes this as running without CFG. 
The README also says the distilled model is built for substantially fewer inference steps, specifically 4-step-style use.\n\nStrict distilled baseline:\n\n\n    High model:\n      compatible Wan2.2 I2V-A14B HighNoise model\n\n    Low model:\n      compatible Wan2.2 I2V-A14B LowNoise model\n\n    LoRAs:\n      matching I2V Lightning/LightX2V High LoRA\n      matching I2V Lightning/LightX2V Low LoRA\n      strength: 1.0 each\n\n    Remove:\n      external ModelSamplingSD3 nodes during baseline\n\n    WanMoeKSampler:\n      boundary: 0.900\n      add_noise: enable\n      steps: 4\n      cfg_high_noise: 1.0\n      cfg_low_noise: 1.0\n      sampler_name: euler\n      scheduler: simple\n      sigma_shift: 5.0\n      start_at_step: 0\n      end_at_step: 10000\n      return_with_leftover_noise: disable\n\n    VAE:\n      wan_2.1_vae.safetensors\n\n    Test size:\n      33 frames\n      512-640px long side\n      fixed seed\n\n\nDo not mix this with the normal branch.\n\nBad hybrid zone:\n\n\n    normal High/Low GGUFs\n    + no matching distilled LoRAs\n    + 6 steps\n    + CFG around 1-2\n    + boundary 0.750\n    + external shift 5\n    + internal sigma_shift 4\n\n\nThat is exactly the kind of setup that can produce blur, crosshatching, and flashing.\n\n* * *\n\n# 5. VAE check: very important\n\nFor Wan2.2 14B I2V, check that you are using:\n\n\n    wan_2.1_vae.safetensors\n\n\nThe ComfyUI Wan2.2 docs and ComfyUI Wan2.2 examples point to `wan_2.1_vae.safetensors` for the 14B workflows.\n\nA wrong or mismatched VAE can look like:\n\n\n    soft decode\n    general haze\n    yellow/red color cast\n    skin tone shift\n    center glow\n    lighting flash\n    poor reconstruction\n    blurred details\n\n\nDo not try to fix a VAE mismatch with prompts like “no yellow light.” Fix the VAE first.\n\n* * *\n\n# 6. Artifact-by-artifact diagnosis\n\n## A. 
“First 3 frames normal, then 90% blur”\n\nMost likely causes:\n\n\n    boundary too low\n    too few total steps\n    Low-noise expert starts too late\n    shift schedule conflict\n    wrong VAE\n    normal UNets being run like a distilled 4-step model\n\n\nFix order:\n\n\n    1. boundary: 0.900\n    2. remove external ModelSamplingSD3 nodes\n    3. sigma_shift: 5.0 inside WanMoeKSampler\n    4. VAE: wan_2.1_vae.safetensors\n    5. normal branch: steps 12, CFG 3.0 / 3.0\n    6. distilled branch: steps 4, CFG 1.0 / 1.0, matching LoRAs only\n\n\n* * *\n\n## B. “Center improved but sides are still blurry”\n\nLikely causes:\n\n\n    not enough Low-noise refinement\n    bad High/Low boundary\n    low step count\n    resolution/aspect stress\n    VAE softness\n    quantization/offload instability\n    post-processing or resize issue\n\n\nTry:\n\n\n    33 frames only\n    512-640px long side\n    boundary 0.900\n    steps 12 if normal branch\n    correct VAE\n    no post nodes\n    no upscaler\n    no interpolation\n    no face restore\n\n\nAlso use clean dimensions. Examples:\n\n\n    512x288\n    576x320\n    640x360\n    640x384\n    384x640 for portrait\n\n\nAvoid large or odd dimensions while debugging.\n\n* * *\n\n## C. 
“Crosshatching texture, not square pixelation”\n\nThat usually means incomplete or unstable denoising, not classic video compression.\n\nMost likely causes:\n\n\n    6 steps is too low\n    boundary is wrong\n    GGUF quantization is stressed\n    shift schedule is confused\n    Low-noise refinement is underpowered\n    VAE decode is wrong or mismatched\n\n\nThe QuantStack Wan2.2 I2V-A14B GGUF page lists approximate quant sizes such as:\n\n\n    Q2_K:    5.3 GB\n    Q3_K_S:  6.52 GB\n    Q3_K_M:  7.18 GB\n    Q4_K_S:  8.75 GB\n    Q4_K_M:  9.65 GB\n    Q5_K_S: 10.1 GB\n    Q5_K_M: 10.8 GB\n    Q6_K:   12 GB\n    Q8_0:   15.4 GB\n\n\nOn an 8GB laptop GPU, Q4_K_M can be theoretically better but practically worse if it causes too much offloading, swapping, or instability.\n\nLow-VRAM quant tests:\n\n\n    Test A:\n      High: Q3_K_M\n      Low:  Q3_K_M\n\n    Test B:\n      High: Q3_K_M\n      Low:  Q4_K_S\n\n    Test C:\n      High: Q4_K_S\n      Low:  Q4_K_S\n\n\nFor face/detail fidelity, the most interesting test is:\n\n\n    High: Q3_K_M\n    Low:  Q4_K_S\n\n\nReason: the Low-noise model is the detail finisher.\n\n* * *\n\n## D. “Strong yellow light flashing in the middle”\n\nThis is probably not a prompt issue.\n\nLikely causes:\n\n\n    wrong VAE\n    double shift / schedule conflict\n    LightX2V LoRA trajectory mismatch\n    normal High/Low UNets using distilled settings\n    too few steps\n    bad High/Low boundary\n    quantization + low-step instability\n\n\nFix order:\n\n\n    1. confirm VAE = wan_2.1_vae.safetensors\n    2. remove external ModelSamplingSD3 nodes\n    3. boundary = 0.900\n    4. sigma_shift = 5.0\n    5. normal branch: 12 steps, CFG 3.0 / 3.0\n    6. distilled branch: 4 steps, CFG 1.0 / 1.0, matching LoRAs only\n    7. disable upscaler/interpolation/face restore\n    8. 
test 33 frames at 512-640px long side\n\n\nA negative prompt can include `yellow flash`, but if the denoising path or VAE is wrong, the prompt will not reliably fix it.\n\n* * *\n\n# 7. What to check in the console\n\nCheck where WanMoeKSampler actually switches from High-noise to Low-noise.\n\nLook for something equivalent to:\n\n\n    switching model at step X\n\n\nDo not reason from `boundary` alone. The WanMoeKSampler README explains that diffusion timestep is not the same thing as denoising step.\n\nFor a 4-step distilled branch, you generally want something close to:\n\n\n    High: 2 steps\n    Low:  2 steps\n\n\nFor a normal 12-step branch, you want enough Low-noise steps left to refine detail. If Low-noise only gets a tiny part of the run, blur and poor texture are expected.\n\n* * *\n\n# 8. Text encoder check\n\nIf prompt obedience is weak, do not only raise CFG. Text encoder quantization can matter.\n\nThe city96 UMT5 XXL encoder GGUF page says Q5_K_M or larger is recommended for best results, while smaller models can still be acceptable in constrained setups.\n\nApproximate sizes listed there include:\n\n\n    Q3_K_M: about 3.06 GB\n    Q4_K_M: about 3.66 GB\n    Q5_K_M: about 4.15 GB\n    Q8_0:   about 6.04 GB\n    F16:    about 11.4 GB\n\n\nFor an 8GB GPU setup:\n\n\n    UMT5 Q3_K_M:\n      safest memory option\n\n    UMT5 Q4_K_M:\n      good low-VRAM baseline\n\n    UMT5 Q5_K_M:\n      better prompt understanding if system RAM/offload behavior allows it\n\n\nWeak prompt obedience may be text-encoder-related, not just CFG-related.\n\n* * *\n\n# 9. Suggested prompt while debugging\n\nUse a boring source-faithful prompt. Do not use cinematic lighting while debugging a yellow lighting artifact.\n\n## Positive\n\n\n    A realistic image-to-video animation of the person in the source image. Preserve the exact same face, identity, hairstyle, clothing, colors, lighting, and background. 
The person makes only very subtle natural movement: slight breathing, a small blink, and minimal head movement. Static camera. No zoom. No scene change. Natural colors. Sharp facial details.\n\n\n## Negative\n\n\n    different person, face change, identity change, distorted face, warped eyes, asymmetrical eyes, deformed mouth, changing hairstyle, changing clothes, changing background, camera movement, zoom, scene change, fantasy, sci-fi, anime, painting, overexposed, oversaturated, red tint, yellow flash, blurry, low detail, melted face, extra teeth, crosshatching, noisy texture\n\n\nAt CFG 1.0, the negative prompt may have little practical effect. It should matter more in the normal branch at CFG around 3.0.\n\n* * *\n\n# 10. Minimal troubleshooting plan\n\nRun these in order. Change only one branch at a time.\n\n## Test 1 — normal High/Low sanity test\n\n\n    Remove:\n      both ModelSamplingSD3 nodes\n\n    WanMoeKSampler:\n      boundary: 0.900\n      add_noise: enable\n      steps: 12\n      cfg_high_noise: 3.0\n      cfg_low_noise: 3.0\n      sampler_name: euler\n      scheduler: simple\n      sigma_shift: 5.0\n      start_at_step: 0\n      end_at_step: 10000\n      return_with_leftover_noise: disable\n\n    VAE:\n      wan_2.1_vae.safetensors\n\n    Video:\n      33 frames\n      512-640px long side\n      fixed seed\n\n    Disable:\n      LoRAs\n      upscaler\n      interpolation\n      face restore\n      postprocessing\n\n\nIf this improves blur/crosshatching/yellow flash, the previous issue was probably:\n\n\n    boundary too low\n    too few steps\n    CFG too low for normal branch\n    shift conflict\n    VAE mismatch\n\n\n* * *\n\n## Test 2 — cheaper normal-branch sanity test\n\nIf 12 steps is too slow:\n\n\n    Same as Test 1, but:\n\n    steps: 8\n\n\nIf 8 looks bad but 12 improves, the issue is mainly under-refinement.\n\n* * *\n\n## Test 3 — strict Lightning/LightX2V branch\n\nOnly use this if you are using matching I2V Lightning/LightX2V LoRAs 
or a proper distilled LightX2V setup.\n\n\n    Use:\n      matching I2V Lightning/LightX2V LoRAs\n      LoRA strength: 1.0 each\n\n    Remove:\n      both ModelSamplingSD3 nodes\n\n    WanMoeKSampler:\n      boundary: 0.900\n      add_noise: enable\n      steps: 4\n      cfg_high_noise: 1.0\n      cfg_low_noise: 1.0\n      sampler_name: euler\n      scheduler: simple\n      sigma_shift: 5.0\n      start_at_step: 0\n      end_at_step: 10000\n      return_with_leftover_noise: disable\n\n    VAE:\n      wan_2.1_vae.safetensors\n\n    Video:\n      33 frames\n      512-640px long side\n\n\nIf this still has yellow flashing, suspect:\n\n\n    wrong LoRA pair\n    T2V LoRA used in I2V\n    High/Low LoRAs mismatched\n    wrong VAE\n    wrong model pair\n    double shift\n    workflow node mismatch\n\n\n* * *\n\n# 11. Recommended settings table\n\nScenario | Boundary | Steps | CFG high | CFG low | Sampler | Scheduler | Shift | Notes\n---|---|---|---|---|---|---|---|---\n**Normal High/Low sanity baseline** | 0.900 | 12 | 3.0 | 3.0 | Euler | Simple | 5.0 | Best next test\n**Normal low-cost test** | 0.900 | 8 | 3.0 | 3.0 | Euler | Simple | 5.0 | Debug only\n**Strict Lightning/LightX2V** | 0.900 | 4 | 1.0 | 1.0 | Euler | Simple | 5.0 | Only with matching distilled LoRAs/model\n**Current screenshot** | 0.750 | 6 | 1.5 | 2.0 | Euler | Simple | 4.0 + external 5.0 | Likely unstable hybrid\n\n* * *\n\n# 12. 
Things I would avoid right now\n\nAvoid this while debugging:\n\n\n    boundary 0.750\n    steps 6\n    CFG 1-ish on normal High/Low UNets\n    external ModelSamplingSD3 shift + WanMoe sigma_shift\n    720p\n    49+ frames before baseline works\n    upscale\n    interpolation\n    face restore\n    multiple LoRAs\n    cinematic lighting prompts\n    large camera movement\n\n\nEspecially avoid judging the workflow from this combination:\n\n\n    8GB VRAM\n    normal High/Low GGUFs\n    boundary 0.750\n    steps 6\n    CFG 1.5 / 2.0\n    external shift 5\n    internal sigma_shift 4\n    large frame count\n    large resolution\n\n\nThat setup has too many unstable variables.\n\n* * *\n\n# 13. Short answer\n\nYour screenshot has three likely problems:\n\n\n    1. boundary should probably be 0.900, not 0.750, for Wan2.2 I2V\n    2. six steps is too low for judging the normal High/Low UNets\n    3. shift is likely being applied ambiguously through both ModelSamplingSD3 and WanMoeKSampler\n\n\nMy recommended reset for the normal High/Low branch:\n\n\n    Remove:\n      ModelSamplingSD3 nodes\n\n    WanMoeKSampler:\n      boundary: 0.900\n      steps: 12\n      cfg_high_noise: 3.0\n      cfg_low_noise: 3.0\n      sampler: euler\n      scheduler: simple\n      sigma_shift: 5.0\n      add_noise: enable\n      return_with_leftover_noise: disable\n\n    VAE:\n      wan_2.1_vae.safetensors\n\n    Test:\n      33 frames\n      512-640px long side\n      no LoRAs\n      no upscale\n      no interpolation\n      no face restore\n\n\nIf you are actually using Lightning/LightX2V LoRAs, use the strict distilled branch instead:\n\n\n    boundary: 0.900\n    steps: 4\n    cfg_high_noise: 1.0\n    cfg_low_noise: 1.0\n    sampler: euler\n    scheduler: simple\n    sigma_shift: 5.0\n    LoRA strength: 1.0\n    VAE: wan_2.1_vae.safetensors\n\n\nThe blur, crosshatching, and yellow flash are most likely from:\n\n\n    wrong boundary\n    + too few steps\n    + ambiguous/double shift\n    + 
possibly wrong VAE\n    + possibly mixing normal and distilled recipes\n\n\n* * *\n\n# 14. Useful references\n\n  * WanMoeKSampler README — boundary guidance and explanation that diffusion timestep is not denoising step.\n  * Wan2.2 I2V-A14B model card — official High-noise / Low-noise MoE model information.\n  * Wan2.2 official GitHub — official Wan2.2 repository and model family overview.\n  * ComfyUI official Wan2.2 workflow guide — official ComfyUI Wan2.2 workflow notes.\n  * ComfyUI Wan2.2 examples — model/VAE file guidance for Wan2.2 workflows.\n  * ComfyUI-GGUF — GGUF loading support for ComfyUI.\n  * QuantStack Wan2.2 I2V-A14B GGUF — GGUF quant size reference.\n  * city96 UMT5 XXL encoder GGUF — text encoder quantization reference.\n  * LightX2V Wan2.2 I2V A14B MoE Distill — distilled 4-step-style branch guidance.\n  * Wan2.2-Lightning GitHub — Lightning-style low-step Wan2.2 acceleration.\n\n",
  "title": "Wan2.2 i2v (clarifications needed regarding settings on low vram system)"
}