Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidhp52nbmpgcfx3czcaxtqelvau27tugc45kl2upjv7x4c4xylaj4",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mnqkxmezv6i2"
  },
  "path": "/t/trying-to-get-omnimattezero-installed-but-cant-find-the-models/176547#post_5",
  "publishedAt": "2026-06-08T00:39:04.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "ComfyUI_OmnimatteZero",
    "OmnimatteZero_node.py",
    "object_removal.py",
    "Diffusers memory optimization docs: VAE tiling"
  ],
  "textContent": "Nice. Looks like there may be a more suspicious culprit than offloading:\n\n* * *\n\nI think the faint grid is probably not caused by the GGUF file itself, and maybe not by `block_num` either. The first thing I would test is **VAE tiling**.\n\nIn the current ComfyUI_OmnimatteZero code, `block_num = 0` does not mean “disable all memory-saving behavior”. It only avoids the `apply_group_offloading(...)` branch. The fallback branch still calls `model.enable_model_cpu_offload()`.\n\nRelevant file:\n\nOmnimatteZero_node.py\n\nThe relevant logic is roughly:\n\n\n    if block_num > 0:\n        apply_group_offloading(\n            model.transformer,\n            onload_device=torch.device(\"cuda\"),\n            offload_type=\"block_level\",\n            num_blocks_per_group=block_num,\n        )\n    else:\n        model.enable_model_cpu_offload()\n\n\nSo the behavior is:\n\n`block_num` value | What happens | What it does _not_ mean\n---|---|---\n`> 0` | Uses Diffusers group offloading on the transformer | Not image tiling\n`0` | Skips group offloading, but still enables model CPU offload | Not “disable all offloading / tiling”\n\nThe grid artifact sounds more like a VAE tiling artifact. In object_removal.py, VAE tiling is enabled unconditionally:\n\n\n    pipe.vae.enable_tiling()\n\n\nThere is also another one for the upsample path:\n\n\n    pipe_upsample.vae.enable_tiling()\n\n\nDiffusers’ own docs describe VAE tiling as a memory-saving method that decodes the image in overlapping tiles, and they note that tile-to-tile tone variation can happen:\n\nDiffusers memory optimization docs: VAE tiling\n\nSo if you have a 32GB card, I would test disabling VAE tiling first.\n\n## Patch 1: disable VAE tiling\n\nOpen:\n\n\n    ComfyUI/custom_nodes/ComfyUI_OmnimatteZero/object_removal.py\n\n\nFind:\n\n\n    pipe.vae.enable_tiling()\n\n\nChange it to:\n\n\n    # pipe.vae.enable_tiling()\n\n\nAlso find:\n\n\n    pipe_upsample.vae.enable_tiling()\n\n\nChange it to:\n\n\n    # pipe_upsample.vae.enable_tiling()\n\n\nThen fully restart ComfyUI and test the same workflow again.\n\n## Patch 2: if you also want to avoid model CPU offload\n\nThis is optional. I would only try this after testing VAE tiling first.\n\nOpen:\n\n\n    ComfyUI/custom_nodes/ComfyUI_OmnimatteZero/OmnimatteZero_node.py\n\n\nFind this block:\n\n\n    if block_num > 0:\n        apply_group_offloading(\n            model.transformer,\n            onload_device=torch.device(\"cuda\"),\n            offload_type=\"block_level\",\n            num_blocks_per_group=block_num,\n        )\n    else:\n        model.enable_model_cpu_offload()\n\n\nFor a 32GB card, you can try replacing the `else` branch with `model.to(device)`:\n\n\n    if block_num > 0:\n        apply_group_offloading(\n            model.transformer,\n            onload_device=torch.device(\"cuda\"),\n            offload_type=\"block_level\",\n            num_blocks_per_group=block_num,\n        )\n    else:\n        model.to(device)\n\n\nThere are two similar blocks in the file, so check both:\n\n  1. the normal inference path;\n  2. the compose/background-replacement path.\n\n\n\nThe second one appears around the `compose_video(...)` path.\n\n## Test order I would use\n\nStep | Change | Reason\n---|---|---\n1 | Comment out `pipe.vae.enable_tiling()` | Most likely source of a visible grid\n2 | Also comment out `pipe_upsample.vae.enable_tiling()` if using the upsample path | Same reason, but only matters if that path is used\n3 | Keep `block_num = 0` | Avoid group offloading while testing\n4 | If the grid remains, replace `model.enable_model_cpu_offload()` with `model.to(device)` | Tests whether CPU/model offload is involved\n5 | If VRAM runs out, restore offloading or use a smaller GGUF quant | Q8_0 can still be heavy depending on resolution and frame count\n\n## Important caveat\n\nDisabling VAE tiling increases VRAM use. A 32GB card has a much better chance of handling it, but it can still OOM depending on:\n\n  * resolution;\n  * number of frames;\n  * whether Q8_0 / Q6_K / Q5_K_M / Q4_K_M is used;\n  * whether upsampling is enabled;\n  * whether compose mode / background replacement is enabled;\n  * how much VRAM ComfyUI already has occupied.\n\n\n\nSo I would not change everything at once. First test only this:\n\n\n    # pipe.vae.enable_tiling()\n    # pipe_upsample.vae.enable_tiling()\n\n\nIf that removes the grid, then the issue was probably VAE tiling rather than group offloading.",
  "title": "Trying to get omnimattezero installed but cant find the models"
}