Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibsltnmc4kifu4w7pmjjppo4rmfjh4wtxsyctz3clay2bmnadxznu",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmdbmzyhttx2"
  },
  "path": "/t/wan2-2-i2v-clarifications-needed-regarding-settings-on-low-vram-system/175884#post_16",
  "publishedAt": "2026-05-21T00:14:34.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "Diffusers img2img guide",
    "ComfyUI-Impact-Pack",
    "ComfyUI-Impact-Subpack",
    "Impact Pack detector tutorial",
    "BBOX Detector (SEGS)",
    "FaceDetailerPipe workflow index",
    "ComfyUI Face Detailer guide",
    "Improving faces with Impact-Pack Detailers",
    "Bingsu/adetailer",
    "ComfyUI Inpainting Workflow",
    "SEGS to MASK (combined)",
    "ComfyUI-Inpaint-CropAndStitch",
    "Comfy-Org crop-and-stitch nodes",
    "RunComfy: ComfyUI-Inpaint-CropAndStitch",
    "RunComfy: Inpaint Crop node",
    "Diffusers inpainting guide",
    "Impact Pack node list mirror",
    "Ultralytics assets",
    "RunComfy: Inpaint Crop",
    "Acly/comfyui-inpaint-nodes",
    "RunComfy: ComfyUI Inpaint Nodes",
    "Fooocus inpaint files",
    "Fooocus inpaint patch",
    "SDXL Inpainting 0.1",
    "ControlNet Canny SDXL",
    "Xinsir ControlNet Canny SDXL",
    "Xinsir ControlNet Tile SDXL",
    "Xinsir ControlNet Union SDXL",
    "Xinsir ControlNetPlus GitHub",
    "controlnetXL_inpaint",
    "controlnet-inpaint-dreamer-sdxl",
    "Diffusers ControlNet with SDXL docs"
  ],
  "textContent": "This applies to I2I in general: it is difficult to maintain the subject’s identity through prompt control alone, because the prompt gives the AI no reliable way to know what should be redrawn and what should be retained.\n\nIn simple, standard I2I especially, the original image is treated merely as a reference, and the process generally produces a different image with a similar composition. In cases where precision isn’t critical, standard I2I without a mask might work.\n\nHowever, if you want high accuracy (for example, if you “absolutely” want to preserve a face), more precise control is desirable. A common method is to create a mask that excludes the face and then perform inpainting. Creating that mask by hand is a hassle (even though you could do it in MSPaint…), so it is worth figuring out how to automate the process by having the AI handle face detection and other tasks. There are plenty of components available online for this purpose, but the challenge lies in how to combine them…\n\n* * *\n\nShort answer:\n\nIf img2img turns the person into someone completely different, I would stop treating that mainly as a prompt-obedience problem.\n\nThat is probably more like an identity-rewrite problem.\n\nWhole-image img2img is still regeneration. Putting “same person” in the prompt is not an identity lock. If SDXL is allowed to touch the whole image, it can improve the frame, but it can also reinterpret the face.\n\nThe Diffusers img2img guide is useful background here: img2img starts from an initial image, adds noise, and denoises toward a new result. 
That is not the same thing as Photoshop-style editing.\n\nSo I would not keep searching whole-image img2img settings first.\n\nI would change the workflow structure.\n\n> Do not use SDXL to redraw the face first.\n>  Use detection to protect the identity-critical area, then inpaint everything else.\n\n## The practical idea\n\nInstead of this:\n\n\n    source frame\n    → whole-image SDXL img2img\n    → hope the prompt preserves identity\n    → I2V\n\n\nI would try this:\n\n\n    source frame\n    → detect face or full person\n    → make a protection mask\n    → grow/dilate the mask\n    → blur/feather the mask\n    → invert the mask\n    → inpaint only the non-face or non-person area\n    → stitch back into the original frame\n    → save fixed PNG\n    → feed fixed PNG to I2V\n\n\nThis changes the task from:\n\n\n    make img2img preserve identity\n\n\nto:\n\n\n    remove the identity area from the repair target\n\n\nThat is a much easier problem.\n\n## Why I would not start with FaceDetailer as a face redraw tool\n\nFaceDetailer is useful, but I would not start by letting it redraw the face.\n\nFor this specific failure mode, I would use the detector part only.\n\nSomething like:\n\n\n    YOLO / Impact Pack detects the face\n    → face mask\n    → grow / blur\n    → invert\n    → SDXL repairs everything except the face\n\n\nIn other words, FaceDetailer-style tools are useful here because they can locate the face.\n\nNot because we want SDXL to repaint the face.\n\nGood references for this detector/detailer ecosystem:\n\n  * ComfyUI-Impact-Pack\n  * ComfyUI-Impact-Subpack\n  * Impact Pack detector tutorial\n  * BBOX Detector (SEGS)\n  * FaceDetailerPipe workflow index\n  * ComfyUI Face Detailer guide\n  * Improving faces with Impact-Pack Detailers\n\n\n\nImportant note:\n\nComfyUI-Impact-Pack says `UltralyticsDetectorProvider` is not part of Impact Pack itself anymore. 
For YOLO / Ultralytics detection, install ComfyUI-Impact-Subpack too.\n\nThe Subpack README also says Ultralytics models should be placed under:\n\n\n    models/ultralytics/bbox\n    models/ultralytics/segm\n\n\ndepending on the model type.\n\nFor face/person detection models, Bingsu/adetailer is a common source.\n\n## Minimal face-protect workflow\n\nThis is the first workflow I would try.\n\n\n    Load Image\n    → UltralyticsDetectorProvider\n    → YOLO face detector, for example face_yolov8m\n    → BBOX Detector / Simple Detector\n    → face mask\n    → grow/dilate mask\n    → blur/feather mask\n    → invert mask\n    → Inpaint Crop\n    → SDXL inpaint sampler\n    → Inpaint Stitch\n    → Save fixed PNG\n    → use that PNG as I2V input\n\n\nMask meaning:\n\n\n    white = repair this\n    black = preserve this\n\n\nSo if the detector gives you:\n\n\n    white = face\n    black = everything else\n\n\nthen invert it.\n\nAfter inversion:\n\n\n    white = non-face area\n    black = protected face area\n\n\nNow SDXL is asked to repair the frame while not touching the face.\n\nThe basic ComfyUI inpaint concept is covered in the official ComfyUI Inpainting Workflow. 
That workflow uses a manual mask, but conceptually the manual mask can be replaced with an automatically generated detector mask.\n\nIf you use Impact Pack SEGS, the shape is usually:\n\n\n    UltralyticsDetectorProvider\n    → BBOX Detector (SEGS) or Simple Detector (SEGS)\n    → SEGS to MASK (combined)\n    → preview mask\n    → grow/blur\n    → invert\n    → inpaint\n\n\nUseful node references:\n\n  * BBOX Detector (SEGS)\n  * SEGS to MASK (combined)\n  * Impact Pack detector tutorial\n\n\n\n## Face-protect vs person-protect\n\nI would probably make two versions.\n\nMode | Protects | Repairs | Use when\n---|---|---|---\nface-protect | face / identity center | background, clothing, non-face defects | the face is the main identity risk, but clothing/background may need repair\nperson-protect | whole person | mostly background | hair, clothing, body shape, pose, or full identity must not change\n\nFace-protect route:\n\n\n    face detector\n    → face mask\n    → grow/blur\n    → invert\n    → inpaint non-face area\n\n\nPerson-protect route:\n\n\n    person segmentation detector\n    → person mask\n    → grow/blur\n    → invert\n    → inpaint background only\n\n\nThe tradeoff is simple:\n\n\n    face-protect = more repair freedom, more risk to hair/clothes/body\n    person-protect = safer identity/clothing preservation, less repair freedom\n\n\nIf the person is changing too much, use person-protect mode.\n\nIf only the face is changing, face-protect mode may be enough.\n\nFor person masks, look at segmentation detector routes in Impact Pack detector tutorial, and put segmentation models under `models/ultralytics/segm` as described in ComfyUI-Impact-Subpack.\n\n## Do not use the raw mask directly\n\nA raw face mask is usually too tight.\n\nIt may protect the middle of the face, but not enough of:\n\n\n    face outline\n    hairline\n    ears\n    chin\n    neck\n    jaw shadow\n    skin/background transition\n\n\nSo I would not do:\n\n\n    face mask\n    → invert\n 
   → inpaint\n\n\nI would do:\n\n\n    face mask\n    → grow/dilate\n    → blur/feather\n    → invert\n    → inpaint\n\n\nPossible starting values:\n\n\n    face mask grow/dilate: 24-64 px\n    face mask blur/feather: 12-32 px\n    person mask grow/dilate: 16-48 px\n    person mask blur/feather: 8-24 px\n\n\nThose are not magic values. They are just a reasonable diagnostic range.\n\nThe mask should be previewed before sampling.\n\n## Why I would use Crop & Stitch\n\nI would strongly consider using Inpaint Crop & Stitch rather than sampling the entire frame.\n\nThe reason is simple:\n\n\n    we do not want to resample the whole image\n    we only want to repair the selected area\n    then stitch that repair back into the original frame\n\n\nUseful node/packages:\n\n  * ComfyUI-Inpaint-CropAndStitch\n  * Comfy-Org crop-and-stitch nodes\n  * RunComfy: ComfyUI-Inpaint-CropAndStitch\n  * RunComfy: Inpaint Crop node\n\n\n\nThe important part is that Crop & Stitch can crop around the masked area, sample that region, then stitch it back while preserving the unmasked area.\n\nThat is exactly the kind of behavior I would want before I2V.\n\nA useful comment I have seen summarized the same idea as:\n\n\n    Ultralytics detects BBOX/SEGM\n    → Detector node gets SEGS/MASK\n    → convert SEGS to mask if needed\n    → connect to Inpaint Crop\n    → KSampler\n    → Inpaint Stitch\n\n\nThat is basically the route I would try here.\n\n## Suggested first test\n\nDo not put Wan, ControlNet, SAM, IPAdapter, FaceID, upscalers, and inpaint all in one big workflow at first.\n\nFirst test the still image repair step only.\n\nUse one source image and compare:\n\n\n    A. original frame\n    B. whole-image img2img result\n    C. face-protect inpaint result\n    D. 
person-protect inpaint result\n\n\nSuccess condition for this stage:\n\n\n    the fixed PNG still looks like the same person\n\n\nNot:\n\n\n    the prompt was perfectly followed\n\n\nNot yet:\n\n\n    the final I2V clip is perfect\n\n\nFirst prove that the still PNG is not being identity-rewritten.\n\n## Starting SDXL inpaint settings\n\nFor identity-preserving prep, I would start conservative.\n\n\n    denoise: 0.12 / 0.18 / 0.24\n    steps: 20-30\n    cfg: 3-5\n    sampler: whatever is stable in your SDXL workflow\n\n\nI would avoid starting with high denoise.\n\nIf denoise is too high, the result may look cleaner but less like the source.\n\nThat is exactly the failure mode we are trying to avoid.\n\n## Suggested prompt for the SDXL repair pass\n\nFor this pass, I would not use glamour / beauty / cinematic language.\n\nPositive:\n\n\n    realistic photo cleanup, preserve the original photo, same lighting, same camera angle, same clothing, same background structure, natural texture, realistic details, no stylization\n\n\nNegative:\n\n\n    different person, changed face, changed identity, changed hairstyle, changed clothing, changed lighting, beauty filter, airbrushed skin, plastic skin, waxy skin, cinematic lighting, dreamlike, over-smoothed, cartoon, painting, 3d render\n\n\nIf you are using face-protect mode, the prompt is mostly for the non-face repair area.\n\nThe face should be protected by the mask, not by the prompt.\n\n## Useful ComfyUI parts / references\n\nBasic inpainting:\n\nComfyUI Inpainting Workflow\n\nThis is the basic official inpaint workflow. 
It uses manual masks, but conceptually you can replace the manual mask with an automatically generated face/person mask.\n\nBackground concept:\n\nDiffusers img2img guide\n\nUseful if you want the conceptual reason whole-image img2img can drift.\n\nDiffusers inpainting guide\n\nUseful if you want the conceptual difference between whole-image img2img and masked repair.\n\nImpact Pack / detection:\n\nComfyUI-Impact-Pack\n\nImpact Pack has detector/detailer/upscaler/pipe nodes. Important note: `UltralyticsDetectorProvider` is not part of Impact Pack itself anymore. Install Impact Subpack too.\n\nComfyUI-Impact-Subpack\n\nThis provides `UltralyticsDetectorProvider`, which loads YOLO / Ultralytics models and provides `BBOX_DETECTOR` / `SEGM_DETECTOR`.\n\nImpact Pack detector tutorial\n\nThis explains the detector side: BBOX, SEGM, SAM, and SEGS.\n\nBBOX Detector (SEGS)\n\nUseful for understanding the face/person detection step.\n\nSEGS to MASK (combined)\n\nUseful if your detector route returns SEGS and you need a normal mask.\n\nImpact Pack node list mirror\n\nUseful for checking exact node names.\n\nYOLO / detection models:\n\nBingsu/adetailer\n\nCommon source for face/person/clothing detection models.\n\nUltralytics assets\n\nGeneral Ultralytics model assets.\n\nWorkflow examples / wiring examples:\n\nFaceDetailerPipe workflow index\n\nI would not necessarily use FaceDetailer to redraw the face here, but these workflows can be useful for learning the YOLO / detector / pipe wiring.\n\nComfyUI Face Detailer guide\n\nAgain, I would treat this mainly as a guide to the detector/detailer ecosystem, not as the first thing to use for identity preservation.\n\nImproving faces with Impact-Pack Detailers\n\nUseful background for Impact Pack detailer workflows.\n\nCrop and stitch:\n\nComfyUI-Inpaint-CropAndStitch\n\nThis is probably one of the most relevant pieces for this use case.\n\nComfy-Org crop-and-stitch nodes\n\nSame general idea: crop before sampling, stitch back 
afterward, preserve unmasked areas.\n\nRunComfy: ComfyUI-Inpaint-CropAndStitch\n\nReadable node overview.\n\nRunComfy: Inpaint Crop\n\nUseful if you want the specific node page.\n\n## Optional second stage\n\nOnly after the mask workflow works, I would test extra control.\n\nIf normal SDXL inpaint is not good enough:\n\n### Option 1: Fooocus-style inpaint support\n\nAcly/comfyui-inpaint-nodes\n\nThis can add Fooocus / LaMa / MAT-style inpaint tools to ComfyUI.\n\nRunComfy: ComfyUI Inpaint Nodes\n\nReadable guide for the Acly inpaint nodes.\n\nFooocus inpaint files\n\nFooocus inpaint model files.\n\nFooocus inpaint patch\n\nSpecific Fooocus inpaint patch file.\n\n### Option 2: SDXL inpainting model\n\nSDXL Inpainting 0.1\n\nThis is a dedicated SDXL inpainting model.\n\n### Option 3: ControlNet\n\nI would treat ControlNet as a second-stage fix, not the first fix.\n\nFirst solve the mask design.\n\nThen:\n\n\n    Tile ControlNet = if texture/clothing/background changes too much\n    Canny ControlNet = if outlines drift too much\n    Inpaint ControlNet = if fill/boundary quality is poor\n    Union ControlNet = more advanced route, more modes, more complexity\n\n\nPossible references:\n\nControlNet Canny SDXL\n\nXinsir ControlNet Canny SDXL\n\nXinsir ControlNet Tile SDXL\n\nXinsir ControlNet Union SDXL\n\nXinsir ControlNetPlus GitHub\n\ncontrolnetXL_inpaint\n\ncontrolnet-inpaint-dreamer-sdxl\n\nDiffusers ControlNet with SDXL docs\n\nI would not start with all of these.\n\nFor this case, I would probably test in this order:\n\n\n    1. face/person-protect mask without ControlNet\n    2. Crop & Stitch\n    3. Tile ControlNet if texture changes too much\n    4. Canny ControlNet if shape drifts too much\n    5. 
inpaint ControlNet or Fooocus inpaint if fill quality is weak\n\n\n## Optional third stage\n\nIf you still need stronger identity preservation, then I would start looking at identity-reference systems.\n\nBut I would not start there.\n\nOnly after the simpler mask route fails would I look at things like:\n\n\n    IPAdapter FaceID\n    InstantID\n    LivePortrait\n    manual mask correction\n\n\nThe reason is simple:\n\n\n    the first problem to solve is not \"how do I force SDXL to know the person?\"\n    the first problem is \"how do I stop SDXL from touching the person?\"\n\n\n## Final rule before I2V\n\nOnly feed the PNG to I2V after the still image still looks like the same person.\n\nIf the still-image prep already changes the person, I2V cannot fix that.\n\nIt will just animate the changed person.\n\nSo the diagnostic order should be:\n\n\n    1. Can I make a fixed still PNG that preserves identity?\n    2. Does that fixed PNG animate better than the original frame?\n    3. Only then tune the I2V settings.\n\n\nFor your specific failure mode, I would debug the still-image prep first.\n\nNot the Wan settings first.",
  "title": "Wan2.2 i2v (clarifications needed regarding settings on low vram system)"
}