Wan2.2 i2v (clarifications needed regarding settings on low vram system)
This applies to image-to-image (I2I) in general: it is difficult to preserve the subject's identity through prompt control alone, because the model cannot easily tell what should be redrawn and what should be retained.
In plain I2I especially, the original image is treated merely as a reference, and the process generally produces a different image with a similar composition. Where precision isn't critical, plain I2I without a mask might be enough.
However, if you want high accuracy (for example, if you "absolutely" want to preserve a face), more precise control is desirable. A common method is to create a mask that excludes the face and then inpaint. Creating that mask by hand is a hassle (even though you could do it in MSPaint…), so it pays to automate the process by letting a detector handle face detection and similar tasks. There are plenty of components available online for this purpose, but the challenge lies in how to combine them…
Short answer:
If img2img turns the person into someone completely different, I would stop treating that mainly as a prompt-obedience problem.
That is probably more like an identity-rewrite problem.
Whole-image img2img is still regeneration. Writing “same person” in the prompt is not an identity lock. If SDXL is allowed to touch the whole image, it can improve the frame, but it can also reinterpret the face.
The Diffusers img2img guide is useful background here: img2img starts from an initial image, adds noise, and denoises toward a new result. That is not the same thing as Photoshop-style editing.
So I would not keep hunting for better whole-image img2img settings.
I would change the workflow structure.
Do not use SDXL to redraw the face first. Use detection to protect the identity-critical area, then inpaint everything else.
The practical idea
Instead of this:
source frame
→ whole-image SDXL img2img
→ hope the prompt preserves identity
→ I2V
I would try this:
source frame
→ detect face or full person
→ make a protection mask
→ grow/dilate the mask
→ blur/feather the mask
→ invert the mask
→ inpaint only the non-face or non-person area
→ stitch back into the original frame
→ save fixed PNG
→ feed fixed PNG to I2V
This changes the task from:
make img2img preserve identity
to:
remove the identity area from the repair target
That is a much easier problem.
Why I would not start with FaceDetailer as a face redraw tool
FaceDetailer is useful, but I would not start by letting it redraw the face.
For this specific failure mode, I would use the detector part only.
Something like:
YOLO / Impact Pack detects the face
→ face mask
→ grow / blur
→ invert
→ SDXL repairs everything except the face
In other words, FaceDetailer-style tools are useful here because they can locate the face.
Not because we want SDXL to repaint the face.
Good references for this detector/detailer ecosystem:
- ComfyUI-Impact-Pack
- ComfyUI-Impact-Subpack
- Impact Pack detector tutorial
- BBOX Detector (SEGS)
- FaceDetailerPipe workflow index
- ComfyUI Face Detailer guide
- Improving faces with Impact-Pack Detailers
Important note:
ComfyUI-Impact-Pack says UltralyticsDetectorProvider is not part of Impact Pack itself anymore. For YOLO / Ultralytics detection, install ComfyUI-Impact-Subpack too.
The Subpack README also says Ultralytics models should be placed under:
models/ultralytics/bbox
models/ultralytics/segm
depending on the model type.
For face/person detection models, Bingsu/adetailer is a common source.
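Before wiring anything, it can help to confirm the detector files are where the Subpack expects them. A minimal sketch, assuming the two folders named above and `.pt` model files; `find_detector_models` and the `"ComfyUI"` root path are hypothetical names for illustration, not part of any ComfyUI API:

```python
from pathlib import Path

# Subfolders named in the Subpack README, keyed by model type.
ULTRALYTICS_DIRS = {
    "bbox": "models/ultralytics/bbox",
    "segm": "models/ultralytics/segm",
}

def find_detector_models(comfy_root):
    """Return {model_type: [file names]} for .pt files in each Ultralytics folder."""
    found = {}
    for model_type, rel_dir in ULTRALYTICS_DIRS.items():
        folder = Path(comfy_root) / rel_dir
        if folder.is_dir():
            found[model_type] = sorted(p.name for p in folder.glob("*.pt"))
        else:
            found[model_type] = []  # folder missing: nothing to load
    return found

# "ComfyUI" is a hypothetical install path; change it to your own.
for model_type, names in find_detector_models("ComfyUI").items():
    print(model_type, names or "no models found")
```

If the `bbox` list is empty, a face detector such as face_yolov8m will not show up in UltralyticsDetectorProvider.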
Minimal face-protect workflow
This is the first workflow I would try.
Load Image
→ UltralyticsDetectorProvider
→ YOLO face detector, for example face_yolov8m
→ BBOX Detector / Simple Detector
→ face mask
→ grow/dilate mask
→ blur/feather mask
→ invert mask
→ Inpaint Crop
→ SDXL inpaint sampler
→ Inpaint Stitch
→ Save fixed PNG
→ use that PNG as I2V input
Mask meaning:
white = repair this
black = preserve this
So if the detector gives you:
white = face
black = everything else
then invert it.
After inversion:
white = non-face area
black = protected face area
Now SDXL is asked to repair the frame while not touching the face.
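The inversion step is easy to get backwards, so here is a tiny dependency-free sketch of it. It assumes a mask stored as rows of floats where 1.0 is white and 0.0 is black; `invert_mask` is a hypothetical helper, and in ComfyUI an invert-mask node does the same job:

```python
def invert_mask(mask):
    """Flip a mask with values in [0.0, 1.0]: white becomes black and vice versa."""
    return [[1.0 - v for v in row] for row in mask]

# Detector output: white (1.0) marks the face.
face_mask = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# Inpaint mask: white (1.0) marks what may be repainted, so the face turns black.
repair_mask = invert_mask(face_mask)
```

If the preview shows a white face after this step, the inversion was skipped or applied twice.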
The basic ComfyUI inpaint concept is covered in the official ComfyUI Inpainting Workflow. That workflow uses a manual mask, but conceptually the manual mask can be replaced with an automatically generated detector mask.
If you use Impact Pack SEGS, the shape is usually:
UltralyticsDetectorProvider
→ BBOX Detector (SEGS) or Simple Detector (SEGS)
→ SEGS to MASK (combined)
→ preview mask
→ grow/blur
→ invert
→ inpaint
Useful node references:
- BBOX Detector (SEGS)
- SEGS to MASK (combined)
- Impact Pack detector tutorial
Face-protect vs person-protect
I would probably make two versions.
| Mode | Protects | Repairs | Use when |
|---|---|---|---|
| face-protect | face / identity center | background, clothing, non-face defects | the face is the main identity risk, but clothing/background may need repair |
| person-protect | whole person | mostly background | hair, clothing, body shape, pose, or full identity must not change |
Face-protect route:
face detector
→ face mask
→ grow/blur
→ invert
→ inpaint non-face area
Person-protect route:
person segmentation detector
→ person mask
→ grow/blur
→ invert
→ inpaint background only
The tradeoff is simple:
face-protect = more repair freedom, more risk to hair/clothes/body
person-protect = safer identity/clothing preservation, less repair freedom
If the person is changing too much, use person-protect mode.
If only the face is changing, face-protect mode may be enough.
For person masks, look at the segmentation detector routes in the Impact Pack detector tutorial, and put segmentation models under models/ultralytics/segm as described in the ComfyUI-Impact-Subpack README.
Do not use the raw mask directly
A raw face mask is usually too tight.
It may protect the middle of the face, but not enough of:
face outline
hairline
ears
chin
neck
jaw shadow
skin/background transition
So I would not do:
face mask
→ invert
→ inpaint
I would do:
face mask
→ grow/dilate
→ blur/feather
→ invert
→ inpaint
Possible starting values:
face mask grow/dilate: 24-64 px
face mask blur/feather: 12-32 px
person mask grow/dilate: 16-48 px
person mask blur/feather: 8-24 px
Those are not magic values. They are just a reasonable diagnostic range.
The mask should be previewed before sampling.
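The grow, feather, invert order can be sketched in dependency-free Python. This only illustrates the order of operations, assuming a mask stored as rows of floats in [0.0, 1.0]; `dilate`, `feather` and `protect_to_repair` are hypothetical helper names, and the box blur stands in for whatever blur your mask node actually uses:

```python
def _window(mask, y, x, radius):
    """In-bounds values in the square neighbourhood around (y, x)."""
    h, w = len(mask), len(mask[0])
    return [
        mask[ny][nx]
        for ny in range(max(0, y - radius), min(h, y + radius + 1))
        for nx in range(max(0, x - radius), min(w, x + radius + 1))
    ]

def dilate(mask, radius):
    """Grow white regions: each pixel becomes the max of its neighbourhood."""
    return [
        [max(_window(mask, y, x, radius)) for x in range(len(mask[0]))]
        for y in range(len(mask))
    ]

def feather(mask, radius):
    """Soften edges with a box blur: each pixel becomes the mean of its neighbourhood."""
    out = []
    for y in range(len(mask)):
        row = []
        for x in range(len(mask[0])):
            values = _window(mask, y, x, radius)
            row.append(sum(values) / len(values))
        out.append(row)
    return out

def protect_to_repair(face_mask, grow, blur):
    """Grow, feather, then invert: the result is white where SDXL may repaint."""
    soft = feather(dilate(face_mask, grow), blur)
    return [[1.0 - v for v in row] for row in soft]
```

In this sketch, as long as the blur radius is no larger than the grow radius, the originally detected pixels stay fully protected. That is one reason the starting ranges above grow more than they blur.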
Why I would use Crop & Stitch
I would strongly consider using Inpaint Crop & Stitch rather than sampling the entire frame.
The reason is simple:
we do not want to resample the whole image
we only want to repair the selected area
then stitch that repair back into the original frame
Useful nodes/packages:
- ComfyUI-Inpaint-CropAndStitch
- Comfy-Org crop-and-stitch nodes
- RunComfy: ComfyUI-Inpaint-CropAndStitch
- RunComfy: Inpaint Crop node
The important part is that Crop & Stitch can crop around the masked area, sample that region, then stitch it back while preserving the unmasked area.
That is exactly the kind of behavior I would want before I2V.
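Conceptually, the crop and stitch halves look something like the sketch below. This is the generic idea only, not the implementation of the ComfyUI-Inpaint-CropAndStitch nodes. It assumes a single-channel image and mask stored as rows of floats; `mask_bbox` and `stitch` are hypothetical helpers, and the sampling step that would sit between them is left out:

```python
def mask_bbox(mask, padding=0):
    """Bounding box (top, left, bottom, right), end-exclusive, around non-zero mask pixels."""
    ys = [y for y, row in enumerate(mask) if any(v > 0 for v in row)]
    xs = [x for row in mask for x, v in enumerate(row) if v > 0]
    if not ys:
        return None  # empty mask: nothing to repair
    h, w = len(mask), len(mask[0])
    return (
        max(0, min(ys) - padding),
        max(0, min(xs) - padding),
        min(h, max(ys) + 1 + padding),
        min(w, max(xs) + 1 + padding),
    )

def stitch(original, repaired_crop, mask, bbox):
    """Blend the repaired crop back: out = mask * repaired + (1 - mask) * original."""
    top, left, bottom, right = bbox
    out = [row[:] for row in original]  # pixels outside the box are copied untouched
    for y in range(top, bottom):
        for x in range(left, right):
            m = mask[y][x]
            out[y][x] = m * repaired_crop[y - top][x - left] + (1.0 - m) * original[y][x]
    return out
```

The point of the blend is that wherever the mask is 0.0, the output pixel is exactly the original pixel, so a protected face never passes through the sampler at all.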
A useful comment I have seen summarized the same idea as:
Ultralytics detects BBOX/SEGM
→ Detector node gets SEGS/MASK
→ convert SEGS to mask if needed
→ connect to Inpaint Crop
→ KSampler
→ Inpaint Stitch
That is basically the route I would try here.
Suggested first test
Do not put Wan, ControlNet, SAM, IPAdapter, FaceID, upscalers, and inpaint all in one big workflow at first.
First test the still image repair step only.
Use one source image and compare:
A. original frame
B. whole-image img2img result
C. face-protect inpaint result
D. person-protect inpaint result
Success condition for this stage:
the fixed PNG still looks like the same person
Not:
the prompt was perfectly followed
Not yet:
the final I2V clip is perfect
First prove that the still PNG is not being identity-rewritten.
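"Looks like the same person" is ultimately a visual judgment, but a crude numeric sanity check can catch wiring mistakes first. The sketch below measures how much the fully protected pixels changed between the original and the result. It assumes single-channel images stored as rows of floats and a repair mask where 0.0 means fully protected; `max_change_in_protected` is a hypothetical helper, and a zero result does not prove identity was preserved, only that the protected pixels were left alone:

```python
def max_change_in_protected(original, result, repair_mask):
    """Largest absolute pixel change where repair_mask is 0.0 (fully protected)."""
    diffs = [
        abs(o - r)
        for orow, rrow, mrow in zip(original, result, repair_mask)
        for o, r, m in zip(orow, rrow, mrow)
        if m == 0.0
    ]
    return max(diffs, default=0.0)  # 0.0 if nothing is protected
```

For result B (whole-image img2img) nothing is protected, so the face pixels will show some change. For C and D, a clearly non-zero value inside the protected area suggests the mask is inverted or the stitch step is not being applied.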
Starting SDXL inpaint settings
For identity-preserving prep, I would start conservative.
denoise: 0.12 / 0.18 / 0.24
steps: 20-30
cfg: 3-5
sampler: whatever is stable in your SDXL workflow
I would avoid starting with high denoise.
If denoise is too high, the result may look cleaner but less like the source.
That is exactly the failure mode we are trying to avoid.
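For intuition about why low denoise stays close to the source, here is how the Diffusers img2img pipelines turn their `strength` parameter into a number of denoising steps. This is the Diffusers convention only; ComfyUI's KSampler schedules `denoise` differently, so treat the numbers as a rough guide rather than what your workflow runs. `img2img_steps_run` is a hypothetical helper name:

```python
def img2img_steps_run(num_inference_steps, strength):
    """Steps actually denoised for a given strength (Diffusers img2img convention)."""
    return min(int(num_inference_steps * strength), num_inference_steps)

# Sweep the three starting denoise values at 20 steps.
for strength in (0.12, 0.18, 0.24):
    print(strength, img2img_steps_run(20, strength))
```

At 20 steps the three values correspond to only 2, 3 and 4 denoising steps under that convention, which is why the result cannot wander far from the source.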
Suggested prompt for the SDXL repair pass
For this pass, I would not use glamour / beauty / cinematic language.
Positive:
realistic photo cleanup, preserve the original photo, same lighting, same camera angle, same clothing, same background structure, natural texture, realistic details, no stylization
Negative:
different person, changed face, changed identity, changed hairstyle, changed clothing, changed lighting, beauty filter, airbrushed skin, plastic skin, waxy skin, cinematic lighting, dreamlike, over-smoothed, cartoon, painting, 3d render
If you are using face-protect mode, the prompt is mostly for the non-face repair area.
The face should be protected by the mask, not by the prompt.
Useful ComfyUI parts / references
Basic inpainting:
- ComfyUI Inpainting Workflow
  This is the basic official inpaint workflow. It uses manual masks, but conceptually you can replace the manual mask with an automatically generated face/person mask.
Background concept:
- Diffusers img2img guide
  Useful if you want the conceptual reason whole-image img2img can drift.
- Diffusers inpainting guide
  Useful if you want the conceptual difference between whole-image img2img and masked repair.
Impact Pack / detection:
- ComfyUI-Impact-Pack
  Impact Pack has detector/detailer/upscaler/pipe nodes. Important note: UltralyticsDetectorProvider is not part of Impact Pack itself anymore. Install Impact Subpack too.
- ComfyUI-Impact-Subpack
  This provides UltralyticsDetectorProvider, which loads YOLO / Ultralytics models and provides BBOX_DETECTOR / SEGM_DETECTOR.
- Impact Pack detector tutorial
  This explains the detector side: BBOX, SEGM, SAM, and SEGS.
- BBOX Detector (SEGS)
  Useful for understanding the face/person detection step.
- SEGS to MASK (combined)
  Useful if your detector route returns SEGS and you need a normal mask.
- Impact Pack node list mirror
  Useful for checking exact node names.
YOLO / detection models:
- Bingsu/adetailer
  Common source for face/person/clothing detection models.
- Ultralytics assets
  General Ultralytics model assets.
Workflow examples / wiring examples:
- FaceDetailerPipe workflow index
  I would not necessarily use FaceDetailer to redraw the face here, but these workflows can be useful for learning the YOLO / detector / pipe wiring.
- ComfyUI Face Detailer guide
  Again, I would treat this mainly as a guide to the detector/detailer ecosystem, not as the first thing to use for identity preservation.
- Improving faces with Impact-Pack Detailers
  Useful background for Impact Pack detailer workflows.
Crop and stitch:
- ComfyUI-Inpaint-CropAndStitch
  This is probably one of the most relevant pieces for this use case.
- Comfy-Org crop-and-stitch nodes
  Same general idea: crop before sampling, stitch back afterward, preserve unmasked areas.
- RunComfy: ComfyUI-Inpaint-CropAndStitch
  Readable node overview.
- RunComfy: Inpaint Crop
  Useful if you want the specific node page.
Optional second stage
Only after the mask workflow works, I would test extra control.
If normal SDXL inpaint is not good enough:
Option 1: Fooocus-style inpaint support
- Acly/comfyui-inpaint-nodes
  This can add Fooocus / LaMa / MAT-style inpaint tools to ComfyUI.
- RunComfy: ComfyUI Inpaint Nodes
  Readable guide for the Acly inpaint nodes.
- Fooocus inpaint files
  Fooocus inpaint model files.
- Fooocus inpaint patch
  Specific Fooocus inpaint patch file.
Option 2: SDXL inpainting model
- SDXL Inpainting 0.1
  This is a dedicated SDXL inpainting model.
Option 3: ControlNet
I would treat ControlNet as a second-stage fix, not the first fix.
First solve the mask design.
Then:
Tile ControlNet = if texture/clothing/background changes too much
Canny ControlNet = if outlines drift too much
Inpaint ControlNet = if fill/boundary quality is poor
Union ControlNet = more advanced route, more modes, more complexity
Possible references:
- ControlNet Canny SDXL
- Xinsir ControlNet Canny SDXL
- Xinsir ControlNet Tile SDXL
- Xinsir ControlNet Union SDXL
- Xinsir ControlNetPlus GitHub
- controlnetXL_inpaint
- controlnet-inpaint-dreamer-sdxl
- Diffusers ControlNet with SDXL docs
I would not start with all of these.
For this case, I would probably test in this order:
1. face/person-protect mask without ControlNet
2. Crop & Stitch
3. Tile ControlNet if texture changes too much
4. Canny ControlNet if shape drifts too much
5. inpaint ControlNet or Fooocus inpaint if fill quality is weak
Optional third stage
If you still need stronger identity preservation, then I would start looking at identity-reference systems.
But I would not start there.
Only after the simpler mask route fails would I look at things like:
IPAdapter FaceID
InstantID
LivePortrait
manual mask correction
The reason is simple:
the first problem to solve is not "how do I force SDXL to know the person?"
the first problem is "how do I stop SDXL from touching the person?"
Final rule before I2V
Only feed the PNG to I2V once the fixed still image clearly looks like the same person.
If the still-image prep already changes the person, I2V cannot fix that.
It will just animate the changed person.
So the diagnostic order should be:
1. Can I make a fixed still PNG that preserves identity?
2. Does that fixed PNG animate better than the original frame?
3. Only then tune the I2V settings.
For your specific failure mode, I would debug the still-image prep first, and only then the Wan settings.