External Publication

Wan2.2 i2v (clarifications needed regarding settings on low vram system)

Hugging Face Forums [Unofficial] May 21, 2026

This applies to I2I in general: it is difficult to preserve the subject’s identity through prompt control alone, because the model has no reliable way to tell what should be redrawn and what should be retained.

This is especially true of simple, standard I2I, where the original image is treated merely as a reference: the process generally produces a different image with a similar composition. Where precision isn’t critical, that plain unmasked pass might be good enough.

However, if you need high accuracy (for example, if you “absolutely” want to preserve a face), more precise control is desirable. A common method is to build a mask that excludes the face and then inpaint. Creating that mask by hand is a hassle (although you could do it even in MSPaint…), so the important part is automating it by letting a detector handle face detection and similar tasks. There are plenty of components available online for this purpose; the challenge lies in how to combine them…

Short answer:

If img2img turns the person into someone completely different, I would stop treating that mainly as a prompt-obedience problem.

It is more likely an identity-rewrite problem.

Whole-image img2img is still regeneration. Writing “same person” in the prompt is not an identity lock. If SDXL is allowed to touch the whole image, it can improve the frame, but it can also reinterpret the face.

The Diffusers img2img guide is useful background here: img2img starts from an initial image, adds noise, and denoises toward a new result. That is not the same thing as Photoshop-style editing.
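
To make the regeneration point concrete, here is a minimal sketch of how I understand the Diffusers img2img pipelines to turn strength into a step count. The function name is mine, not a library call, and other front ends may use a different convention. Note that nothing in this arithmetic is spatial: whatever strength you pick applies to every pixel, the face included.

```python
def img2img_schedule(num_inference_steps: int, strength: float):
    """Return (skipped_steps, denoising_steps) for one img2img run.

    Assumption: mirrors the Diffusers convention, where the init image is
    noised to an intermediate timestep and only the remaining steps are run.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    denoising_steps = min(int(num_inference_steps * strength), num_inference_steps)
    skipped_steps = max(num_inference_steps - denoising_steps, 0)
    return skipped_steps, denoising_steps


for strength in (0.25, 0.5, 1.0):
    skipped, run = img2img_schedule(20, strength)
    print(f"strength={strength}: skip {skipped} steps, denoise for {run}")
```

At strength 1.0 no steps are skipped, so the source image contributes almost nothing beyond its size.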

So I would not start by hunting for better whole-image img2img settings.

I would change the workflow structure.

Do not use SDXL to redraw the face first. Use detection to protect the identity-critical area, then inpaint everything else.

The practical idea

Instead of this:

source frame
→ whole-image SDXL img2img
→ hope the prompt preserves identity
→ I2V

I would try this:

source frame
→ detect face or full person
→ make a protection mask
→ grow/dilate the mask
→ blur/feather the mask
→ invert the mask
→ inpaint only the non-face or non-person area
→ stitch back into the original frame
→ save fixed PNG
→ feed fixed PNG to I2V

This changes the task from:

make img2img preserve identity

to:

remove the identity area from the repair target

That is a much easier problem.

Why I would not start with FaceDetailer as a face redraw tool

FaceDetailer is useful, but I would not start by letting it redraw the face.

For this specific failure mode, I would use the detector part only.

Something like:

YOLO / Impact Pack detects the face
→ face mask
→ grow / blur
→ invert
→ SDXL repairs everything except the face

In other words, FaceDetailer-style tools are useful here because they can locate the face.

Not because we want SDXL to repaint the face.

Good references for this detector/detailer ecosystem:

ComfyUI-Impact-Pack
ComfyUI-Impact-Subpack
Impact Pack detector tutorial
BBOX Detector (SEGS)
FaceDetailerPipe workflow index
ComfyUI Face Detailer guide
Improving faces with Impact-Pack Detailers

Important note:

The ComfyUI-Impact-Pack README says UltralyticsDetectorProvider is no longer part of Impact Pack itself. For YOLO / Ultralytics detection, install ComfyUI-Impact-Subpack too.

The Subpack README also says Ultralytics models should be placed under:

models/ultralytics/bbox
models/ultralytics/segm

depending on the model type.
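
A small check like the following can confirm that the detector files sit where the loader will look. It is only a sketch: the function name is hypothetical, the ComfyUI root is whatever your install uses, and the .pt extension is an assumption about how the model files are named.

```python
from pathlib import Path

# Sub-folders named in the Subpack README, relative to the ComfyUI root.
ULTRALYTICS_DIRS = {
    "bbox": Path("models/ultralytics/bbox"),
    "segm": Path("models/ultralytics/segm"),
}


def list_detector_models(comfy_root, suffix=".pt"):
    """Map each model type to the sorted model file names found in its folder."""
    root = Path(comfy_root)
    found = {}
    for kind, relative in ULTRALYTICS_DIRS.items():
        folder = root / relative
        if folder.is_dir():
            found[kind] = sorted(p.name for p in folder.glob(f"*{suffix}"))
        else:
            found[kind] = []  # folder missing: nothing installed for this type
    return found


# "ComfyUI" is a placeholder path; empty lists mean no models were found.
print(list_detector_models("ComfyUI"))
```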

For face/person detection models, Bingsu/adetailer is a common source.

Minimal face-protect workflow

This is the first workflow I would try.

Load Image
→ UltralyticsDetectorProvider
→ YOLO face detector, for example face_yolov8m
→ BBOX Detector / Simple Detector
→ face mask
→ grow/dilate mask
→ blur/feather mask
→ invert mask
→ Inpaint Crop
→ SDXL inpaint sampler
→ Inpaint Stitch
→ Save fixed PNG
→ use that PNG as I2V input

Mask meaning:

white = repair this
black = preserve this

So if the detector gives you:

white = face
black = everything else

then invert it.

After inversion:

white = non-face area
black = protected face area

Now SDXL is asked to repair the frame while not touching the face.
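
In array terms the inversion is a single subtraction. The sketch below uses NumPy and a hypothetical bounding box in place of real detector output; inside ComfyUI you would do this with mask nodes rather than code.

```python
import numpy as np


def bbox_to_mask(height, width, bbox):
    """Return a float mask: 1.0 (white) inside bbox, 0.0 (black) elsewhere.

    bbox is (x0, y0, x1, y1) in pixels and is clipped to the frame.
    """
    x0, y0, x1, y1 = bbox
    mask = np.zeros((height, width), dtype=np.float32)
    mask[max(y0, 0):min(y1, height), max(x0, 0):min(x1, width)] = 1.0
    return mask


def invert_mask(mask):
    """Swap white and black so the protected area becomes the untouched area."""
    return 1.0 - mask


face_mask = bbox_to_mask(8, 8, (2, 1, 6, 5))  # white = face
repair_mask = invert_mask(face_mask)          # white = everything except the face
print(int(face_mask.sum()), "face pixels,", int(repair_mask.sum()), "repair pixels")
```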

The basic ComfyUI inpaint concept is covered in the official ComfyUI Inpainting Workflow. That workflow uses a manual mask, but conceptually the manual mask can be replaced with an automatically generated detector mask.

If you use Impact Pack SEGS, the shape is usually:

UltralyticsDetectorProvider
→ BBOX Detector (SEGS) or Simple Detector (SEGS)
→ SEGS to MASK (combined)
→ preview mask
→ grow/blur
→ invert
→ inpaint
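
My understanding of the “combined” step is that it merges every detection into one mask. A NumPy sketch of that idea follows; the function name is hypothetical, and the real node works on SEGS rather than plain arrays. The empty case is worth guarding: if the detector finds nothing, the protection mask is all black, the inverted mask is all white, and you are back to whole-image regeneration.

```python
import numpy as np


def combine_protection_masks(masks):
    """Union of same-sized masks: a pixel is white if any detection covers it."""
    masks = list(masks)
    if not masks:
        # No detections would mean "protect nothing" once the mask is inverted.
        raise ValueError("no detections: refusing to build an empty protection mask")
    return np.maximum.reduce(masks)


# Two hypothetical face detections in an 8x8 frame.
first = np.zeros((8, 8), dtype=np.float32)
first[1:3, 1:3] = 1.0
second = np.zeros((8, 8), dtype=np.float32)
second[5:7, 4:7] = 1.0

protect = combine_protection_masks([first, second])
print(int(protect.sum()), "protected pixels")
```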

Useful node references:

BBOX Detector (SEGS)
SEGS to MASK (combined)
Impact Pack detector tutorial

Face-protect vs person-protect

I would probably make two versions.

Mode	Protects	Repairs	Use when
face-protect	face / identity center	background, clothing, non-face defects	the face is the main identity risk, but clothing/background may need repair
person-protect	whole person	mostly background	hair, clothing, body shape, pose, or full identity must not change

Face-protect route:

face detector
→ face mask
→ grow/blur
→ invert
→ inpaint non-face area

Person-protect route:

person segmentation detector
→ person mask
→ grow/blur
→ invert
→ inpaint background only

The tradeoff is simple:

face-protect = more repair freedom, more risk to hair/clothes/body
person-protect = safer identity/clothing preservation, less repair freedom

If the person is changing too much, use person-protect mode.

If only the face is changing, face-protect mode may be enough.
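
The tradeoff can be put in rough numbers by asking how much of the frame each mode leaves open for repair. This is a sketch with made-up boxes: the frame size and both bounding boxes are hypothetical, and a real segmentation mask would be tighter than a rectangle.

```python
def repairable_fraction(frame_size, protected_bbox):
    """Fraction of the frame left open for repair when protected_bbox is masked out.

    frame_size is (width, height); protected_bbox is (x0, y0, x1, y1).
    """
    width, height = frame_size
    x0, y0, x1, y1 = protected_bbox
    protected_area = max(x1 - x0, 0) * max(y1 - y0, 0)
    return 1.0 - protected_area / (width * height)


frame = (832, 480)                  # hypothetical frame size
face_bbox = (366, 60, 466, 180)     # hypothetical face detection
person_bbox = (266, 40, 566, 480)   # hypothetical person detection

print(f"face-protect:   {repairable_fraction(frame, face_bbox):.0%} repairable")
print(f"person-protect: {repairable_fraction(frame, person_bbox):.0%} repairable")
```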

For person masks, look at segmentation detector routes in Impact Pack detector tutorial, and put segmentation models under models/ultralytics/segm as described in ComfyUI-Impact-Subpack.

Do not use the raw mask directly

A raw face mask is usually too tight.

It may protect the middle of the face, but not enough of:

face outline
hairline
ears
chin
neck
jaw shadow
skin/background transition

So I would not do:

face mask
→ invert
→ inpaint

I would do:

face mask
→ grow/dilate
→ blur/feather
→ invert
→ inpaint

Possible starting values:

face mask grow/dilate: 24-64 px
face mask blur/feather: 12-32 px
person mask grow/dilate: 16-48 px
person mask blur/feather: 8-24 px

Those are not magic values. They are just a reasonable diagnostic range.

The mask should be previewed before sampling.
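
The grow, feather, invert order can be sketched in plain NumPy. Assumptions: a square dilation and a box blur stand in for whatever the actual mask nodes use (often a Gaussian), the function names are mine, and the toy radii are scaled down to fit a 64x64 example. With this box blur, keeping the feather radius no larger than the grow radius means the feather eats into the added margin rather than the face, which the ranges above happen to satisfy.

```python
import numpy as np


def _sliding(mask, radius, axis, combine):
    """Combine the 2*radius+1 shifted copies of mask along one axis."""
    pad = [(0, 0), (0, 0)]
    pad[axis] = (radius, radius)
    padded = np.pad(mask, pad, mode="edge")
    n = mask.shape[axis]
    acc = None
    for k in range(2 * radius + 1):
        window = np.take(padded, np.arange(k, k + n), axis=axis)
        acc = window if acc is None else combine(acc, window)
    return acc


def grow_mask(mask, radius):
    """Dilate with a square of side 2*radius+1 (separable running maximum)."""
    for axis in (0, 1):
        mask = _sliding(mask, radius, axis, np.maximum)
    return mask


def feather_mask(mask, radius):
    """Soften edges with a separable box blur of side 2*radius+1."""
    for axis in (0, 1):
        mask = _sliding(mask, radius, axis, np.add) / (2 * radius + 1)
    return mask


def protection_to_repair_mask(protect, grow_px, feather_px):
    """grow -> feather -> invert, so white ends up meaning 'repair this'."""
    soft = feather_mask(grow_mask(protect, grow_px), feather_px)
    return np.clip(1.0 - soft, 0.0, 1.0)


protect = np.zeros((64, 64), dtype=np.float32)
protect[24:40, 24:40] = 1.0  # hypothetical raw face mask
repair = protection_to_repair_mask(protect, grow_px=6, feather_px=3)
print(f"{repair.mean():.0%} of the frame is open for repair")
```

Printing the coverage like this is a cheap numeric companion to the visual preview: a value near 100% or near 0% usually means the mask is wrong.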

Why I would use Crop & Stitch

I would strongly consider using Inpaint Crop & Stitch rather than sampling the entire frame.

The reason is simple:

we do not want to resample the whole image
we only want to repair the selected area
then stitch that repair back into the original frame

Useful nodes/packages:

ComfyUI-Inpaint-CropAndStitch
Comfy-Org crop-and-stitch nodes
RunComfy: ComfyUI-Inpaint-CropAndStitch
RunComfy: Inpaint Crop node

The important part is that Crop & Stitch can crop around the masked area, sample that region, then stitch it back while preserving the unmasked area.

That is exactly the kind of behavior I would want before I2V.
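
The crop-then-stitch idea, reduced to arrays, looks roughly like this. It is a sketch of the concept, not of the node pack's implementation: the function names are hypothetical and a constant fill stands in for the sampler.

```python
import numpy as np


def crop_box(mask, padding):
    """Bounding box (y0, y1, x0, x1) of the white mask pixels, grown by padding."""
    ys, xs = np.nonzero(mask > 0)
    if ys.size == 0:
        raise ValueError("mask is empty: nothing to repair")
    height, width = mask.shape
    return (
        max(int(ys.min()) - padding, 0),
        min(int(ys.max()) + 1 + padding, height),
        max(int(xs.min()) - padding, 0),
        min(int(xs.max()) + 1 + padding, width),
    )


def stitch(original, repaired_crop, mask, box):
    """Blend the repaired crop back in; pixels where mask == 0 keep their values."""
    y0, y1, x0, x1 = box
    out = original.astype(np.float32).copy()
    weight = mask[y0:y1, x0:x1, None]  # broadcast the mask over the colour channels
    out[y0:y1, x0:x1] = weight * repaired_crop + (1.0 - weight) * out[y0:y1, x0:x1]
    return out


# Toy data: a grey 16x16 RGB frame with a small repair area.
original = np.full((16, 16, 3), 0.2, dtype=np.float32)
mask = np.zeros((16, 16), dtype=np.float32)
mask[4:8, 5:9] = 1.0

box = crop_box(mask, padding=2)
y0, y1, x0, x1 = box
repaired_crop = np.full((y1 - y0, x1 - x0, 3), 0.9, dtype=np.float32)  # stand-in for the sampler
result = stitch(original, repaired_crop, mask, box)
print("crop box:", box)
```

With an inverted face mask the crop box will usually span most of the frame, so in that mode the guarantee that matters is the stitch step leaving black-mask pixels untouched.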

A useful comment I have seen summarized the same idea as:

Ultralytics detects BBOX/SEGM
→ Detector node gets SEGS/MASK
→ convert SEGS to mask if needed
→ connect to Inpaint Crop
→ KSampler
→ Inpaint Stitch

That is basically the route I would try here.

Suggested first test

Do not put Wan, ControlNet, SAM, IPAdapter, FaceID, upscalers, and inpaint all in one big workflow at first.

First test the still image repair step only.

Use one source image and compare:

A. original frame
B. whole-image img2img result
C. face-protect inpaint result
D. person-protect inpaint result

Success condition for this stage:

the fixed PNG still looks like the same person

Not:

the prompt was perfectly followed

Not yet:

the final I2V clip is perfect

First prove that the still PNG is not being identity-rewritten.
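
For the A/B/C/D comparison, a crude numeric check can sit alongside the visual one. The sketch below measures the mean pixel change inside the face box. The function name and toy images are hypothetical, and a pixel difference is not a face-recognition score, so your eyes remain the real test.

```python
import numpy as np


def region_drift(original, candidate, bbox):
    """Mean absolute pixel difference inside bbox (x0, y0, x1, y1), 0-255 scale."""
    x0, y0, x1, y1 = bbox
    a = original[y0:y1, x0:x1].astype(np.float32)
    b = candidate[y0:y1, x0:x1].astype(np.float32)
    return float(np.abs(a - b).mean())


# Toy stand-ins for the real images.
original = np.full((32, 32, 3), 100, dtype=np.uint8)
face_bbox = (10, 8, 22, 24)

whole_image = original + 20        # every pixel changed, face included
face_protect = original.copy()
face_protect[:, :10] += 20         # only pixels left of the face changed

for name, image in (("whole-image", whole_image), ("face-protect", face_protect)):
    print(name, "face drift:", region_drift(original, image, face_bbox))
```

If the face-protect result shows any drift inside the box, the mask is leaking and should be fixed before tuning anything else.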

Starting SDXL inpaint settings

For identity-preserving prep, I would start conservative.

denoise: 0.12 / 0.18 / 0.24
steps: 20-30
cfg: 3-5
sampler: whatever is stable in your SDXL workflow

I would avoid starting with high denoise.

If denoise is too high, the result may look cleaner but less like the source.

That is exactly the failure mode we are trying to avoid.
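
One way to keep this sweep honest is to fix everything except the values under test. The sketch below just builds the list of runs. The dictionary keys, output file names, step count, and seed are all placeholders of mine, and nothing here talks to ComfyUI.

```python
from itertools import product

DENOISE_VALUES = (0.12, 0.18, 0.24)  # the three values listed above
CFG_VALUES = (3.0, 4.0, 5.0)         # spans the suggested 3-5 range
STEPS = 24                           # arbitrary pick inside 20-30
SEED = 1234                          # fixed so runs differ only in denoise/cfg


def build_sweep():
    """Return one settings dict per (denoise, cfg) combination."""
    runs = []
    for denoise, cfg in product(DENOISE_VALUES, CFG_VALUES):
        runs.append({
            "denoise": denoise,
            "cfg": cfg,
            "steps": STEPS,
            "seed": SEED,
            "output": f"fixed_d{denoise:.2f}_cfg{cfg:.0f}.png",
        })
    return runs


for run in build_sweep():
    print(run["output"], run["denoise"], run["cfg"])
```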

Suggested prompt for the SDXL repair pass

For this pass, I would not use glamour / beauty / cinematic language.

Positive:

realistic photo cleanup, preserve the original photo, same lighting, same camera angle, same clothing, same background structure, natural texture, realistic details, no stylization

Negative:

different person, changed face, changed identity, changed hairstyle, changed clothing, changed lighting, beauty filter, airbrushed skin, plastic skin, waxy skin, cinematic lighting, dreamlike, over-smoothed, cartoon, painting, 3d render

If you are using face-protect mode, the prompt is mostly for the non-face repair area.

The face should be protected by the mask, not by the prompt.

Useful ComfyUI parts / references

Basic inpainting:

ComfyUI Inpainting Workflow

This is the basic official inpaint workflow. It uses manual masks, but conceptually you can replace the manual mask with an automatically generated face/person mask.

Background concept:

Diffusers img2img guide

Useful if you want the conceptual reason whole-image img2img can drift.

Diffusers inpainting guide

Useful if you want the conceptual difference between whole-image img2img and masked repair.

Impact Pack / detection:

ComfyUI-Impact-Pack

Impact Pack has detector/detailer/upscaler/pipe nodes. Important note: UltralyticsDetectorProvider is not part of Impact Pack itself anymore. Install Impact Subpack too.

ComfyUI-Impact-Subpack

This provides UltralyticsDetectorProvider, which loads YOLO / Ultralytics models and provides BBOX_DETECTOR / SEGM_DETECTOR.

Impact Pack detector tutorial

This explains the detector side: BBOX, SEGM, SAM, and SEGS.

BBOX Detector (SEGS)

Useful for understanding the face/person detection step.

SEGS to MASK (combined)

Useful if your detector route returns SEGS and you need a normal mask.

Impact Pack node list mirror

Useful for checking exact node names.

YOLO / detection models:

Bingsu/adetailer

Common source for face/person/clothing detection models.

Ultralytics assets

General Ultralytics model assets.

Workflow examples / wiring examples:

FaceDetailerPipe workflow index

I would not necessarily use FaceDetailer to redraw the face here, but these workflows can be useful for learning the YOLO / detector / pipe wiring.

ComfyUI Face Detailer guide

Again, I would treat this mainly as a guide to the detector/detailer ecosystem, not as the first thing to use for identity preservation.

Improving faces with Impact-Pack Detailers

Useful background for Impact Pack detailer workflows.

Crop and stitch:

ComfyUI-Inpaint-CropAndStitch

This is probably one of the most relevant pieces for this use case.

Comfy-Org crop-and-stitch nodes

Same general idea: crop before sampling, stitch back afterward, preserve unmasked areas.

RunComfy: ComfyUI-Inpaint-CropAndStitch

Readable node overview.

RunComfy: Inpaint Crop

Useful if you want the specific node page.

Optional second stage

Only after the mask workflow works, I would test extra control.

If normal SDXL inpaint is not good enough:

Option 1: Fooocus-style inpaint support

Acly/comfyui-inpaint-nodes

This can add Fooocus / LaMa / MAT-style inpaint tools to ComfyUI.

RunComfy: ComfyUI Inpaint Nodes

Readable guide for the Acly inpaint nodes.

Fooocus inpaint files

Fooocus inpaint model files.

Fooocus inpaint patch

Specific Fooocus inpaint patch file.

Option 2: SDXL inpainting model

SDXL Inpainting 0.1

This is a dedicated SDXL inpainting model.

Option 3: ControlNet

I would treat ControlNet as a second-stage fix, not the first fix.

First solve the mask design.

Then:

Tile ControlNet = if texture/clothing/background changes too much
Canny ControlNet = if outlines drift too much
Inpaint ControlNet = if fill/boundary quality is poor
Union ControlNet = more advanced route, more modes, more complexity

Possible references:

ControlNet Canny SDXL

Xinsir ControlNet Canny SDXL

Xinsir ControlNet Tile SDXL

Xinsir ControlNet Union SDXL

Xinsir ControlNetPlus GitHub

controlnetXL_inpaint

controlnet-inpaint-dreamer-sdxl

Diffusers ControlNet with SDXL docs

I would not start with all of these.

For this case, I would probably test in this order:

1. face/person-protect mask without ControlNet
2. Crop & Stitch
3. Tile ControlNet if texture changes too much
4. Canny ControlNet if shape drifts too much
5. inpaint ControlNet or Fooocus inpaint if fill quality is weak

Optional third stage

If you still need stronger identity preservation, then I would start looking at identity-reference systems.

But I would not start there.

Only after the simpler mask route fails would I look at things like:

IPAdapter FaceID
InstantID
LivePortrait
manual mask correction

The reason is simple:

the first problem to solve is not "how do I force SDXL to know the person?"
the first problem is "how do I stop SDXL from touching the person?"

Final rule before I2V

Only feed the PNG to I2V once you have confirmed that the fixed still looks like the same person.

If the still-image prep already changes the person, I2V cannot fix that.

It will just animate the changed person.

So the diagnostic order should be:

1. Can I make a fixed still PNG that preserves identity?
2. Does that fixed PNG animate better than the original frame?
3. Only then tune the I2V settings.

For your specific failure mode, I would debug the still-image prep first.

Not the Wan settings first.