Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidqgd5bzrivaeqspo5hncvnp7s5b5vzbfcu5ef3uuzk64rzdinlnu",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3ml3omlbro7o2"
  },
  "path": "/t/1st-movie-clip/175306#post_14",
  "publishedAt": "2026-05-05T05:18:42.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "Wan2.2 Video Generation ComfyUI Official Native Workflow Example",
    "LoRA Loader",
    "LoraLoaderModelOnly",
    "WanFunInpaintToVideo node docs",
    "ComfyUI frame interpolation workflow",
    "ComfyUI-ReActor",
    "ComfyUI Impact Pack",
    "ComfyUI IPAdapter Plus",
    "ComfyUI Wan FLF workflow",
    "ComfyUI Inpainting Workflow",
    "Wan VACE To Video node docs"
  ],
  "textContent": "I think the challenge is just too hard… It’s on hard mode from the very start, after all.\n\n* * *\n\n# This is how I would think about your setup now\n\nFirst: switching from the direct Desktop install to **ComfyUI Portable** and suddenly having nodes/Manager behave properly is a real clue, not a coincidence. It strongly suggests the earlier problems were environmental rather than “you not understanding ComfyUI.” That is common with custom-node ecosystems: the install is only truly healthy when the node location and Python environment line up properly.\n\nThe good news is that you are now **past** the hardest beginner wall.\n\nYour current Wan 2.2 setup already does something valuable:\n\n  * it generates clips reliably\n  * you understand the main nodes\n  * you can use positive/negative conditioning\n  * you can apply one or more LoRAs\n  * you can do first-frame workflows\n  * you can do first-frame → last-frame workflows\n\n\n\nThat means the main questions are no longer:\n\n  * “How do I make anything at all?”\n  * “Why won’t the nodes load?”\n\n\n\nYour real questions now are more advanced and more interesting:\n\n  1. Why does a classic **checkpoint node** not seem to fit the Wan graph?\n  2. Why does **lower FPS** make drift look worse, and what should you do about it?\n  3. Why do ordinary inpainting tutorials not solve **“take this bad frame and fix the face using another face image”**?\n\n\n\nThose three are connected.\n\n* * *\n\n## The short answer\n\nIf I had to compress the whole answer into one paragraph, it would be this:\n\n**Keep your Wan 2.2 workflow as your main shot generator. Do not force a classic SD-style checkpoint loader into the native Wan graph. Treat FPS as a quality/time tradeoff, not as a magic identity fix. Use FLF for the sit-down transition. 
And for face repair, stop thinking “text-only inpaint” and start thinking “separate still-frame repair workflow using either plain masked face inpaint, ReActor face swap, or mask-local face repair/detailing with a reference-guided method.”**\n\nThat is the cleanest mental model.\n\n* * *\n\n## 1) About the checkpoint node\n\n### Short version\n\nIn a **native Wan 2.2** workflow, you normally do **not** insert a classic SD/SDXL-style checkpoint node.\n\n### Why\n\nThe official Wan 2.2 ComfyUI workflow is not structured like a classic Stable Diffusion workflow where one checkpoint node loads most of the system in one go.\n\nInstead, the official Wan-native flow is built from separate components, typically:\n\n  * diffusion model loader\n  * CLIP loader\n  * VAE loader\n  * the Wan video node itself\n  * LoRA loader(s)\n  * conditioning nodes\n\n\n\nSee:\n\n  * Wan2.2 Video Generation ComfyUI Official Native Workflow Example\n\n\n\n### What that means for your graph\n\nIf your current graph already looks something like:\n\n  * `Load Diffusion Model`\n  * `Load CLIP`\n  * `Load VAE`\n  * one or more LoRA nodes\n  * positive / negative conditioning\n  * Wan image-to-video or first/last-frame node\n  * decode / save\n\n\n\nthen you are already using the **correct native loading pattern**.\n\nSo the reason you “can’t figure out how to include a checkpoint node” is probably not that you are missing something. 
It is more likely that **there is no natural slot for a classic checkpoint node in the native Wan graph**.\n\n### Where a checkpoint loader _does_ make sense\n\nA classic checkpoint loader **can** make sense in a **separate still-image repair workflow**.\n\nFor example, if you later build a dedicated face-repair graph using:\n\n  * a still-image inpaint model,\n  * a checkpoint-based image model,\n  * or an SDXL/Flux-style repair branch,\n\n\n\nthen _that_ separate graph may use a checkpoint node.\n\nBut that would be its own repair workflow, not something you must squeeze into the Wan graph itself.\n\n### About your LoRA chain\n\nYour current LoRA logic sounds fine.\n\nRelevant docs:\n\n  * LoRA Loader\n  * LoraLoaderModelOnly\n\n\n\nImportant points from those docs:\n\n  * LoRAs are discovered from `ComfyUI/models/loras`\n  * multiple LoRA nodes can be chained directly\n  * `LoraLoaderModelOnly` is specifically for applying LoRAs to the **model branch only**, without needing a CLIP model input on that node\n\n\n\nThat is why LoRA chaining feels natural in your current setup, while a classic checkpoint node does not.\n\n### My practical recommendation\n\nFor your Wan graph:\n\n  * **do not force a classic checkpoint loader into it**\n  * keep the native Wan structure\n  * only use checkpoint-based loading in a **separate repair graph** if you later choose a checkpoint-based still-image repair method\n\n\n\n* * *\n\n## 2) About FPS, drift, and render time\n\nYou noticed:\n\n  * lower FPS = more visible drift\n  * higher FPS = drift feels less noticeable\n  * but higher FPS = much longer generation time\n\n\n\nThat observation is useful, and it makes sense.\n\n### Why higher FPS often looks better\n\nHigher FPS does **not necessarily mean the model suddenly understands identity better**.\n\nWhat it often means is:\n\n  * each frame is closer to the next in time\n  * motion is split into smaller steps\n  * the changes between frames feel less abrupt\n  * the drift 
becomes less obvious because the motion is smoother\n\n\n\nSo the model may still be drifting, but the drift is **hidden better** by finer temporal spacing.\n\n### Why this becomes expensive quickly\n\nThe cost scales with frame count.\n\nThe official ComfyUI docs for Wan/Fun Inp make this very explicit: video `length` is the **total number of frames**, and the example calculation is basically:\n\n  * `seconds × fps = frame count`\n\n\n\nSo if you double FPS while keeping the duration the same, you roughly double the number of frames the system has to generate.\n\nSee:\n\n  * WanFunInpaintToVideo node docs\n  * Wan2.2 Video Generation ComfyUI Official Native Workflow Example\n\n\n\n### The important production lesson\n\nOn 8 GB VRAM, I would **not** make native 24 FPS your default unless you truly need it.\n\nThat is because your real bottleneck is not “video exists or not.” It is:\n\n  * quality per minute of render time\n  * how many iterations you can afford\n  * whether you can keep enough control over continuity\n\n\n\n### A better 8 GB strategy\n\nInstead of brute-forcing everything at native 24 FPS, I would bias toward:\n\n  1. **shorter clips**\n  2. **moderate native FPS**\n  3. 
**frame interpolation later**, when needed\n\n\n\nThe official ComfyUI frame interpolation workflow exists for exactly this reason.\n\nSee:\n\n  * ComfyUI frame interpolation workflow\n\n\n\nThat page is very relevant because it explicitly says frame interpolation:\n\n  * generates intermediate frames\n  * smooths motion\n  * improves temporal consistency\n  * is useful for increasing frame rate in short clips\n  * is useful for fixing low-FPS generations without regenerating the source frames\n\n\n\n### My practical recommendation\n\nFor your current setup I would test this order:\n\n  * keep clips short\n  * use a sensible native frame count\n  * use stronger control (first frame, first→last frame)\n  * only then use interpolation for smoother output\n\n\n\nThat is usually a better quality/time tradeoff than forcing 24 FPS generation everywhere.\n\n* * *\n\n## 3) Why the inpainting tutorials feel like they stop one step too early\n\nThis is the part causing the most confusion, and for good reason.\n\n### What those tutorials are really teaching\n\nThe standard inpainting tutorials teach:\n\n  * load an image\n  * draw a mask\n  * use text conditioning\n  * regenerate only the masked region\n\n\n\nThat is **generic inpainting**.\n\nAnd yes, that is why:\n\n  * the teapot example works\n  * the cloud/hair example works\n  * but your actual problem still feels unsolved\n\n\n\nBecause your actual problem is **not**:\n\n> replace this masked region with any plausible thing described by text\n\nYour actual problem is:\n\n> keep this bad frame as the base image, keep the pose/lighting/composition, and make the masked face look like the **correct person from another image**\n\nThat is a different task.\n\n### The missing concept\n\nYou are not supposed to put the second face image “onto the canvas” like another background layer.\n\nInstead:\n\n  * the **broken frame** remains the base image\n  * the **mask** defines the region to repair\n  * the **second face image** enters 
the graph as a **reference / swap source / identity guide**\n  * a repair node uses that second image to influence what happens _inside the mask_\n\n\n\nThat is the key mental shift.\n\n* * *\n\n## 4) So what are the actual ways to use a second face image?\n\nThere are three practical families.\n\n### A. Face swap: the direct route\n\nThis is the ReActor route.\n\nUse it when:\n\n  * the frame is already good\n  * the face became the wrong person\n  * the pose, lighting, clothes, and framing are acceptable\n\n\n\nRelevant repo:\n\n  * ComfyUI-ReActor\n\n\n\nWhy it is relevant:\n\n  * it is explicitly a face-swap extension for ComfyUI\n  * it supports reusable face models\n  * it is designed for image inputs and is very naturally suited to “fix this bad frame”\n\n\n\nIn plain language, the workflow is:\n\n  * `input_image` = broken frame\n  * `source_image` or `face_model` = the correct identity\n  * output = repaired frame\n\n\n\nThat is probably the **closest** direct answer to your actual question.\n\n### B. Local face repair/detailing: the practical fallback\n\nThis is the Impact Pack route.\n\nRelevant repo:\n\n  * ComfyUI Impact Pack\n\n\n\nImportant nodes:\n\n  * `MaskPainter` — draw the mask\n  * `FaceDetailer` — detect faces and improve them\n  * `MaskDetailer` — inpaint only the masked area with a detailer pass\n\n\n\nWhy it is relevant:\n\n  * it matches the “keep the frame, only fix the face” logic very well\n  * it is a great fallback if ReActor is awkward or not the right fit\n  * it is especially useful if the face is not just the wrong person but also a bit damaged, blurry, or structurally off\n\n\n\n### C. 
Reference-guided identity repair: the most conceptually accurate route\n\nThis is the IPAdapter FaceID-style idea.\n\nRelevant repo:\n\n  * ComfyUI IPAdapter Plus\n\n\n\nWhy it is relevant:\n\n  * this is the clearest answer to “how do I use a second image to guide the face repair?”\n  * the second face image becomes an identity reference, not just a prompt substitute\n  * the docs emphasize that regional use is most effective through an inpainting workflow\n\n\n\nThis route is powerful, but it is more setup-heavy than the other two.\n\n* * *\n\n## 5) My actual recommendation for your case\n\nIf this were my setup, I would not try to solve everything inside one giant graph.\n\nI would deliberately split the work into **two workflows**.\n\n* * *\n\n## Workflow A — the main Wan video workflow\n\nThis is your existing graph.\n\nKeep it for:\n\n  * image/text/video generation\n  * positive / negative prompt control\n  * LoRAs\n  * first-frame workflows\n  * first-frame → last-frame workflows\n\n\n\nThis is your **shot generator**.\n\nRelevant docs:\n\n  * Wan2.2 Video Generation ComfyUI Official Native Workflow Example\n  * ComfyUI Wan FLF workflow\n\n\n\n* * *\n\n## Workflow B — the separate still-frame repair workflow\n\nThis is the graph you use when a shot finishes and the last frame is _almost_ right, but the face is not.\n\nUse it for:\n\n  * loading the broken frame\n  * masking only the face\n  * repairing that face with one of:\n    * plain inpaint\n    * ReActor\n    * Impact Pack\n    * reference-guided identity repair\n\n\n\nThen save the repaired frame and feed it back into the next Wan shot.\n\nThis is your **continuity repair tool**.\n\nThat split is extremely important.\n\n### Why I recommend two workflows\n\nBecause it gives each graph one clear job:\n\n  * **Workflow A** creates shots\n  * **Workflow B** repairs bridge frames\n\n\n\nThat is much easier to understand and much easier to debug than an all-in-one “do everything” workflow.\n\n* * *\n\n## 
6) Repair vs recreate: the rule that will save you the most time\n\nThis is the rule I would use.\n\n### Repair when:\n\n  * the frame is already mostly good\n  * the body pose is right\n  * the lighting is right\n  * the composition is right\n  * the background / bench is right\n  * only the face or a tiny area drifted\n\n\n\n### Recreate when:\n\n  * the pose is wrong\n  * the camera is wrong\n  * the sit-down motion is wrong\n  * multiple frames in a row are bad\n  * fixing the face would still leave the shot unusable\n\n\n\nFor your project, that usually means:\n\n  * **walk** : repair the last frame if only the face drifted\n  * **approach bench** : same\n  * **sit-down transition** : usually **recreate** with FLF, not patch frame-by-frame\n  * **seated shot** : repair isolated face drift, recreate bad staging\n\n\n\nThis is the production logic I would trust.\n\n* * *\n\n## 7) The exact answer to “what am I doing wrong?”\n\nI do not think you are doing the wrong _operation_.\n\nI think you are trying to solve a **reference-guided identity repair** task with a **text-only generic inpainting tutorial**.\n\nThat is the mismatch.\n\nYou are not failing because you do not understand masking.\n\nYou are failing because your actual task needs one more ingredient:\n\n  * a **second-image identity mechanism**\n\n\n\nThat mechanism is usually:\n\n  * **ReActor** as a swap source\n  * or **IPAdapter FaceID** as a reference-conditioning source\n  * or **Impact Pack** as a local face repair route when the face is damaged and needs controlled correction\n\n\n\nThat is the real missing piece.\n\n* * *\n\n## 8) The most practical next steps I would recommend\n\n### Step 1\n\nKeep your current Wan graph as your **main shot workflow**.\n\n### Step 2\n\nStop worrying about forcing a classic checkpoint node into that graph.\n\n### Step 3\n\nCreate a separate **frame repair workflow**.\n\nStart with the official inpaint tutorial just to make sure the mask/local-rebuild logic is 
completely clear:\n\n  * ComfyUI Inpainting Workflow\n\n\n\nThen decide:\n\n  * slight face drift → plain masked face inpaint\n  * wrong person → ReActor\n  * local face repair / fallback → Impact Pack\n  * strong second-image identity guidance → IPAdapter FaceID-style route\n\n\n\n### Step 4\n\nUse FLF for your sit-down transition.\n\nThat is a cleaner solution than trying to patch a bad sit-down clip.\n\n### Step 5\n\nDo not default to native 24 FPS everywhere.\nTest:\n\n  * short clips\n  * moderate native FPS\n  * interpolation later\n\n\n\nSee:\n\n  * ComfyUI frame interpolation workflow\n\n\n\n* * *\n\n## 9) The one-sentence summary\n\n**Your Wan workflow should generate shots; your separate repair workflow should fix bad bridge frames using a mask plus either a swap source, a face-detail pass, or a second-image identity guide.**\n\nThat is the cleanest way to make the whole thing click.\n\n* * *\n\n## 10) The exact links I would keep open\n\nMain generation:\n\n  * Wan2.2 Video Generation ComfyUI Official Native Workflow Example\n  * ComfyUI Wan FLF workflow\n\n\n\nRepair / continuity:\n\n  * ComfyUI Inpainting Workflow\n  * ComfyUI-ReActor\n  * ComfyUI Impact Pack\n  * ComfyUI IPAdapter Plus\n\n\n\nLater / heavier options:\n\n  * WanFunInpaintToVideo node docs\n  * Wan VACE To Video node docs\n\n\n\nUtilities:\n\n  * ComfyUI frame interpolation workflow\n\n\n\n* * *\n\n## Final recommendation\n\nIf this were my setup, I would do this:\n\n  1. **keep** the current Wan 2.2 graph\n  2. **do not** add a classic checkpoint loader to it\n  3. build one separate **still-frame repair graph**\n  4. use that graph to fix bad **bridge frames**\n  5. use **FLF** for the sit-down transition\n  6. use **frame interpolation** instead of brute-forcing native 24 FPS everywhere\n  7. only later consider heavier clip-editing or training workflows\n\n\n\nThat is the simplest, cleanest, least frustrating path from where you are now.",
  "title": "1st movie clip!"
}