How to convert a single safetensors file to PEFT format

Hugging Face Forums [Unofficial] March 16, 2026

Oh! I’ve rewritten it for Qwen-Image. From what I’ve tested so far, it seems that tensors with keys other than mlp* can be converted. However, it’s unclear whether LoRA will actually work with the converter below…

Qwen-Image support is mostly a naming/prefix problem.

vLLM-Omni diffusion LoRAs must be a PEFT adapter directory (adapter_config.json + adapter_model.safetensors). (vLLM)
vLLM is strict about module-name suffixes and PEFT key naming, and it breaks on *.to_out.0.* unless you normalize it to *.to_out.*. (GitHub)
For Qwen-Image specifically, the pipeline loads transformer weights under a "transformer." prefix and assigns self.transformer = QwenImageTransformer2DModel(...). (GitHub)
The Qwen-Image transformer also exposes packed projection shard mappings and normalizes .to_out.0. → .to_out. when loading weights. (GitHub)

Below is a rewritten version of the gist that adds a Qwen-Image converter for ComfyUI-style keys like:

transformer_blocks.N.attn.to_q.lora_down.weight

It converts them into PEFT keys like:

base_model.model.transformer.transformer_blocks.N.attn.to_q.lora_A.weight
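The renaming can be sketched in isolation, without loading any tensors. The snippet below is a minimal illustration of the mapping described above, not the converter itself: the function name comfyui_key_to_peft is made up for this example, and it assumes the key shape and the base_model.model.transformer. prefix shown above.

```python
import re
from typing import Optional

# Assumed prefix, taken from the PEFT key example above.
PEFT_PREFIX = "base_model.model.transformer."

# ComfyUI-style key: transformer_blocks.<N>.<module path>.(lora_down|lora_up).weight
_KEY_RE = re.compile(r"^transformer_blocks\.(\d+)\.(.+?)\.(lora_down|lora_up)\.weight$")

_DIRECTION = {"lora_down": "lora_A", "lora_up": "lora_B"}


def comfyui_key_to_peft(key: str) -> Optional[str]:
    """Return the PEFT-style key, or None if the key does not match the pattern.

    Hypothetical helper for illustration only.
    """
    m = _KEY_RE.match(key)
    if m is None:
        return None
    block, module, direction = m.groups()
    # Drop the ModuleList index so "attn.to_out.0" becomes "attn.to_out".
    if module.endswith(".to_out.0"):
        module = module[: -len(".0")]
    return f"{PEFT_PREFIX}transformer_blocks.{block}.{module}.{_DIRECTION[direction]}.weight"


if __name__ == "__main__":
    for k in (
        "transformer_blocks.0.attn.to_q.lora_down.weight",
        "transformer_blocks.7.attn.to_out.0.lora_up.weight",
        "transformer_blocks.0.attn.to_q.alpha",
    ):
        print(k, "->", comfyui_key_to_peft(k))
```

Keys that do not fit the pattern (for example the .alpha entries) come back as None, so a caller can collect them as unmapped instead of silently dropping them.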

Rewritten script (drop-in, supports Qwen-Image)

#!/usr/bin/env python3
"""
comfyui-to-vllm-omni-qwenimage.py

Convert ComfyUI-style Qwen-Image LoRA safetensors (lora_down/lora_up) into a PEFT
adapter folder accepted by vLLM-Omni diffusion LoRA loader.

Why this works:
- vLLM-Omni requires PEFT adapter directory format. (adapter_config.json + adapter_model.safetensors)
  https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/lora/
- vLLM expects lora_A/lora_B naming; ComfyUI uses lora_down/lora_up.
- vLLM has a known failure for ModuleList/Sequential numeric indices like "to_out.0".
  Fix by rewriting to "to_out". https://github.com/vllm-project/vllm/issues/35734
- Qwen-Image pipeline loads transformer weights with prefix "transformer." and defines self.transformer.
  https://raw.githubusercontent.com/vllm-project/vllm-omni/main/vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image.py
- Qwen-Image transformer exposes packed shard mapping and normalizes ".to_out.0." -> ".to_out." in load_weights.
  https://raw.githubusercontent.com/vllm-project/vllm-omni/main/vllm_omni/diffusion/models/qwen_image/qwen_image_transformer.py
"""

import argparse
import json
import re
import sys
from pathlib import Path

import torch
from safetensors.torch import load_file, save_file


# -------------------------
# Qwen-Image settings
# -------------------------

# vLLM strips "base_model.model." internally, and Qwen-Image modules live under "transformer.*"
# (pipeline uses prefix="transformer." and assigns self.transformer=QwenImageTransformer2DModel)
PREFIX_QWEN = "base_model.model.transformer."

# Attention-only by default (recommended). You can optionally include MLP keys with --include-mlp.
ALLOWED_QWEN_PREFIXES_ATTN = (
    "attn.to_q",
    "attn.to_k",
    "attn.to_v",
    "attn.to_out",
    "attn.add_q_proj",
    "attn.add_k_proj",
    "attn.add_v_proj",
    "attn.to_add_out",  # present in Qwen-Image-Lightning
)

# Optional MLP keys observed in Qwen-Image-Lightning (ComfyUI-style)
ALLOWED_QWEN_PREFIXES_MLP = (
    "img_mlp.net.0.proj",
    "img_mlp.net.2",
    "txt_mlp.net.0.proj",
    "txt_mlp.net.2",
)

# PEFT config fields vLLM-Omni documents as important: r, lora_alpha, target_modules, base_model_name_or_path
# https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/lora/
QWEN_TARGET_MODULES_ATTN = [
    "to_q", "to_k", "to_v", "to_out",
    "add_q_proj", "add_k_proj", "add_v_proj",
    "to_add_out",
    # packed names are fine to include even if unused:
    "to_qkv", "add_kv_proj",
]

# If you include MLP keys, vLLM will validate suffixes against expected modules.
# net.2 can be tricky; keep it optional.
QWEN_TARGET_MODULES_MLP = [
    "proj",
    # caution: module suffix may be "2" for net.2; only enable if your vLLM-Omni build expects it
    "2",
]

ADAPTER_CONFIG_TEMPLATE = {
    "peft_type": "LORA",
    "bias": "none",
    "inference_mode": True,
    "lora_dropout": 0.0,
    "r": None,
    "lora_alpha": None,
    "target_modules": None,
    "base_model_name_or_path": None,
}


# -------------------------
# Helpers
# -------------------------

def _remap_direction(direction: str) -> str:
    """lora_down -> lora_A, lora_up -> lora_B"""
    if direction == "lora_down":
        return "lora_A"
    if direction == "lora_up":
        return "lora_B"
    return direction


def _normalize_modulelist_indices(frag: str) -> str:
    """
    Fix vLLM numeric-index issue:
      attn.to_out.0 -> attn.to_out
    Similar normalization exists in Qwen-Image transformer's load_weights. (see qwen_image_transformer.py)
    """
    frag = frag.replace("attn.to_out.0", "attn.to_out")
    frag = frag.replace("attn.to_add_out.0", "attn.to_add_out")
    return frag


def detect_format(keys: list[str]) -> str:
    """Return a format tag based on the first 50 non-alpha keys ("unknown" if none match)."""
    sample = [k for k in keys if not k.endswith(".alpha")][:50]
    # Qwen-Image-Lightning (ComfyUI style) looks like:
    # transformer_blocks.N.attn.to_q.lora_down.weight
    if any(re.match(r"^transformer_blocks\.\d+\..+\.(lora_down|lora_up)\.weight$", k) for k in sample):
        return "qwen_transformer_blocks_comfyui"
    return "unknown"


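def extract_rank_and_alpha(tensors: dict[str, torch.Tensor]) -> tuple[int, float]:
    """
    Infer r from the first lora_down weight (shape [r, in_features]) and alpha from
    the first ".alpha" tensor. Assumes every module shares the same r and alpha.
    If no alpha is stored, fall back to alpha = r (LoRA scale alpha / r = 1.0).
    """
    alpha = None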
    for k, v in tensors.items():
        if k.endswith(".alpha"):
            try:
                alpha = float(v.item())
                break
            except Exception:
                pass

    r = None
    for k, v in tensors.items():
        if k.endswith(".lora_down.weight") and hasattr(v, "shape"):
            r = int(v.shape[0])
            break

    if r is None:
        raise ValueError("Could not infer LoRA rank r: no '*.lora_down.weight' tensor found.")
    if alpha is None:
        alpha = float(r)
    return r, alpha


# -------------------------
# Converter: Qwen-Image transformer_blocks.* (ComfyUI lora_down/lora_up)
# -------------------------

def convert_qwen_transformer_blocks_comfyui(
    tensors: dict[str, torch.Tensor],
    include_mlp: bool,
    dtype: torch.dtype,
) -> tuple[dict[str, torch.Tensor], list[str]]:
    out: dict[str, torch.Tensor] = {}
    unmapped: list[str] = []

    allowed_prefixes = ALLOWED_QWEN_PREFIXES_ATTN + (ALLOWED_QWEN_PREFIXES_MLP if include_mlp else ())

    pat = re.compile(r"^transformer_blocks\.(\d+)\.(.+?)\.(lora_down|lora_up)\.weight$")

    for k, v in tensors.items():
        if k.endswith(".alpha"):
            continue

        m = pat.match(k)
        if not m:
            unmapped.append(k)
            continue

        block_idx = int(m.group(1))
        frag = _normalize_modulelist_indices(m.group(2))
        direction = m.group(3)

        if not frag.startswith(allowed_prefixes):
            unmapped.append(k)
            continue

        ab = _remap_direction(direction)
        new_key = f"{PREFIX_QWEN}transformer_blocks.{block_idx}.{frag}.{ab}.weight"

        if v.dtype != dtype:
            v = v.to(dtype)
        out[new_key] = v

    # Final safety: remove any leftover ".to_out.0." in full key
    fixed: dict[str, torch.Tensor] = {}
    for k, v in out.items():
        nk = k.replace(".to_out.0.", ".to_out.").replace(".to_add_out.0.", ".to_add_out.")
        fixed[nk] = v

    return fixed, unmapped


# -------------------------
# Main
# -------------------------

def main():
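    # Pass the text as description=; the first positional argument of ArgumentParser is prog.
    ap = argparse.ArgumentParser(description="Convert ComfyUI Qwen-Image LoRA -> vLLM-Omni PEFT adapter dir")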
    ap.add_argument("--input", required=True, help="Input LoRA .safetensors")
    ap.add_argument("--output", required=True, help="Output adapter directory")
    ap.add_argument("--base-model", default="Qwen/Qwen-Image", help="base_model_name_or_path in adapter_config.json")
    ap.add_argument("--dtype", choices=["bf16", "fp16", "fp32"], default="bf16")
    ap.add_argument("--include-mlp", action="store_true", help="Also convert img_mlp/txt_mlp LoRA keys (may fail if vLLM expects different suffixes)")
    args = ap.parse_args()

    dtype_map = {"bf16": torch.bfloat16, "fp16": torch.float16, "fp32": torch.float32}
    out_dtype = dtype_map[args.dtype]

    in_path = Path(args.input)
    if not in_path.exists():
        sys.exit(f"[ERROR] Input not found: {in_path}")

    print(f"[INFO] Loading: {in_path}")
    tensors = load_file(str(in_path))
    keys = list(tensors.keys())

    fmt = detect_format(keys)
    print(f"[INFO] Detected format: {fmt}")
    if fmt != "qwen_transformer_blocks_comfyui":
        sys.exit(
            "[ERROR] This rewrite currently targets Qwen-Image ComfyUI keys like:\n"
            "  transformer_blocks.N.attn.to_q.lora_down.weight\n"
            "If your keys differ, paste 30 keys and adjust detect_format/regex."
        )

    r, alpha = extract_rank_and_alpha(tensors)
    print(f"[INFO] Inferred r={r}, lora_alpha={alpha}")

    converted, unmapped = convert_qwen_transformer_blocks_comfyui(
        tensors=tensors,
        include_mlp=args.include_mlp,
        dtype=out_dtype,
    )

    print(f"[INFO] Converted tensors: {len(converted)}")
    if unmapped:
        print(f"[WARN] Unmapped keys: {len(unmapped)} (showing first 20)")
        for k in unmapped[:20]:
            print("   ", k)

    out_dir = Path(args.output)
    out_dir.mkdir(parents=True, exist_ok=True)

    cfg = dict(ADAPTER_CONFIG_TEMPLATE)
    cfg["r"] = int(r)
    cfg["lora_alpha"] = float(alpha)
    cfg["base_model_name_or_path"] = args.base_model
    cfg["target_modules"] = (
        QWEN_TARGET_MODULES_ATTN + (QWEN_TARGET_MODULES_MLP if args.include_mlp else [])
    )

    (out_dir / "adapter_config.json").write_text(json.dumps(cfg, indent=2), encoding="utf-8")
    save_file(converted, str(out_dir / "adapter_model.safetensors"))

    print(f"[DONE] Wrote PEFT adapter dir: {out_dir}")
    print("       - adapter_config.json")
    print("       - adapter_model.safetensors")


if __name__ == "__main__":
    main()

Usage (for Qwen-Image-Lightning)

python comfyui-to-vllm-omni-qwenimage.py \
  --input Qwen-Image-Lightning-8steps-V2.0-bf16.safetensors \
  --output ./out_adapter \
  --dtype bf16 \
  --base-model Qwen/Qwen-Image
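
After the run, it can help to look at what actually landed in the output directory before pointing vLLM-Omni at it. The sketch below reads adapter_config.json and lists tensor names from the safetensors header using only the standard library (a safetensors file starts with an 8-byte little-endian header length followed by a JSON header). The helper names are made up for this example and the ./out_adapter path is carried over from the usage above; this only inspects the files and does not show that the adapter will load.

```python
import json
import struct
from pathlib import Path


def read_safetensors_keys(path: Path) -> list[str]:
    """List tensor names from a safetensors header without loading tensor data."""
    with path.open("rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return sorted(k for k in header if k != "__metadata__")


def summarize_adapter(adapter_dir: Path) -> dict:
    """Hypothetical helper: collect config fields and key count for a quick look."""
    cfg = json.loads((adapter_dir / "adapter_config.json").read_text(encoding="utf-8"))
    keys = read_safetensors_keys(adapter_dir / "adapter_model.safetensors")
    return {
        "r": cfg.get("r"),
        "lora_alpha": cfg.get("lora_alpha"),
        "target_modules": cfg.get("target_modules"),
        "num_tensors": len(keys),
        "first_keys": keys[:5],
    }


if __name__ == "__main__":
    out_dir = Path("./out_adapter")  # same path as in the usage example
    if out_dir.is_dir():
        print(json.dumps(summarize_adapter(out_dir), indent=2))
    else:
        print(f"{out_dir} not found; run the converter first")
```

If num_tensors is much smaller than the number of lora_down/lora_up tensors in the input, the converter's "[WARN] Unmapped keys" output is the first place to look.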

Why this matches Qwen-Image in vLLM-Omni

It writes LoRA keys under ...transformer..., which matches the Qwen-Image pipeline's weight-source prefix (prefix="transformer.") and its self.transformer = QwenImageTransformer2DModel(...) attribute. (GitHub)
It keeps to_q/to_k/to_v and add_q_proj/add_k_proj/add_v_proj, which align with the Qwen-Image transformer's packed shard mapping (to_qkv shards and add_kv_proj shards). (GitHub)
It normalizes to_out.0 to to_out to avoid the known vLLM numeric-index LoRA failure. (GitHub)
It outputs the PEFT adapter folder vLLM-Omni requires. (vLLM)
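
The points above translate into a few mechanical checks on the converted key names. The following is a rough sketch under the assumptions stated in this post (prefix base_model.model.transformer., lora_A/lora_B naming, no to_out.0). find_key_problems is a hypothetical helper, and passing these checks does not guarantee that vLLM-Omni accepts the adapter.

```python
EXPECTED_PREFIX = "base_model.model.transformer."  # assumed, as used by the script


def find_key_problems(keys: list[str]) -> list[str]:
    """Return human-readable problems found in PEFT-style LoRA key names."""
    problems: list[str] = []
    key_set = set(keys)
    for k in sorted(key_set):
        if not k.startswith(EXPECTED_PREFIX):
            problems.append(f"missing prefix: {k}")
        if ".lora_down." in k or ".lora_up." in k:
            problems.append(f"ComfyUI naming left over: {k}")
        if ".to_out.0." in k:
            problems.append(f"numeric index not normalized: {k}")
        # Every lora_A tensor should have a lora_B partner, and vice versa.
        if ".lora_A." in k and k.replace(".lora_A.", ".lora_B.") not in key_set:
            problems.append(f"no lora_B partner: {k}")
        if ".lora_B." in k and k.replace(".lora_B.", ".lora_A.") not in key_set:
            problems.append(f"no lora_A partner: {k}")
    return problems


if __name__ == "__main__":
    good = [
        EXPECTED_PREFIX + "transformer_blocks.0.attn.to_q.lora_A.weight",
        EXPECTED_PREFIX + "transformer_blocks.0.attn.to_q.lora_B.weight",
    ]
    bad = [EXPECTED_PREFIX + "transformer_blocks.0.attn.to_out.0.lora_A.weight"]
    print(find_key_problems(good))
    for p in find_key_problems(bad):
        print(p)
```

The key list can come from any source, for example the keys of the dict returned by safetensors.torch.load_file on adapter_model.safetensors.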
