How to convert a single safetensors file to PEFT format
Oh! I’ve rewritten it for Qwen-Image. From what I’ve tested so far, it seems that tensors with keys other than mlp* can be converted. However, it’s unclear whether LoRA will actually work with the converter below…
Qwen-Image support is mostly a naming/prefix problem.
vLLM-Omni diffusion LoRAs must be a PEFT adapter directory (adapter_config.json + adapter_model.safetensors). (vLLM)
vLLM is strict about module-name suffixes and PEFT key naming, and it breaks on *.to_out.0.* unless you normalize it to *.to_out.*. (GitHub)
For Qwen-Image specifically, the pipeline loads transformer weights under a transformer. prefix, and the pipeline has a self.transformer = QwenImageTransformer2DModel(...). (GitHub)
The Qwen-Image transformer also exposes packed projection shard mappings and normalizes .to_out.0. → .to_out. when loading weights. (GitHub)
Below is a rewritten version of the gist that adds a Qwen-Image converter for ComfyUI-style keys like:
transformer_blocks.N.attn.to_q.lora_down.weight
It converts them into PEFT keys like:
base_model.model.transformer.transformer_blocks.N.attn.to_q.lora_A.weight
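The renaming above can be sketched as one small function. This is only an illustration of the mapping (lora_down/lora_up to lora_A/lora_B, the base_model.model.transformer. prefix, and the to_out.0 normalization); the helper name comfyui_key_to_peft is hypothetical and is not part of vLLM-Omni or PEFT.

```python
import re
from typing import Optional

# Prefix the full script uses for Qwen-Image (PREFIX_QWEN there).
PEFT_PREFIX = "base_model.model.transformer."

_KEY_RE = re.compile(r"^(transformer_blocks\.\d+\..+?)\.(lora_down|lora_up)\.weight$")
_DIRECTION = {"lora_down": "lora_A", "lora_up": "lora_B"}


def comfyui_key_to_peft(key: str) -> Optional[str]:
    """Hypothetical helper: map one ComfyUI-style key to a PEFT-style key.

    Returns None for keys that do not look like LoRA weights (e.g. *.alpha).
    """
    m = _KEY_RE.match(key)
    if m is None:
        return None
    # Drop the trailing ".0" that Sequential/ModuleList wrappers add to the output projections.
    module_path = re.sub(r"\.(to_out|to_add_out)\.0$", r".\1", m.group(1))
    return f"{PEFT_PREFIX}{module_path}.{_DIRECTION[m.group(2)]}.weight"


print(comfyui_key_to_peft("transformer_blocks.3.attn.to_out.0.lora_up.weight"))
```

Unlike the full script, this sketch applies no allow-list of module names, so it renames every matching key.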
Rewritten script (drop-in, supports Qwen-Image)
#!/usr/bin/env python3
"""
comfyui-to-vllm-omni-qwenimage.py

Convert ComfyUI-style Qwen-Image LoRA safetensors (lora_down/lora_up) into a PEFT
adapter folder accepted by the vLLM-Omni diffusion LoRA loader.

Why this works:
- vLLM-Omni requires the PEFT adapter directory format (adapter_config.json + adapter_model.safetensors).
  https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/lora/
- vLLM expects lora_A/lora_B naming; ComfyUI uses lora_down/lora_up.
- vLLM has a known failure for ModuleList/Sequential numeric indices like "to_out.0".
  Fix by rewriting to "to_out". https://github.com/vllm-project/vllm/issues/35734
- Qwen-Image pipeline loads transformer weights with prefix "transformer." and defines self.transformer.
  https://raw.githubusercontent.com/vllm-project/vllm-omni/main/vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image.py
- Qwen-Image transformer exposes packed shard mapping and normalizes ".to_out.0." -> ".to_out." in load_weights.
  https://raw.githubusercontent.com/vllm-project/vllm-omni/main/vllm_omni/diffusion/models/qwen_image/qwen_image_transformer.py
"""

import argparse
import json
import re
import sys
from pathlib import Path
import torch
from safetensors.torch import load_file, save_file
# -------------------------
# Qwen-Image settings
# -------------------------
# vLLM strips "base_model.model." internally, and Qwen-Image modules live under "transformer.*"
# (pipeline uses prefix="transformer." and assigns self.transformer=QwenImageTransformer2DModel)
PREFIX_QWEN = "base_model.model.transformer."
# Attention-only by default (recommended). You can optionally include MLP keys with --include-mlp.
ALLOWED_QWEN_PREFIXES_ATTN = (
    "attn.to_q",
    "attn.to_k",
    "attn.to_v",
    "attn.to_out",
    "attn.add_q_proj",
    "attn.add_k_proj",
    "attn.add_v_proj",
    "attn.to_add_out",  # present in Qwen-Image-Lightning
)

# Optional MLP keys observed in Qwen-Image-Lightning (ComfyUI-style)
ALLOWED_QWEN_PREFIXES_MLP = (
    "img_mlp.net.0.proj",
    "img_mlp.net.2",
    "txt_mlp.net.0.proj",
    "txt_mlp.net.2",
)

# PEFT config fields vLLM-Omni documents as important: r, lora_alpha, target_modules, base_model_name_or_path
# https://docs.vllm.ai/projects/vllm-omni/en/latest/user_guide/diffusion/lora/
QWEN_TARGET_MODULES_ATTN = [
    "to_q", "to_k", "to_v", "to_out",
    "add_q_proj", "add_k_proj", "add_v_proj",
    "to_add_out",
    # packed names are fine to include even if unused:
    "to_qkv", "add_kv_proj",
]

# If you include MLP keys, vLLM will validate suffixes against expected modules.
# net.2 can be tricky; keep it optional.
QWEN_TARGET_MODULES_MLP = [
    "proj",
    # caution: module suffix may be "2" for net.2; only enable if your vLLM-Omni build expects it
    "2",
]

ADAPTER_CONFIG_TEMPLATE = {
    "peft_type": "LORA",
    "bias": "none",
    "inference_mode": True,
    "lora_dropout": 0.0,
    "r": None,
    "lora_alpha": None,
    "target_modules": None,
    "base_model_name_or_path": None,
}

# -------------------------
# Helpers
# -------------------------
def _remap_direction(direction: str) -> str:
    """lora_down -> lora_A, lora_up -> lora_B"""
    if direction == "lora_down":
        return "lora_A"
    if direction == "lora_up":
        return "lora_B"
    return direction


def _normalize_modulelist_indices(frag: str) -> str:
    """
    Fix vLLM numeric-index issue:
      attn.to_out.0 -> attn.to_out
    Similar normalization exists in Qwen-Image transformer's load_weights. (see qwen_image_transformer.py)
    """
    frag = frag.replace("attn.to_out.0", "attn.to_out")
    frag = frag.replace("attn.to_add_out.0", "attn.to_add_out")
    return frag

def detect_format(keys: list[str]) -> str:
    sample = [k for k in keys if not k.endswith(".alpha")][:50]
    # Qwen-Image-Lightning (ComfyUI style) looks like:
    #   transformer_blocks.N.attn.to_q.lora_down.weight
    if any(re.match(r"^transformer_blocks\.\d+\..+\.(lora_down|lora_up)\.weight$", k) for k in sample):
        return "qwen_transformer_blocks_comfyui"
    return "unknown"

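def extract_rank_and_alpha(tensors: dict[str, torch.Tensor]) -> tuple[int, float]:
    # Use the first readable ".alpha" tensor as the global lora_alpha.
    alpha = None
    for k, v in tensors.items():
        if k.endswith(".alpha"):
            try:
                alpha = float(v.item())
                break
            except Exception:
                pass
    # The rank is the first dimension of any lora_down weight.
    r = None
    for k, v in tensors.items():
        if k.endswith(".lora_down.weight") and hasattr(v, "shape"):
            r = int(v.shape[0])
            break
    if r is None:
        # This script has no --rank flag, so report the missing tensor instead.
        raise ValueError("Could not infer LoRA rank r: no '.lora_down.weight' tensor found.")
    if alpha is None:
        # No alpha stored: fall back to alpha == r.
        alpha = float(r)
    return r, alpha
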
# -------------------------
# Converter: Qwen-Image transformer_blocks.* (ComfyUI lora_down/lora_up)
# -------------------------
def convert_qwen_transformer_blocks_comfyui(
    tensors: dict[str, torch.Tensor],
    include_mlp: bool,
    dtype: torch.dtype,
) -> tuple[dict[str, torch.Tensor], list[str]]:
    out: dict[str, torch.Tensor] = {}
    unmapped: list[str] = []
    allowed_prefixes = ALLOWED_QWEN_PREFIXES_ATTN + (ALLOWED_QWEN_PREFIXES_MLP if include_mlp else ())
    pat = re.compile(r"^transformer_blocks\.(\d+)\.(.+?)\.(lora_down|lora_up)\.weight$")
    for k, v in tensors.items():
        if k.endswith(".alpha"):
            continue
        m = pat.match(k)
        if not m:
            unmapped.append(k)
            continue
        block_idx = int(m.group(1))
        frag = _normalize_modulelist_indices(m.group(2))
        direction = m.group(3)
        if not frag.startswith(allowed_prefixes):
            unmapped.append(k)
            continue
        ab = _remap_direction(direction)
        new_key = f"{PREFIX_QWEN}transformer_blocks.{block_idx}.{frag}.{ab}.weight"
        if v.dtype != dtype:
            v = v.to(dtype)
        out[new_key] = v
    # Final safety: remove any leftover ".to_out.0." in full key
    fixed: dict[str, torch.Tensor] = {}
    for k, v in out.items():
        nk = k.replace(".to_out.0.", ".to_out.").replace(".to_add_out.0.", ".to_add_out.")
        fixed[nk] = v
    return fixed, unmapped

# -------------------------
# Main
# -------------------------
def main():
    # Pass the text as description=; the first positional argument of ArgumentParser is prog.
    ap = argparse.ArgumentParser(description="Convert ComfyUI Qwen-Image LoRA -> vLLM-Omni PEFT adapter dir")
    ap.add_argument("--input", required=True, help="Input LoRA .safetensors")
    ap.add_argument("--output", required=True, help="Output adapter directory")
    ap.add_argument("--base-model", default="Qwen/Qwen-Image", help="base_model_name_or_path in adapter_config.json")
    ap.add_argument("--dtype", choices=["bf16", "fp16", "fp32"], default="bf16")
    ap.add_argument("--include-mlp", action="store_true", help="Also convert img_mlp/txt_mlp LoRA keys (may fail if vLLM expects different suffixes)")
    args = ap.parse_args()

    dtype_map = {"bf16": torch.bfloat16, "fp16": torch.float16, "fp32": torch.float32}
    out_dtype = dtype_map[args.dtype]

    in_path = Path(args.input)
    if not in_path.exists():
        sys.exit(f"[ERROR] Input not found: {in_path}")

    print(f"[INFO] Loading: {in_path}")
    tensors = load_file(str(in_path))
    keys = list(tensors.keys())

    fmt = detect_format(keys)
    print(f"[INFO] Detected format: {fmt}")
    if fmt != "qwen_transformer_blocks_comfyui":
        sys.exit(
            "[ERROR] This rewrite currently targets Qwen-Image ComfyUI keys like:\n"
            "  transformer_blocks.N.attn.to_q.lora_down.weight\n"
            "If your keys differ, paste 30 keys and adjust detect_format/regex."
        )

    r, alpha = extract_rank_and_alpha(tensors)
    print(f"[INFO] Inferred r={r}, lora_alpha={alpha}")

    converted, unmapped = convert_qwen_transformer_blocks_comfyui(
        tensors=tensors,
        include_mlp=args.include_mlp,
        dtype=out_dtype,
    )
    print(f"[INFO] Converted tensors: {len(converted)}")
    if unmapped:
        print(f"[WARN] Unmapped keys: {len(unmapped)} (showing first 20)")
        for k in unmapped[:20]:
            print("  ", k)

    out_dir = Path(args.output)
    out_dir.mkdir(parents=True, exist_ok=True)

    cfg = dict(ADAPTER_CONFIG_TEMPLATE)
    cfg["r"] = int(r)
    cfg["lora_alpha"] = float(alpha)
    cfg["base_model_name_or_path"] = args.base_model
    cfg["target_modules"] = (
        QWEN_TARGET_MODULES_ATTN + (QWEN_TARGET_MODULES_MLP if args.include_mlp else [])
    )

    (out_dir / "adapter_config.json").write_text(json.dumps(cfg, indent=2), encoding="utf-8")
    save_file(converted, str(out_dir / "adapter_model.safetensors"))

    print(f"[DONE] Wrote PEFT adapter dir: {out_dir}")
    print("  - adapter_config.json")
    print("  - adapter_model.safetensors")


if __name__ == "__main__":
    main()
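If the script exits with the "paste 30 keys" message, a compact summary of the key shapes is usually more useful than a raw key dump. The sketch below collapses the block index to N and counts each resulting shape; summarize_key_families is a hypothetical helper, and the sample keys are made up for illustration.

```python
import re
from collections import Counter


def summarize_key_families(keys):
    """Replace the first numeric path component (the block index) with 'N' and count shapes."""
    return Counter(re.sub(r"\.\d+\.", ".N.", k, count=1) for k in keys)


# Illustrative keys only; with a real file you could pass load_file(path).keys() instead.
sample_keys = [
    "transformer_blocks.0.attn.to_q.lora_down.weight",
    "transformer_blocks.1.attn.to_q.lora_down.weight",
    "transformer_blocks.0.attn.to_out.0.lora_up.weight",
    "transformer_blocks.0.img_mlp.net.2.lora_up.weight",
]
for family, count in sorted(summarize_key_families(sample_keys).items()):
    print(f"{count:4d}  {family}")
```

Only the first index is collapsed, so suffixes such as to_out.0 and net.2 stay visible; those are the parts that tend to need special handling.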
Usage (for Qwen-Image-Lightning)
python comfyui-to-vllm-omni-qwenimage.py \
  --input Qwen-Image-Lightning-8steps-V2.0-bf16.safetensors \
  --output ./out_adapter \
  --dtype bf16 \
  --base-model Qwen/Qwen-Image
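After running the command, it may be worth inspecting the output directory before pointing vLLM-Omni at it. The sketch below uses only the standard library: a safetensors file starts with an 8-byte little-endian header length followed by a JSON header, so key names and shapes can be read without loading any tensor data. check_adapter_dir is a hypothetical helper, and the checks it runs (expected prefix, lora_A/lora_B naming, rank consistent with r) are assumptions about what is useful to verify, not vLLM-Omni's own validation.

```python
import json
import struct
from pathlib import Path

# Assumption: this matches PREFIX_QWEN in the converter script.
EXPECTED_PREFIX = "base_model.model.transformer."


def read_safetensors_header(path):
    """Return {tensor_name: {"dtype", "shape", "data_offsets"}} without loading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64 header size
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def check_adapter_dir(adapter_dir):
    """Hypothetical helper: list problems found in a converted adapter directory."""
    adapter_dir = Path(adapter_dir)
    cfg = json.loads((adapter_dir / "adapter_config.json").read_text(encoding="utf-8"))
    header = read_safetensors_header(adapter_dir / "adapter_model.safetensors")
    problems = []
    for name, info in header.items():
        shape = info["shape"]
        if not name.startswith(EXPECTED_PREFIX):
            problems.append(f"unexpected prefix: {name}")
        # lora_A is (r, in_features); lora_B is (out_features, r).
        if name.endswith(".lora_A.weight") and shape[0] != cfg["r"]:
            problems.append(f"rank mismatch in {name}: {shape}")
        elif name.endswith(".lora_B.weight") and shape[1] != cfg["r"]:
            problems.append(f"rank mismatch in {name}: {shape}")
        elif not name.endswith((".lora_A.weight", ".lora_B.weight")):
            problems.append(f"not a lora_A/lora_B key: {name}")
    return problems
```

Calling check_adapter_dir("./out_adapter") returns an empty list when none of these checks fail; that does not by itself show the LoRA will load or have any effect in vLLM-Omni.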
Why this matches Qwen-Image in vLLM-Omni
- It writes LoRA keys under ...transformer..., which aligns with the Qwen-Image pipeline's weight source prefix, prefix="transformer.", and with self.transformer = QwenImageTransformer2DModel(...). (GitHub)
- It keeps to_q/to_k/to_v and add_q_proj/add_k_proj/add_v_proj, which align with the Qwen-Image transformer's packed shard mapping (to_qkv shards and add_kv_proj shards). (GitHub)
- It normalizes to_out.0 → to_out to avoid the known vLLM numeric-index LoRA failure. (GitHub)
- It outputs the PEFT adapter folder vLLM-Omni requires. (vLLM)
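Because vLLM is described above as strict about module-name suffixes, one quick sanity check is to compare the last path component of every converted key with target_modules in adapter_config.json. The sketch below assumes matching is done on that last component; whether vLLM-Omni matches exactly this way is an assumption, and both helper names are hypothetical.

```python
def module_suffixes(peft_keys):
    """Collect the last module-path component of each lora_A/lora_B key."""
    suffixes = set()
    for key in peft_keys:
        for marker in (".lora_A.weight", ".lora_B.weight"):
            if key.endswith(marker):
                module_path = key[: -len(marker)]
                suffixes.add(module_path.rsplit(".", 1)[-1])
    return suffixes


def missing_target_modules(peft_keys, target_modules):
    """Suffixes that appear in the weights but are not listed in target_modules."""
    return sorted(module_suffixes(peft_keys) - set(target_modules))


# Illustrative keys: an attention projection and the second MLP layer (net.2).
keys = [
    "base_model.model.transformer.transformer_blocks.0.attn.to_q.lora_A.weight",
    "base_model.model.transformer.transformer_blocks.0.img_mlp.net.2.lora_B.weight",
]
print(missing_target_modules(keys, ["to_q", "to_k", "to_v", "to_out"]))
```

Under that assumption, the net.2 key has the bare suffix "2", which is why the script only lists "2" in target_modules when --include-mlp is set.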