{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreihfc67g3xrisr6d5nevqiphanqwpapvzayhcpgtdalehh2sqicfue",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mimpvid2sms2"
},
"path": "/t/peft-0-18-1-crashing-when-fine-tuning/174916#post_2",
"publishedAt": "2026-04-03T20:40:56.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"PyPI",
"GitHub",
"Hugging Face"
],
"textContent": "Gemma 4 is so new—or rather, it’s still in its infancy—that I think it’ll take a little time before it works perfectly across the entire HF ecosystem…\n\nBut it looks like it’ll barely work out somehow :\n\n* * *\n\nThis is a **known upstream compatibility bug** , not a one-off mistake in your setup. There is a workaround. There is **no public release ETA** yet. As of now, PyPI still lists **PEFT 0.18.1** as the latest release, while the PEFT `main` branch has already moved to **0.18.2.dev0**. The Gemma 4 support issue is open and currently shows **no assignee, no milestone, and no linked PR** , so there is no evidence of an imminent packaged fix yet. (PyPI)\n\n## What is happening\n\nYour error:\n\n\n ValueError: Target module Gemma4ClippableLinear(...) is not supported\n\n\nis happening because PEFT LoRA only knows how to inject adapters into a specific set of module types such as `torch.nn.Linear`, `Embedding`, `Conv*`, `Conv1D`, and `MultiheadAttention`. The open PEFT Gemma 4 issue explains that **`Gemma4ClippableLinear` is a wrapper `nn.Module`, not a subclass of `nn.Linear`**, so PEFT rejects it during LoRA injection. The same issue notes that this happens **before** `exclude_modules` can protect you. (GitHub)\n\n## Why Gemma 4 E2B triggers this\n\nGemma 4 E2B is not just a plain text decoder. The official model card says Gemma 4 is **multimodal** , with **text and image input across the family** , and **audio support on the small E2B and E4B variants**. The Hugging Face Gemma 4 launch post says the same thing and describes E2B/E4B as the small variants with audio support. That matters because the PEFT issue identifies `Gemma4ClippableLinear` as being used in the **vision/audio encoder**. (Hugging Face)\n\nSo the practical root cause is usually:\n\n * you are using **broad LoRA targeting** , and\n * PEFT walks into Gemma 4’s multimodal towers,\n * then it hits `Gemma4ClippableLinear`,\n * then it crashes. 
(Hugging Face)\n\n\n\n## Why this often surprises people\n\nPEFT’s own LoRA docs recommend `target_modules=\"all-linear\"` for QLoRA-style training. That works well on many standard text-only transformer models. But PEFT also documents that `target_modules` behaves broadly:\n\n * if you pass a **string**, it uses **regex matching**,\n * if you pass a **list of strings**, it uses **exact match or suffix match**,\n * and if you pass **`\"all-linear\"`**, it targets **all linear/Conv1D modules**. (Hugging Face)\n\n\n\nOn a new multimodal model like Gemma 4 E2B, that convenience becomes a hazard. A setting that is fine for Llama-style text-only training can unintentionally reach vision or audio blocks.\n\n## Will there be an update soon\n\nThere will likely be an update **eventually**, but there is **no public timeline** I can justify from the current public state. The issue is open. It has no assignee, no milestone, and no branches or pull requests attached. That means “soon” would be speculation. (GitHub)\n\n## The best workaround depends on what you are actually fine-tuning\n\n### Case 1. You are doing **text-only fine-tuning**\n\nThis is the cleanest path, and probably the best one for most users fine-tuning E2B on text data.\n\nIn this case, the best fix is usually:\n\n 1. **Do not use `target_modules=\"all-linear\"`**\n 2. **Do not use loose suffix-only target lists** like `[\"q_proj\", \"v_proj\", \"o_proj\"]`\n 3. Build an **explicit list of full module names** from the **text backbone only**\n 4. Skip anything under vision or audio towers\n 5. Keep only actual `nn.Linear` layers in the text path\n\n\n\nThat works because PEFT will then never touch the unsupported multimodal wrapper modules. This approach is more stable than monkey-patching if your task is text-only. The PEFT docs confirm how target matching works, and the custom-models docs recommend inspecting which modules were actually adapted. 
(Hugging Face)\n\nA discovery pattern like this is the safest starting point:\n\n\n    import torch.nn as nn\n\n    TEXT_SUFFIXES = (\n        \"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n        \"gate_proj\", \"up_proj\", \"down_proj\",\n    )\n\n    def discover_text_targets(model):\n        out = []\n        for name, module in model.named_modules():\n            lname = name.lower()\n\n            # Skip multimodal branches\n            if \"vision\" in lname or \"audio\" in lname:\n                continue\n\n            # Keep only typical text LoRA targets\n            if not name.endswith(TEXT_SUFFIXES):\n                continue\n\n            # Only ordinary linear layers\n            if isinstance(module, nn.Linear):\n                out.append(name)\n\n        return out\n\n    target_modules = discover_text_targets(model)\n    print(\"\\n\".join(target_modules[:50]))\n    print(f\"Found {len(target_modules)} targets\")\n\n\nThen:\n\n\n    from peft import LoraConfig, get_peft_model\n\n    peft_config = LoraConfig(\n        r=16,\n        lora_alpha=32,\n        lora_dropout=0.05,\n        bias=\"none\",\n        task_type=\"CAUSAL_LM\",\n        target_modules=target_modules,\n    )\n\n    model = get_peft_model(model, peft_config)\n    model.print_trainable_parameters()\n\n    # Useful sanity check\n    print(model.targeted_module_names)\n\n\nThat last check matters. PEFT’s docs explicitly recommend checking trainable parameters and the targeted modules to confirm you only adapted the intended layers. (Hugging Face)\n\n### Case 2. You are doing **actual multimodal fine-tuning**\n\nIf you really want to fine-tune image or audio paths too, then the current practical workaround is the one documented in the open PEFT issue: **monkey-patch `Gemma4ClippableLinear` before loading the model** so PEFT sees a supported linear-like class. The issue includes this workaround and reports that QLoRA then proceeds normally. 
(GitHub)\n\nUse this **before** `from_pretrained()`:\n\n\n    import torch\n    import torch.nn as nn\n    from transformers.models.gemma4 import modeling_gemma4\n\n    class PatchedClippableLinear(nn.Linear):\n        def __init__(self, config, in_features, out_features):\n            nn.Linear.__init__(self, in_features, out_features, bias=False)\n            self.use_clipped_linears = getattr(config, \"use_clipped_linears\", False)\n\n            if self.use_clipped_linears:\n                self.register_buffer(\"input_min\", torch.tensor(-float(\"inf\")))\n                self.register_buffer(\"input_max\", torch.tensor(float(\"inf\")))\n                self.register_buffer(\"output_min\", torch.tensor(-float(\"inf\")))\n                self.register_buffer(\"output_max\", torch.tensor(float(\"inf\")))\n\n        def forward(self, x):\n            if self.use_clipped_linears:\n                x = torch.clamp(x, self.input_min, self.input_max)\n\n            out = nn.Linear.forward(self, x)\n\n            if self.use_clipped_linears:\n                out = torch.clamp(out, self.output_min, self.output_max)\n\n            return out\n\n    modeling_gemma4.Gemma4ClippableLinear = PatchedClippableLinear\n\n\nThis is not elegant. It is just the fastest public workaround.\n\n## A second issue you are likely to hit right after this one\n\nEven if you get past the PEFT crash, Gemma 4 currently has another training friction point: the open Transformers issue says **text-only fine-tuning still requires both `token_type_ids` and `mm_token_type_ids`**, and that adding zero tensors for both works. That issue also says that for TRL SFT, you typically need a custom collator and `remove_unused_columns=False`. (GitHub)\n\nMinimal pattern:\n\n\n    import torch\n\n    def add_type_ids(batch):\n        # All-zero type ids mark every token as plain text\n        zeros = torch.zeros_like(batch[\"input_ids\"])\n        batch[\"token_type_ids\"] = zeros\n        batch[\"mm_token_type_ids\"] = zeros\n        return batch\n\n\nAnd if you use TRL:\n\n\n    training_args.remove_unused_columns = False\n\n\nThis is a separate bug from the PEFT problem, but it is highly relevant because many people will hit it immediately after fixing the LoRA injection error. 
(GitHub)\n\n## Version guidance\n\nRight now the version picture is:\n\n * **PEFT 0.18.1** is the latest release on PyPI. (PyPI)\n * PEFT `main` already says **0.18.2.dev0**, but that is not the same as a released fix. (GitHub)\n * **Transformers 5.5.0** is the latest PyPI release, published on April 2, 2026. (PyPI)\n\n\n\nSo if you are on `peft==0.18.1`, you are **not missing a newer public PEFT release**.\n\n## What I would do in your exact situation\n\n### If your data is text-only\n\nI would do this, in order:\n\n 1. Upgrade to **`transformers==5.5.0`**\n 2. Keep **`peft==0.18.1`**\n 3. Stop using `all-linear`\n 4. Build **exact full target names** for text-only linear modules\n 5. Verify them with `print_trainable_parameters()` and `targeted_module_names`\n 6. Add zero `token_type_ids` and `mm_token_type_ids` in the collator (PyPI)\n\n\n\n### If you need multimodal LoRA now\n\nUse the monkey-patch, then add the type-id workaround, then proceed. For multimodal fine-tuning specifically, Hugging Face’s Gemma 4 launch post says `AutoModelForMultimodalLM` is the lower-level class especially useful for fine-tuning, and the built-in chat template should be used to avoid prompt-formatting mistakes. Google’s official Gemma guides also provide separate text and vision QLoRA tutorials. (Hugging Face)\n\n## Bottom line\n\nYour crash is caused by a **real PEFT incompatibility with Gemma 4’s `Gemma4ClippableLinear` modules**. There is **no public ETA** for a packaged fix. The best current workaround is:\n\n * **text-only training:** restrict LoRA to exact text decoder `nn.Linear` module names\n * **multimodal training:** monkey-patch `Gemma4ClippableLinear` before loading the model\n * in both cases, be ready to also add **`token_type_ids` and `mm_token_type_ids`** during training (GitHub)\n\n",
"title": "Peft 0.18.1 crashing when fine-tuning"
}