{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiaeyimzd6o2aryemu3bmf4yrno644b726vn3g6hsjvdwjmb7l6b2y",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmotkb4ryew2"
  },
  "path": "/t/vlm-fine-tuning-near-zero-training-loss-but-poor-inference-accuracy-on-train-set-gemma-4-e2b-it/176224#post_1",
  "publishedAt": "2026-05-25T15:22:19.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Hi everyone\n\nI am currently fine-tuning the **Gemma 4 E2B** model for a worker safety project. My goal is to classify whether a worker is using a stepladder safely based on specific safety guidelines (e.g., step position, orientation, and ladder stability).\n\n**The Problem:** I am facing a strange behavior: My **Training Loss converges to near zero (~0.001)** very quickly. However, when I run inference on the **exact same training images** to calculate metrics, the performance is extremely poor (Accuracy ~50%, with a heavy bias towards the “unsafe” class).\n\n**Dataset Format:** I reformatted my dataset so the Assistant outputs a single JSON string. I also provide the bounding box of the ladder in the User prompt to focus the model’s attention.\n\n{\n“messages”: [\n{ “role”: “system”, “content”: “You are a safety vision model… [Detailed Safety Rules]… Output JSON only.” },\n{ “role”: “user”, “content”: [\n{“type”: “image”, “image”: “<PIL.Image>”},\n{“type”: “text”, “text”: “Inspect the stepladder…”}\n]\n},\n{ “role”: “assistant”, “content”: [{“type”: “text”, “text”: “[{“id”: “0”, “label”: “unsafe”}]”}] }\n]\n}\n\nFramework & Environment:\n\n  * Training Tool: Unsloth Studio (Web UI)\n\n  * Base Model**:** Gemma-4 E2B it\n\n  * PEFT Method**:** LoRA (Fine-tuning both Vision and Language adapters)\n\n\n\n\nHas anyone encountered this “Zero Loss but Zero Performance” issue with Gemma VLM or similar models?\nPlease help me now i am so stuck",
  "title": "VLM Fine tuning: Near-Zero Training Loss but Poor Inference Accuracy on Train Set (Gemma 4 E2B It)"
}