Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibqcvflxsooqmuj7zcwnv5qdgvh3esqxsstooq3d3hdl4gov3nufa",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mjff2l4pe6i2"
  },
  "path": "/t/fine-tuning-gemma-4-e2b-on-macbook-m3/175228#post_1",
  "publishedAt": "2026-04-13T16:57:44.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Hi. I’m trying to fine-tune Gemma-4-E2B on a MacBook M3, but I haven’t been able to get it working. I had previously fine-tuned Llama and Qwen models with no issues, but Gemma-4 is presenting real challenges. Having resolved the linear-layer selection issue and the tokenizer chat template issue, I’m now stuck with training loss values above 40 that refuse to decrease. Some sources I found blame bfloat16 support on MacBooks. I tried float16 and even float32, but the model exploded after a few epochs.\n\nIs there a list of recommended settings or gotchas for Gemma-4-E2B/E4B, or a guide that helps overcome these issues? Any guidance would be truly appreciated!",
  "title": "Fine-tuning Gemma-4-E2B on MacBook M3"
}