Fine-tuning Gemma-4-E2B on MacBook M3
Hi @John6666 thank you so much for your amazing support and feedback and especially the time you take to provide comprehensive complete answers. I’m truly grateful for your support!
To give you some feedback regarding few of the points you mentioned:
If your pipeline already needed chat-template fixes: Yes I had to use a custom chat-template that includes {% generation %}.
So if your current training stack is text-only but uses a generic text collator: I found out that in the latest trl, data collator was deprecated in favor of assistant_only_loss with the customer chat-template that includes {% generation %}.
It also loads the tokenizer from google/gemma-4-E2B-it: I tried but it didn’t work. The tokenizer from google/gemma-4-E2B-it seems to be designed for inference only and does not include the right chat-template required for fine-tuning. I had to use the tokenizer from google/gemma-4-E2B and add a chat-template manually.
For a MacBook M3, that means LoRA should be your baseline, not full fine-tuning : Indeed I’m using LoRa.
Use the official chat template only: I clarified above, the official google/gemma-4-E2B-it chat-template doesn’t seem to work for fine-tuning and google/gemma-4-E2B doesn’t include a chat-template. So manual adding one is required.
Keep the dataset in standard system / user / assistant roles: Indeed I did away with the multi-modal dataset format and opted for a simple format as follows {“messages”: [{“role”: “user”, “content”: “…”}, {“role”: “assistant”, “content”: “…”}]}
Inspect the first collated batch. Check that you have input_ids, attention_mask, labels, and, if your path requires them, token_type_ids and mm_token_type_ids: My batch output is as shown below:
Batch Keys: dict_keys(['input_ids', 'labels', 'attention_mask'])
--- Decoded Input IDs --- user What products does the solution support? model Meetings, Chat, Docs, Notes, Workflows, Videos.
--- Labels (Tokens the model is trained to predict) --- Meetings, Chat, Docs, Notes, Workflows, Videos.<
TRL vs Unsloth for your case: Indeed I’m using trl.
I hope this helps give you feedback. Thank you so much!
Discussion in the ATmosphere