Train and Evaluation loss drop between epochs
Hugging Face Forums [Unofficial]
March 19, 2026
Hi everyone.
I’m training a LoRA adapter for a model using the SFT trainer, and I have also configured an evaluation set. In my case, the train and evaluation sets are completely unrelated.
What I observe is a drop in loss between epochs. That makes perfect sense for the train loss: the model has already seen the examples, so the loss is expected to drop. But I can’t reason about why there is a drop in the evaluation loss, since the examples are unrelated and there is no sign of contamination.
Any ideas and advice would be highly appreciated, thank you in advance.
Adding the train/eval loss from W&B. I’ve run 2 training epochs in this session on 4 GPUs on the same node.
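One way to double-check the "no sign of contamination" part is an n-gram overlap test between the two sets. Below is a minimal sketch, assuming both sets are available as lists of plain strings. The function names (`ngrams`, `contamination_rate`) and the default window of 8 whitespace-separated tokens are illustrative choices, not part of the actual training setup or of any library.

```python
from typing import Iterable, Sequence, Set, Tuple


def ngrams(text: str, n: int = 8) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase whitespace-token n-grams in `text`."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_rate(
    train_texts: Iterable[str],
    eval_texts: Sequence[str],
    n: int = 8,
) -> float:
    """Fraction of eval examples sharing at least one n-gram with the train set.

    Hypothetical helper: the n-gram size and whitespace tokenization are
    assumptions, and exact n-gram matching will miss paraphrased overlap.
    """
    train_ngrams: Set[Tuple[str, ...]] = set()
    for text in train_texts:
        train_ngrams |= ngrams(text, n)

    if not eval_texts:
        return 0.0

    # An eval example is flagged if any of its n-grams also occurs in train.
    flagged = sum(1 for text in eval_texts if ngrams(text, n) & train_ngrams)
    return flagged / len(eval_texts)
```

A rate of 0.0 only rules out verbatim overlap at the chosen window size. It says nothing about shared formatting, such as a common chat template or prompt structure, which both sets may still have in common.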