Fine-tuning microsoft/harrier-oss-v1-270m with SentenceTransformerTrainer — is it supported?

Hugging Face Forums [Unofficial] May 12, 2026

I recently fine-tuned BAAI/bge-m3 for a Portuguese QA retrieval task using SentenceTransformerTrainer with MultipleNegativesRankingLoss, and it works well.

I’d now like to try microsoft/harrier-oss-v1-270m as the base model, since it reports better results than bge-m3 on Multilingual MTEB v2. The model card confirms it is compatible with SentenceTransformers, so that part is clear.

However, I have some questions specific to fine-tuning this model:

  1. The model card states that queries should include a task instruction (e.g. Instruct: ... Query: ...) but documents should not. When fine-tuning with MultipleNegativesRankingLoss, should the instruction prefix be applied to the anchor texts during training, or only at inference?
  2. Are there any known challenges or recommended adaptations when fine-tuning decoder-only embedding models with SentenceTransformers, compared to encoder-based models like BGE-M3?
  3. Are there any recommended starting hyperparameters (learning rate, batch size) for this architecture?
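
To make question 1 concrete, here is a minimal sketch of the two training setups I am comparing: anchors with the instruction prefix and anchors without it. The helper names, the task description, and the exact template string are my own placeholders, not taken from the model card; the real template (including spacing or newlines) would need to be copied from there.

```python
# Hypothetical helper: the template below is an assumption for illustration.
# Copy the exact format (spacing, newlines) from the model card before use.
def add_instruction(query: str, task: str) -> str:
    return f"Instruct: {task} Query: {query}"


def build_pairs(rows, task=None):
    """Build (anchor, positive) rows; prefix only the anchor when a task is given."""
    pairs = []
    for question, passage in rows:
        anchor = add_instruction(question, task) if task else question
        # Documents are left untouched, as the model card says they take no prefix.
        pairs.append({"anchor": anchor, "positive": passage})
    return pairs


rows = [("Qual é a capital de Portugal?", "Lisboa é a capital de Portugal.")]
TASK = "Given a question, retrieve passages that answer it"  # hypothetical wording

with_prefix = build_pairs(rows, task=TASK)   # option A: prefix during training
without_prefix = build_pairs(rows)           # option B: prefix only at inference
```

Either list could then be turned into a training dataset (for example with datasets.Dataset.from_list) and passed to SentenceTransformerTrainer with MultipleNegativesRankingLoss. I believe SentenceTransformerTrainingArguments also has a prompts option that can attach a prefix per column, but I have not verified how it behaves with this model.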

Any guidance or pointers to examples would be appreciated.
