{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiei7nktubs56fjr3xt3l575urd2rr7wj2xaawzerjuk3ypvp5dbl4",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmto2dp4bqn2"
},
"path": "/t/accidental-attention-anchoring-repeated-phrase-in-sft-dataset-drastically-improved-context-adherence/176136#post_3",
"publishedAt": "2026-05-27T12:15:37.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "Regarding what I mentioned earlier: Google Gemini itself inserted the phrase, apparently trying to ‘help’, for reasons unknown to me. It didn’t mask the phrase or do anything else with it; it was simply there, and I wasn’t aware of it. When I later noticed the model repeating it, I didn’t understand why, but I accepted it. Upon reviewing the training chats, I saw the phrase repeated in every single line, so the model had effectively been trained to learn and repeat that specific phrase. I therefore abandoned that approach and moved on to other methods that do not use anchor phrases. Regardless, I am very grateful for what you’ve shared; it has been a huge lesson in LM engineering, and thank you very much for all the links.",
"title": "Accidental Attention Anchoring? Repeated phrase in SFT dataset drastically improved context adherence"
}