SFTTrainerflags blocks assistant_only_loss=True
Hi @John6666 thank you so much for your truly valuable feedback as always. I really appreciate it!
Regarding the action items you mentioned:
- force the text/tokenizer path by explicitly passing the tokenizer as processing_class:
I’m doing that now. At first, though, I was not passing the tokenizer to SFTTrainer directly as a parameter. Instead, I was passing it only to the data_collator, which was then passed to the trainer:
from transformers import DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=data_collator,
    args=sft_config,
    peft_config=peft_config,
)
After I read your message, I also passed it to SFTTrainer as processing_class, and the error disappeared:
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    processing_class=tokenizer,
    data_collator=data_collator,
    args=sft_config,
    peft_config=peft_config,
)
I will inspect the output tomorrow to check whether any masking actually took place, and I’ll update you. Thanks!
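For that masking check, one rough approach is to count how many label positions in a batch carry the ignore value. This is only a sketch: the helper name masking_summary is made up for illustration, and it assumes that masked tokens are labeled -100 (the default ignore_index of PyTorch's cross-entropy loss) and that the collator in use returns a "labels" entry. If masking is applied, the non-assistant positions should show up as masked; if nothing beyond padding is masked, assistant_only_loss is probably not taking effect with the collator being used.

```python
# Assumption: masked positions are labeled -100, the default ignore_index
# of PyTorch's cross-entropy loss.
IGNORE_INDEX = -100


def masking_summary(labels):
    """Count masked vs. trained positions in one sequence of label ids."""
    labels = list(labels)
    masked = sum(1 for label in labels if label == IGNORE_INDEX)
    return {
        "total": len(labels),
        "masked": masked,
        "trained": len(labels) - masked,
    }


# Toy label sequence: prompt tokens masked, assistant tokens kept.
example_labels = [-100, -100, -100, 42, 43, 44, -100, -100, 50, 2]
summary = masking_summary(example_labels)
print(summary)  # {'total': 10, 'masked': 5, 'trained': 5}

# With a real trainer (assumes the batch contains a "labels" tensor):
# batch = next(iter(trainer.get_train_dataloader()))
# print(masking_summary(batch["labels"][0].tolist()))
```

Decoding only the positions whose label is not -100 with the tokenizer would additionally show whether the trained tokens are really the assistant turns.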