Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicjd6fqwc5ntrko2bgd4ag5gzjtpgkxs3kqawoe7vnfkc6me4glke",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmfljgm5fnu2"
  },
  "path": "/t/automatic-100-masking-of-the-questions-in-labels/176151#post_2",
  "publishedAt": "2026-05-21T22:14:20.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "TRL SFTTrainer docs — dataset formats, assistant-only loss, completion-only loss",
    "TRL SFTTrainer source — collator routing, dataset_text_field, completion_only_loss, masks",
    "Transformers causal language modeling docs — DataCollatorForLanguageModeling(..., mlm=False)",
    "TRL discussion — DataCollatorForCompletionOnlyLM removed; use SFTConfig(completion_only_loss=True) with prompt-completion data",
    "TRL discussion: DataCollatorForCompletionOnlyLM removed",
    "TRL docs — train on completion only",
    "TRL docs — train on assistant messages only",
    "TRL SFTTrainer source"
  ],
  "textContent": "Seems expected behaviour?\n\n* * *\n\n# Automatic `-100` masking of question/user tokens in TRL SFT: `messages` vs `prompt`/`completion`\n\n## Summary\n\nThe behavior you are seeing is expected.\n\n`transformers.DataCollatorForLanguageModeling` does **not** automatically mask the user/question part of a chat example. It is not role-aware. It does not know what a `system`, `user`, or `assistant` message is. For ordinary causal language modeling with `mlm=False`, it prepares next-token-prediction labels from the input sequence, with padding ignored. In other words, if nothing else intervenes, the usual pattern is roughly:\n\n\n    labels = input_ids.copy()\n    labels[padding_positions] = -100\n\n\nThat is **padding masking** , not **prompt/question masking**.\n\nThe current TRL replacement for the old `DataCollatorForCompletionOnlyLM` path is not “use the Transformers LM collator and hope it infers the answer span.” The current route is:\n\n  * use `SFTTrainer`\n  * use the correct dataset format\n  * use the correct `SFTConfig` loss option\n  * inspect one real batch before training\n\n\n\nThe key distinction is:\n\n\n    {\"messages\": [...]}                 -> conversational language-modeling dataset\n    {\"prompt\": ..., \"completion\": ...}   -> prompt-completion dataset\n\n\nSo, for your current `messages: system-user-assistant` dataset, `completion_only_loss=True` is not the right mental model unless you first convert the examples to a real `prompt` + `completion` format.\n\nRelevant docs / source:\n\n  * TRL SFTTrainer docs — dataset formats, assistant-only loss, completion-only loss\n  * TRL SFTTrainer source — collator routing, dataset_text_field, completion_only_loss, masks\n  * Transformers causal language modeling docs — DataCollatorForLanguageModeling(..., mlm=False)\n  * TRL discussion — DataCollatorForCompletionOnlyLM removed; use SFTConfig(completion_only_loss=True) with prompt-completion data\n\n\n\n* * *\n\n## What went wrong 
conceptually\n\nYou currently have data shaped like this:\n\n\n    {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"...\"},\n            {\"role\": \"user\", \"content\": \"...\"},\n            {\"role\": \"assistant\", \"content\": \"...\"},\n        ]\n    }\n\n\nThat is a **conversational dataset**.\n\nIt tells TRL:\n\n\n    This is a conversation. Apply the model's chat template.\n\n\nIt does **not** automatically mean:\n\n\n    system/user tokens -> labels = -100\n    assistant tokens   -> labels = token ids\n\n\nThat extra masking requires either:\n\n  1. a prompt-completion boundary, using `{\"prompt\": ..., \"completion\": ...}` plus `completion_only_loss=True`, or\n  2. assistant-span masks, using `{\"messages\": [...]}` plus `assistant_only_loss=True`, if the chat template supports assistant masks.\n\n\n\n* * *\n\n## Why your printed labels have no `-100`\n\nYou printed something like:\n\n\n    labels = tensor([\n        248045,   8678,    198,   2523,    513,    264,  10631,   1558,    421,\n          5529,   2708,    321,  61446,  10926,     13, 248046,    198, 248045,\n           846,    198,   4199,   1599,  12417,   1467,  68677,   3222,    506,\n         18773,     30, 248046,    198, 248045,  74455,    198, 248068,    271,\n        248069,    271,  11280,  12417,     25,   7884,  24093,  10254,    318,\n         30425,    220,     17,     15,     15,     18,    310,   6443,    220,\n            17,     15,     15,     19,    553, 248046,    198\n    ])\n\n\nThis looks like full-sequence causal-LM labels: the labels are token IDs across the whole rendered chat sequence. 
That usually means the trainer/collator path is training on the full conversation, not just the assistant answer.\n\nConceptually, if the rendered input is:\n\n\n    <system>\n    You are a helpful assistant.\n    </system>\n    <user>\n    What is the answer?\n    </user>\n    <assistant>\n    The answer is ...\n    </assistant>\n\n\nordinary causal-LM labeling supervises the whole sequence:\n\n\n    <system>     -> supervised\n    system text  -> supervised\n    <user>       -> supervised\n    question     -> supervised\n    <assistant>  -> supervised\n    answer       -> supervised\n\n\nBut what you expected was:\n\n\n    <system>     -> -100\n    system text  -> -100\n    <user>       -> -100\n    question     -> -100\n    <assistant>  -> maybe supervised, maybe ignored depending on template\n    answer       -> supervised\n\n\nThat second behavior is **not** provided by `transformers.DataCollatorForLanguageModeling`.\n\n* * *\n\n## The first model’s advice: what was right and what was wrong\n\n### Claim 1\n\n> Use `from trl import DataCollatorForCompletionOnlyLM`; it handles `-100` masking automatically.\n\nThis was historically plausible advice for older TRL examples, but it is not the current path. The class has been removed in current TRL versions. A TRL maintainer explicitly recommends using `completion_only_loss=True` in `SFTConfig` with a prompt-completion dataset instead:\n\n  * TRL discussion: DataCollatorForCompletionOnlyLM removed\n\n\n\nSo the correction was valid: importing `DataCollatorForCompletionOnlyLM` is no longer the right solution.\n\n### Claim 2\n\n> Use `transformers.DataCollatorForLanguageModeling` with a prompt-completion dataset and `completion_only_loss=True`; that will take care of the masking.\n\nThis is partly right but imprecise enough to be misleading.\n\nThe precise version is:\n\n> Use **TRL `SFTTrainer`** with a **prompt-completion dataset** and `SFTConfig(completion_only_loss=True)`. 
Do not rely on `transformers.DataCollatorForLanguageModeling` itself to mask user/question tokens.\n\n`completion_only_loss=True` is a TRL SFT setting. It is not a setting of `transformers.DataCollatorForLanguageModeling`.\n\n* * *\n\n## The two correct ways to get response-only masking\n\n## Option A — Recommended for your case: convert `messages` to `prompt` + `completion`\n\nFor examples that are basically:\n\n\n    system + user question -> assistant answer\n\n\nthe cleanest format is:\n\n\n    {\n        \"prompt\": [\n            {\"role\": \"system\", \"content\": \"SYSTEM\"},\n            {\"role\": \"user\", \"content\": \"QUESTION\"},\n        ],\n        \"completion\": [\n            {\"role\": \"assistant\", \"content\": \"ANSWER\"},\n        ],\n    }\n\n\nThen use:\n\n\n    SFTConfig(completion_only_loss=True)\n\n\nThe TRL docs describe completion-only training as the path for prompt-completion datasets:\n\n  * TRL docs — train on completion only\n\n\n\n### Conversion code\n\n\n    from trl import SFTConfig, SFTTrainer\n\n\n    def to_prompt_completion(example):\n        messages = example[\"messages\"]\n\n        if not isinstance(messages, list):\n            raise TypeError(\"Expected example['messages'] to be a list.\")\n\n        if len(messages) < 2:\n            raise ValueError(\"Expected at least one prompt message and one assistant response.\")\n\n        if messages[-1][\"role\"] != \"assistant\":\n            raise ValueError(\n                f\"Expected final message to be assistant, got {messages[-1]['role']!r}.\"\n            )\n\n        return {\n            \"prompt\": messages[:-1],\n            \"completion\": [messages[-1]],\n        }\n\n\n    train_dataset = train_dataset.map(\n        to_prompt_completion,\n        remove_columns=train_dataset.column_names,\n    )\n\n    args = SFTConfig(\n        output_dir=\"sft-out\",\n        completion_only_loss=True,\n        packing=False,        # keep False until masking is 
verified\n        max_length=2048,      # adjust for your model/data\n    )\n\n    trainer = SFTTrainer(\n        model=model,\n        args=args,\n        train_dataset=train_dataset,\n        processing_class=tokenizer,\n    )\n\n\nThis is the option I would use for your described dataset.\n\nWhy? Because your real training objective is:\n\n\n    Given system + user, learn to generate assistant.\n\n\nThat is a prompt-completion objective.\n\n* * *\n\n## Option B — Keep `messages`, but use `assistant_only_loss=True`\n\nIf you want to keep the original conversational format:\n\n\n    {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"...\"},\n            {\"role\": \"user\", \"content\": \"...\"},\n            {\"role\": \"assistant\", \"content\": \"...\"},\n        ]\n    }\n\n\nthen the relevant TRL setting is:\n\n\n    SFTConfig(assistant_only_loss=True)\n\n\nThe TRL docs describe this as training only on assistant messages while ignoring user/system messages:\n\n  * TRL docs — train on assistant messages only\n\n\n\nHowever, there is an important caveat: assistant-only loss requires the chat template to support assistant-token masks using `{% generation %}` and `{% endgeneration %}` markers. 
TRL can patch some known model-family templates, but for other models you should check.\n\n### Template check\n\n\n    template = tokenizer.chat_template or \"\"\n\n    print(\"has generation start:\", \"{% generation %}\" in template)\n    print(\"has generation end:  \", \"{% endgeneration %}\" in template)\n\n\nIf either check prints `False`, do not blindly trust `assistant_only_loss=True`.\n\nThis is why, for simple one-turn `system-user-assistant` examples, I prefer converting to `prompt` + `completion`: it makes the boundary explicit and is easier to audit.\n\n* * *\n\n## Which option should you choose?\n\nFor your case, I would choose **Option A**:\n\n\n    {\"prompt\": messages[:-1], \"completion\": [messages[-1]]}\n\n\nwith:\n\n\n    SFTConfig(completion_only_loss=True)\n\n\nUse **Option B** only if you have genuine multi-turn conversations and want to train on every assistant turn.\n\nExample where `assistant_only_loss=True` makes sense:\n\n\n    {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"...\"},\n            {\"role\": \"user\", \"content\": \"Q1\"},\n            {\"role\": \"assistant\", \"content\": \"A1\"},\n            {\"role\": \"user\", \"content\": \"Q2\"},\n            {\"role\": \"assistant\", \"content\": \"A2\"},\n        ]\n    }\n\n\nIf you want to train only on the final assistant turn, even in a multi-turn conversation, `prompt` + `completion` is still clearer:\n\n\n    {\n        \"prompt\": [\n            {\"role\": \"system\", \"content\": \"...\"},\n            {\"role\": \"user\", \"content\": \"Q1\"},\n            {\"role\": \"assistant\", \"content\": \"A1\"},\n            {\"role\": \"user\", \"content\": \"Q2\"},\n        ],\n        \"completion\": [\n            {\"role\": \"assistant\", \"content\": \"A2\"},\n        ],\n    }\n\n\n* * *\n\n## Does `dataset_text_field=\"messages\"` help?\n\nNo, not in the way you are asking.\n\n`dataset_text_field` is not a response-mask setting. 
It does not mean:\n\n\n    system/user -> -100\n    assistant   -> train\n\n\nIt is for identifying the text column in standard text datasets. In the TRL source, `dataset_text_field` is described as relevant for standard dataset format, and the collator routing distinguishes language-modeling examples from prompt-completion examples:\n\n  * examples with `\"messages\"` or the configured text field go through the language-modeling path\n  * examples with `\"prompt\"` and `\"completion\"` go through the prompt-completion path\n\n\n\nRelevant source:\n\n  * TRL SFTTrainer source\n\n\n\nSo this:\n\n\n    dataset_text_field=\"messages\"\n\n\ndoes **not** turn a `messages` column into a prompt-completion dataset. It also does **not** request assistant-only masking.\n\nUse these mental models instead:\n\n\n    # Plain text language modeling\n    {\"text\": \"...\"}\n    SFTConfig(dataset_text_field=\"text\")\n\n\n\n    # Conversational assistant-only SFT\n    {\"messages\": [...]}\n    SFTConfig(assistant_only_loss=True)\n\n\n\n    # Completion-only SFT\n    {\"prompt\": [...], \"completion\": [...]}\n    SFTConfig(completion_only_loss=True)\n\n\n* * *\n\n## Why `completion_only_loss=True` is not enough for plain `messages`\n\n`completion_only_loss=True` requires a completion boundary.\n\nA plain `messages` row has roles, but it does not explicitly split the example into:\n\n\n    prompt part\n    completion part\n\n\nA prompt-completion row does:\n\n\n    {\n        \"prompt\": [...],\n        \"completion\": [...],\n    }\n\n\nThat explicit boundary is what lets TRL create a completion mask.\n\nSo this is not enough:\n\n\n    {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"...\"},\n            {\"role\": \"user\", \"content\": \"...\"},\n            {\"role\": \"assistant\", \"content\": \"...\"},\n        ]\n    }\n\n\nwith:\n\n\n    SFTConfig(completion_only_loss=True)\n\n\nFor this dataset shape, use either:\n\n\n    
SFTConfig(assistant_only_loss=True)\n\n\nor convert to:\n\n\n    {\n        \"prompt\": messages[:-1],\n        \"completion\": [messages[-1]],\n    }\n\n\nand then use:\n\n\n    SFTConfig(completion_only_loss=True)\n\n\n* * *\n\n## Do not pass the wrong collator\n\nIf you want TRL SFT masking, I would not pass this:\n\n\n    from transformers import DataCollatorForLanguageModeling\n\n    data_collator = DataCollatorForLanguageModeling(\n        tokenizer=tokenizer,\n        mlm=False,\n    )\n\n\nThat collator is for ordinary causal LM. It is not the mechanism that masks user/question tokens.\n\nPrefer:\n\n\n    trainer = SFTTrainer(\n        model=model,\n        args=SFTConfig(completion_only_loss=True),\n        train_dataset=prompt_completion_dataset,\n        processing_class=tokenizer,\n    )\n\n\nand let TRL’s SFT pipeline create/apply the relevant masks.\n\nThe TRL source shows the important mechanism: labels are initially cloned from `input_ids`, padding positions are set to `-100`, and then completion masks / assistant masks can further set non-target labels to `-100`.\n\nRelevant source:\n\n  * TRL SFTTrainer source\n\n\n\n* * *\n\n## Important caveat: `formatting_func` can remove the boundary\n\nBe careful with `formatting_func`.\n\nIf you convert your examples to `prompt` + `completion`, but then pass a `formatting_func` that renders everything into one string, you may accidentally turn the dataset back into ordinary language modeling.\n\nRisky pattern:\n\n\n    def formatting_func(example):\n        return render_prompt_and_answer_as_one_string(example)\n\n    trainer = SFTTrainer(\n        model=model,\n        args=SFTConfig(completion_only_loss=True),\n        train_dataset=prompt_completion_dataset,\n        formatting_func=formatting_func,\n        processing_class=tokenizer,\n    )\n\n\nSafer pattern:\n\n\n    trainer = SFTTrainer(\n        model=model,\n        args=SFTConfig(completion_only_loss=True),\n        
train_dataset=prompt_completion_dataset,\n        processing_class=tokenizer,\n    )\n\n\nLet the trainer see the structured `prompt` and `completion` fields.\n\n* * *\n\n## How to verify that masking is correct\n\nDo not only print the numeric tensor. Decode the supervised span.\n\n### Helper 1 — Decode the full input and supervised labels\n\n\n    def inspect_supervised_span(trainer, tokenizer, row=0):\n        batch = next(iter(trainer.get_train_dataloader()))\n\n        input_ids = batch[\"input_ids\"][row].tolist()\n        labels = batch[\"labels\"][row].tolist()\n\n        supervised_ids = [\n            token_id\n            for token_id, label_id in zip(input_ids, labels)\n            if label_id != -100\n        ]\n\n        ignored_count = sum(label_id == -100 for label_id in labels)\n        supervised_count = sum(label_id != -100 for label_id in labels)\n\n        print(\"=\" * 100)\n        print(\"FULL INPUT\")\n        print(tokenizer.decode(input_ids, skip_special_tokens=False))\n\n        print(\"=\" * 100)\n        print(\"SUPERVISED LABEL SPAN\")\n        print(tokenizer.decode(supervised_ids, skip_special_tokens=False))\n\n        print(\"=\" * 100)\n        print(\"COUNTS\")\n        print(\"total tokens:     \", len(input_ids))\n        print(\"ignored tokens:   \", ignored_count)\n        print(\"supervised tokens:\", supervised_count)\n\n\nGood result:\n\n\n    FULL INPUT\n    <system>...<user>QUESTION<assistant>ANSWER...\n\n    SUPERVISED LABEL SPAN\n    <assistant>ANSWER...\n\n\nAlso possibly good, depending on template boundary:\n\n\n    SUPERVISED LABEL SPAN\n    ANSWER...\n\n\nBad result:\n\n\n    SUPERVISED LABEL SPAN\n    <system>...<user>QUESTION<assistant>ANSWER...\n\n\nThat means you are still training on the question.\n\n### Helper 2 — Token-by-token label table\n\n\n    def print_label_table(trainer, tokenizer, n_tokens=200, row=0):\n        batch = next(iter(trainer.get_train_dataloader()))\n\n        input_ids = 
batch[\"input_ids\"][row].tolist()\n        labels = batch[\"labels\"][row].tolist()\n\n        for i, (token_id, label_id) in enumerate(zip(input_ids, labels)):\n            if i >= n_tokens:\n                break\n\n            token = tokenizer.decode([token_id], skip_special_tokens=False)\n            status = \"LOSS\" if label_id != -100 else \"IGNORED\"\n\n            print(\n                f\"{i:04d} | {status:7s} | \"\n                f\"input_id={token_id:8d} | label={label_id:8d} | {token!r}\"\n            )\n\n\nExpected pattern:\n\n\n    0000 | IGNORED | system marker\n    0001 | IGNORED | system text\n    ...\n    0015 | IGNORED | user marker\n    0016 | IGNORED | user question\n    ...\n    0030 | LOSS    | assistant marker or answer\n    0031 | LOSS    | answer text\n\n\nBad pattern:\n\n\n    0000 | LOSS | system marker\n    ...\n    0015 | LOSS | user marker\n    0016 | LOSS | user question\n\n\n* * *\n\n## Should the assistant role marker be supervised?\n\nYou may see either of these after conversion:\n\n\n    SUPERVISED LABEL SPAN\n    <assistant>ANSWER<end>\n\n\nor:\n\n\n    SUPERVISED LABEL SPAN\n    ANSWER<end>\n\n\nBoth can be defensible.\n\nUsually acceptable:\n\n\n    <assistant>ANSWER<end>\n\n\nThis teaches the model to produce a full assistant turn in the model’s chat format.\n\nAlso acceptable:\n\n\n    ANSWER<end>\n\n\nThis treats the assistant-start marker as part of the prompt and supervises only the answer body.\n\nUsually not what you want:\n\n\n    <system>...<user>QUESTION<assistant>ANSWER<end>\n\n\nThat trains on the system/user prompt too.\n\nThe key thing is that the user/question tokens should be `-100` if your goal is response-only SFT.\n\n* * *\n\n## What `-100` means here\n\n`-100` is the ignore index commonly used by PyTorch/Hugging Face causal-LM training. 
In this context:\n\n\n    label = token id -> compute loss\n    label = -100     -> ignore this token position in the loss\n\n\nIt does **not** mean the token is hidden from the model.\n\nThe ideal response-only SFT setup is:\n\n\n    system/user tokens:\n      visible to the model as context\n      ignored in the loss\n\n    assistant tokens:\n      visible to the model\n      included in the loss\n\n\nThis is exactly what you usually want for instruction tuning: the model conditions on the prompt, but it is not trained to reproduce the prompt.\n\n* * *\n\n## Practical final answer\n\nThere is no Transformers `DataCollatorForLanguageModeling` setting that says “mask user/question tokens.” That collator is not chat-role-aware.\n\nThere are TRL SFT settings that can do the masking, but they depend on dataset format:\n\n  1. For a `messages` conversational dataset, use:\n\n\n\n\n    SFTConfig(assistant_only_loss=True)\n\n\nbut verify that the chat template supports assistant masks.\n\n  2. 
For a prompt-completion dataset, use:\n\n\n\n\n    SFTConfig(completion_only_loss=True)\n\n\nwith examples shaped like:\n\n\n    {\n        \"prompt\": [\n            {\"role\": \"system\", \"content\": \"...\"},\n            {\"role\": \"user\", \"content\": \"...\"},\n        ],\n        \"completion\": [\n            {\"role\": \"assistant\", \"content\": \"...\"},\n        ],\n    }\n\n\nFor your `system-user-assistant` examples, I would convert to `prompt` + `completion` and use `completion_only_loss=True`.\n\n* * *\n\n## Minimal working pattern for your case\n\n\n    from trl import SFTConfig, SFTTrainer\n\n\n    def to_prompt_completion(example):\n        messages = example[\"messages\"]\n\n        if messages[-1][\"role\"] != \"assistant\":\n            raise ValueError(\"Expected the final message to be the assistant answer.\")\n\n        return {\n            \"prompt\": messages[:-1],\n            \"completion\": [messages[-1]],\n        }\n\n\n    train_dataset = train_dataset.map(\n        to_prompt_completion,\n        remove_columns=train_dataset.column_names,\n    )\n\n    args = SFTConfig(\n        output_dir=\"sft-out\",\n        completion_only_loss=True,\n        packing=False,\n        max_length=2048,\n    )\n\n    trainer = SFTTrainer(\n        model=model,\n        args=args,\n        train_dataset=train_dataset,\n        processing_class=tokenizer,\n    )\n\n    inspect_supervised_span(trainer, tokenizer)\n\n\nOnly after the supervised span is correct:\n\n\n    trainer.train()\n\n\n* * *\n\n## Checklist\n\nUse this checklist before training:\n\n  * Do I want to train only on the final assistant answer?\n\n    * Use `prompt` + `completion`.\n    * Use `completion_only_loss=True`.\n  * Do I want to train on all assistant turns in a multi-turn conversation?\n\n    * Keep `messages`.\n    * Use `assistant_only_loss=True`.\n    * Verify the chat template supports `{% generation %}` / `{% endgeneration %}`.\n  * Am I using 
`transformers.DataCollatorForLanguageModeling` and expecting question masking?\n\n    * That is the wrong expectation.\n  * Am I setting `dataset_text_field=\"messages\"` and expecting role masking?\n\n    * That is not what `dataset_text_field` does.\n  * Did I decode `labels != -100` from a real dataloader batch?\n\n    * If not, do that before training.\n\n\n\n* * *\n\n## Bottom line\n\nYour printed tensor shows ordinary full-sequence causal-LM behavior.\n\nThe fix is not a hidden `DataCollatorForLanguageModeling` setting.\n\nFor your case, the clearest fix is:\n\n\n    messages -> prompt/completion\n\n\nthen:\n\n\n    SFTConfig(completion_only_loss=True)\n\n\nand verify with:\n\n\n    decode(input_ids where labels != -100)\n\n\n`dataset_text_field=\"messages\"` does not make `messages` behave like `prompt` + `completion`, and it does not create automatic user/question masking.",
  "title": "Automatic -100 masking of the questions in Labels"
}