Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicgmugjc3ckn7rcq4hv24q3lcrtw2jaxktjiml2x4nzqqeez5euzi",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3miix2rghjiz2"
  },
  "path": "/t/bug-in-google-colab-assemble-everything-pytorch/174892#post_1",
  "publishedAt": "2026-04-02T09:19:33.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "section6_pt.ipynb - Colab"
  ],
  "textContent": "Description:\n\n  * Notebook link: section6_pt.ipynb - Colab\n\n  * Error: When executing the 2nd cell of the notebook, the following error is raised:\n\n  * TypeError                                 Traceback (most recent call last)\n\n\n\n        /tmp/ipykernel_6268/743095204.py in <cell line: 0>()\n              2\n              3 checkpoint = \"tblard/tf-allocine\"\n        ----> 4 tokenizer = AutoTokenizer.from_pretrained(checkpoint)\n              5\n              6 sequence = \"J'ai attendu un cours d’HuggingFace toute ma vie.\"\n\n\n\n\n* * *\n\n3 frames\n\n* * *\n\n        /usr/local/lib/python3.12/dist-packages/transformers/models/camembert/tokenization_camembert.py in __init__(self, bos_token, eos_token, sep_token, cls_token, unk_token, pad_token, mask_token, additional_special_tokens, add_prefix_space, vocab_file, vocab, **kwargs)\n            117             self._vocab = vocab\n            118             unk_index = next((i for i, (tok, _) in enumerate(self._vocab) if tok == str(unk_token)), 0)\n        --> 119             self._tokenizer = Tokenizer(Unigram(self._vocab, unk_id=unk_index, byte_fallback=False))\n            120         else:\n            121             self._vocab = [\n\n\n\n\n        TypeError: argument 'vocab': 'str' object cannot be converted to 'PyTuple'\n\n\n  * Model concerned: tblard/tf-allocine\n\n\n\n\nCould the notebook be corrected so that the code can be run and tested without errors?",
  "title": "Bug in Google Colab Assemble Everything (PyTorch)"
}