Removing the embedding from my embedding: a byte transformer with a 0-parameter input layer (25M, single RTX 4070)
Hugging Face Forums [Unofficial] · Jun 13 · 12 min read
Shaham & Levy, Neural Machine Translation without Embeddings · CANINE · Charformer · MEGABYTE