Removing the embedding from my embedding: a byte transformer with a 0-parameter input layer (25M, single RTX 4070) · Hugging Face Forums [Unofficial] · 5d ago · 12 min read · Shaham & Levy, Neural Machine Translation without Embeddings · CANINE · Charformer · MEGABYTE
Removing the embedding from my embedding: a byte transformer with a 0-parameter input layer (25M, single RTX 4070) · Hugging Face Forums [Unofficial] · Jun 13 · 11 min read · Shaham & Levy, Neural Machine Translation without Embeddings · CANINE · Charformer
HoLo/HSL: a 100M change-rate-based multimodal toy model on a single RTX 4070 · Hugging Face Forums [Unofficial] · Jun 9 · 9 min read · ByT5 · MEGABYTE · Byte Latent Transformer / BLT