
Google made Gemma 4 models 3x faster with MTP Drafters

TestingCatalog - TestingCatalog - New and unreleased AI feature… May 6, 2026
What's new? Speculative decoding pairs a heavy main model with a light drafter that pre-generates tokens for the main model to verify. Gemma 4 models now run on consumer GPUs and edge devices.
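
To make the drafter idea concrete, here is a minimal sketch of greedy speculative decoding in Python. It is not Google's MTP drafter implementation: `target_next` and `draft_next` are hypothetical toy functions standing in for the heavy model and the light drafter, and the acceptance rule shown (keep the drafted prefix that matches the main model's own greedy choice) is the simplest variant, assumed here for illustration.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def target_next(ctx: List[Token]) -> Token:
    """Toy stand-in for the heavy main model (hypothetical, deterministic)."""
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_next(ctx: List[Token]) -> Token:
    """Toy stand-in for the light drafter: agrees with the main model
    except when the context length is a multiple of 4 (hypothetical)."""
    guess = target_next(ctx)
    return guess if len(ctx) % 4 else (guess + 1) % 50


def greedy_decode(model: NextToken, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one main-model step per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: NextToken,
    drafter: NextToken,
    prompt: List[Token],
    n_new: int,
    k: int = 4,
) -> Tuple[List[Token], int]:
    """Return the generated sequence and the number of verification rounds."""
    out = list(prompt)
    goal = len(prompt) + n_new
    rounds = 0
    while len(out) < goal:
        # 1. The drafter cheaply proposes k tokens ahead.
        draft: List[Token] = []
        for _ in range(k):
            draft.append(drafter(out + draft))

        # 2. The main model checks the proposals. A real system scores all
        #    k positions in one batched forward pass; here that pass is
        #    simulated position by position and counted as one round.
        rounds += 1
        accepted: List[Token] = []
        for tok in draft:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # correct the first wrong token
                break
            accepted.append(tok)
        else:
            # Every drafted token matched, so the same pass yields one more.
            accepted.append(target(out + accepted))
        out.extend(accepted)
    return out[:goal], rounds


if __name__ == "__main__":
    prompt = [1, 2, 3]
    tokens, rounds = speculative_decode(target_next, draft_next, prompt, n_new=20)
    print("same output as plain greedy decoding:",
          tokens == greedy_decode(target_next, prompt, 20))
    print("main-model rounds:", rounds, "for 20 new tokens")
```

Because every accepted token is one the main model would have chosen anyway, the output matches plain greedy decoding; any speedup comes from needing fewer main-model rounds than generated tokens, and it depends on how often the drafter guesses correctly.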
