VBS-NN: 512k Context on a 12GB GPU (Linear O(N) Byte-Stream)
Hugging Face Forums [Unofficial]
May 18, 2026
Hello. I developed VertexByteStream (VBS-NN), a tokenizer-free hierarchical architecture (it can also be used with a tokenizer) that scales linearly, O(N), with context length. It passed needle-in-a-haystack stress tests at context lengths up to 512k on consumer hardware.

* Metrics: VRAM stays under 11.7 GB at 512k context (tested on RTX 3060 12GB / RX 6700 XT). The current prototype uses d_model = 64.
* Reproducibility: The core code is proprietary. Pre-compiled binaries and CUDA/ROCm Docker environments are provided to verify the VRAM metrics.

The project repository, runners, and raw evaluation logs are available on GitHub: ega4l/VBS-NN (VertexByteStream Neural Network).
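For readers who have not seen a byte-level needle-in-a-haystack input before, here is a minimal sketch of how one could be built. It is not the VBS-NN evaluation harness, which the post does not describe. The function name `build_haystack`, the filler alphabet, the needle text, and the reading of "512k" as 512 * 1024 bytes are all assumptions made for illustration.

```python
import random


def build_haystack(context_len, needle, depth, seed=0):
    """Build a byte-level needle-in-a-haystack input (hypothetical helper).

    Returns (byte_ids, offset): a list of integers in 0..255 of length
    context_len, and the position where the needle starts.
    depth is the relative insertion point, from 0.0 (start) to 1.0 (end).
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    if len(needle) > context_len:
        raise ValueError("needle is longer than the context")

    rng = random.Random(seed)
    # Filler uses lowercase letters and spaces only, so a needle that
    # contains uppercase letters or digits cannot appear in it by chance.
    alphabet = b"abcdefghijklmnopqrstuvwxyz "
    filler_len = context_len - len(needle)
    filler = bytes(rng.choice(alphabet) for _ in range(filler_len))

    offset = int(depth * filler_len)
    stream = filler[:offset] + needle + filler[offset:]
    # Tokenizer-free input: one integer ID per raw byte.
    return list(stream), offset


# Assumption: "512k" is read here as 512 * 1024 byte positions.
CONTEXT_LEN = 512 * 1024
NEEDLE = b"The magic number is 48151623."

byte_ids, offset = build_haystack(CONTEXT_LEN, NEEDLE, depth=0.5)
recovered = bytes(byte_ids[offset:offset + len(NEEDLE)])
```

Sweeping `depth` across several values and checking whether the model's answer reproduces the needle gives the usual retrieval-by-position picture. How the answer is obtained from the provided binaries depends on their interface, which is not covered here.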