{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreicln7pt6nxlk6gunjbnddno5umkocjl3ec7yrbxqlffpxiq3n7kv4",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mm4abd6jnbs2"
},
"path": "/t/vbs-nn-512k-context-on-a-12gb-gpu-linear-o-n-byte-stream/176081#post_1",
"publishedAt": "2026-05-18T00:22:06.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"GitHub - ega4l/VBS-NN: VertexByteStream Neural Network (VBS-NN) · GitHub"
],
"textContent": "Hello. I developed **VertexByteStream (VBS-NN)**, a hierarchical byte-stream architecture that is tokenizer-free by default (a tokenizer can optionally be used) and scales linearly, O(N). It passed needle-in-a-haystack stress tests up to **512k context length** on consumer hardware. * **Metrics:** VRAM stays under 11.7 GB at 512k context (tested on RTX 3060 12GB / RX 6700XT). The current prototype uses `d_model = 64`. * **Reproducibility:** The core code is proprietary. Pre-compiled binaries and CUDA/ROCm Docker environments are provided to verify the VRAM metrics. The project repository, runners, and raw evaluation logs are available on GitHub: ega4l/VBS-NN: VertexByteStream Neural Network (VBS-NN)",
"title": "VBS-NN: 512k Context on a 12GB GPU (Linear O(N) Byte-Stream)"
}