{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicln7pt6nxlk6gunjbnddno5umkocjl3ec7yrbxqlffpxiq3n7kv4",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mm4abd6jnbs2"
  },
  "path": "/t/vbs-nn-512k-context-on-a-12gb-gpu-linear-o-n-byte-stream/176081#post_1",
  "publishedAt": "2026-05-18T00:22:06.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "GitHub - ega4l/VBS-NN: VertexByteStream Neural Network (VBS-NN) · GitHub"
  ],
  "textContent": "Hello. I developed **VertexByteStream (VBS-NN)**, a tokenizer-free hierarchical architecture (a tokenizer can optionally be used) that scales linearly, O(N). It passed needle-in-a-haystack stress tests at up to **512k context length** on consumer hardware. * **Metrics:** VRAM stays under 11.7 GB at 512k context (tested on RTX 3060 12GB / RX 6700XT). The current prototype uses `d_model = 64`. * **Reproducibility:** The core code is proprietary. Pre-compiled binaries and CUDA/ROCm Docker environments are provided to verify the VRAM metrics. The project repository, runners, and raw evaluation logs are available here: GitHub: ega4l/VBS-NN: VertexByteStream Neural Network (VBS-NN)",
  "title": "VBS-NN: 512k Context on a 12GB GPU (Linear O(N) Byte-Stream)"
}