Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidyuu7fptg3sr3oteeobqzwuqdsfaxjrkkkbf3np6sybb4yf7qx24",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkubvrhmngj2"
  },
  "path": "/t/take-a-look-at-the-neural-network-framework-i-wrote-which-is-implemented-in-rust-cuda/175709#post_1",
  "publishedAt": "2026-05-02T08:44:12.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "github.com",
    "GitHub - chen-197/Lumen: Lumen 是一个轻量级、高性能的深度学习训练与推理框架，使用 Rust + CUDA编写。"
  ],
  "textContent": "github.com\n\n### GitHub - chen-197/Lumen: Lumen 是一个轻量级、高性能的深度学习训练与推理框架，使用 Rust + CUDA编写。\n\nLumen 是一个轻量级、高性能的深度学习训练与推理框架，使用 Rust + CUDA编写。\n\n## Highlights\n\n  * **Rust-first, not Rust-only** implementation\n    * Rust owns the framework structure and most high-level logic.\n    * CUDA C++ is used for optional GPU acceleration.\n    * CPU-only builds remain available without the `cuda` feature.\n  * **Dynamic autograd** built around tensor graph construction\n  * **Module-style abstraction** for model components\n  * **Separated layers / ops / models** for easier experimentation\n  * **Flexible precision system**\n    * parameter dtype\n    * runtime dtype\n    * activation dtype\n    * KV-cache dtype\n  * **Quantization-aware loading**\n    * load float weights normally\n    * quantize on load to `i8`\n    * generate offline quantized safetensors\n  * **CPU and CUDA execution paths** with explicit kernel/backend work\n  * **Hugging Face`tokenizers`** integration\n  * **Safetensors** support with memory-mapped and streamed loading modes\n  * Release profile tuned with `lto`, `panic = \"abort\"`, and `strip`\n\n",
  "title": "Take a look at the neural network framework I wrote, which is implemented in Rust + CUDA"
}