{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidfj7rblh2fzt6bqptrw2tn52gvszlot54ijikhwlseeax37ouvqu",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mks6ritiqef2"
  },
  "path": "/t/lumen-is-a-small-but-complete-rust-first-ml-stack/175696#post_1",
  "publishedAt": "2026-05-01T11:26:19.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "github.com",
    "GitHub - chen-197/Lumen: Lumen 是一个轻量级、高性能的深度学习训练与推理框架,使用 Rust + CUDA编写。"
  ],
  "textContent": "github.com\n\n### GitHub - chen-197/Lumen: Lumen 是一个轻量级、高性能的深度学习训练与推理框架,使用 Rust + CUDA编写。\n\nLumen 是一个轻量级、高性能的深度学习训练与推理框架,使用 Rust + CUDA编写。\n\n## Highlights\n\n  * **Rust-first, not Rust-only** implementation\n    * Rust owns the framework structure and most high-level logic.\n    * CUDA C++ is used for optional GPU acceleration.\n    * CPU-only builds remain available without the `cuda` feature.\n  * **Dynamic autograd** built around tensor graph construction\n  * **Module-style abstraction** for model components\n  * **Separated layers / ops / models** for easier experimentation\n  * **Flexible precision system**\n    * parameter dtype\n    * runtime dtype\n    * activation dtype\n    * KV-cache dtype\n  * **Quantization-aware loading**\n    * load float weights normally\n    * quantize on load to `i8`\n    * generate offline quantized safetensors\n  * **CPU and CUDA execution paths** with explicit kernel/backend work\n  * **Hugging Face`tokenizers`** integration\n  * **Safetensors** support with memory-mapped and streamed loading modes\n  * Release profile tuned with `lto`, `panic = \"abort\"`, and `strip`\n\n",
  "title": "Lumen is a small but complete Rust-first ML stack"
}