{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreidfj7rblh2fzt6bqptrw2tn52gvszlot54ijikhwlseeax37ouvqu",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mks6ritiqef2"
},
"path": "/t/lumen-is-a-small-but-complete-rust-first-ml-stack/175696#post_1",
"publishedAt": "2026-05-01T11:26:19.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"github.com",
"GitHub - chen-197/Lumen: Lumen is a lightweight, high-performance deep learning training and inference framework written in Rust + CUDA."
],
"textContent": "github.com\n\n### GitHub - chen-197/Lumen: Lumen is a lightweight, high-performance deep learning training and inference framework written in Rust + CUDA.\n\n## Highlights\n\n * **Rust-first, not Rust-only** implementation\n   * Rust owns the framework structure and most high-level logic.\n   * CUDA C++ is used for optional GPU acceleration.\n   * CPU-only builds remain available without the `cuda` feature.\n * **Dynamic autograd** built around tensor graph construction\n * **Module-style abstraction** for model components\n * **Separated layers / ops / models** for easier experimentation\n * **Flexible precision system** with independently configurable dtypes:\n   * parameter dtype\n   * runtime dtype\n   * activation dtype\n   * KV-cache dtype\n * **Quantization-aware loading**\n   * load float weights normally\n   * quantize on load to `i8`\n   * generate offline quantized safetensors\n * **CPU and CUDA execution paths** with explicit kernel/backend work\n * **Hugging Face `tokenizers`** integration\n * **Safetensors** support with memory-mapped and streamed loading modes\n * Release profile tuned with `lto`, `panic = \"abort\"`, and `strip`\n\n",
"title": "Lumen is a small but complete Rust-first ML stack"
}