{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreicyi5z5c66drjyanbnaqi7y4txrxvruf4v7lz3sdmiptp6llzphla",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mejuvloefb62"
},
"path": "/t/high-performance-zero-copy-tensor-serialization-for-inference/173304#post_1",
"publishedAt": "2026-02-10T20:52:18.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"github.com",
"GitHub - Khushiyant/tenso: High-performance zero-copy tensor serialization..."
],
"textContent": "We’re too comfortable with serialization that treats high-end silicon like a text parser. Tenso eliminates the invisible tax where formats like SafeTensors and Pickle burn 40% of your CPU just to move data.\n\nThe update introduces a **Direct Pinned Memory Reader**. It allocates page-locked memory to trigger async DMA transfers directly to VRAM for PyTorch and JAX, bypassing the copy overhead and keeping CPU usage at a minimal 0.8%.\n\nI’ve also hardened the protocol with strict validation guards and optional XXH3 checksums. Bluntly, enabling checksums kills the zero-copy speed, but safety is now a configurable trade-off. With native async support for FastAPI and gRPC, Tenso is finally a transport layer that respects the hardware.\n\ngithub.com\n\n### GitHub - Khushiyant/tenso: High-performance zero-copy tensor serialization...\n\nHigh-performance zero-copy tensor serialization for Fastest Transmission",
"title": "High-performance zero-copy tensor serialization for Inference"
}