{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicyi5z5c66drjyanbnaqi7y4txrxvruf4v7lz3sdmiptp6llzphla",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mejuvloefb62"
  },
  "path": "/t/high-performance-zero-copy-tensor-serialization-for-inference/173304#post_1",
  "publishedAt": "2026-02-10T20:52:18.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "github.com",
    "GitHub - Khushiyant/tenso: High-performance zero-copy tensor serialization..."
  ],
  "textContent": "We’re too comfortable with serialization that treats high-end silicon like a text parser. Tenso eliminates the hidden tax where formats like SafeTensors and Pickle burn 40% of your CPU just to move data.\n\nThe update introduces a **Direct Pinned Memory Reader**. It allocates page-locked memory so that tensors reach VRAM through async DMA transfers for PyTorch and JAX, bypassing the copy overhead and keeping CPU usage at just 0.8%.\n\nI’ve also hardened the protocol with strict validation guards and optional XXH3 checksums. Bluntly, enabling checksums costs you the zero-copy speed, because every byte has to be read to compute the hash, but safety is now a configurable trade-off. With native async support for FastAPI and gRPC, Tenso is finally a transport layer that respects the hardware.\n\nGitHub - Khushiyant/tenso: High-performance zero-copy tensor serialization...",
  "title": "High-performance zero-copy tensor serialization for Inference"
}