External Publication

High-performance zero-copy tensor serialization for Inference

Hugging Face Forums [Unofficial] February 10, 2026

We’re too comfortable with serialization that treats high-end silicon like a text parser. Tenso eliminates the invisible tax where formats like SafeTensors and Pickle burn 40% of your CPU just to move data.

The update introduces a Direct Pinned Memory Reader. It reads into page-locked (pinned) host memory so that async DMA transfers can go straight to VRAM for PyTorch and JAX, skipping the extra staging copy and keeping CPU usage at a minimal 0.8%.
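To make the pinned-memory idea concrete, here is a minimal sketch using plain PyTorch and NumPy. It is not Tenso's reader: `read_tensor` is a hypothetical helper, and the stream is assumed to carry raw tensor bytes in native byte order with no header. It only shows the general pattern of filling a page-locked buffer in place and then issuing a non-blocking copy to the GPU.

```python
import io

import numpy as np
import torch


def read_tensor(stream, shape, dtype=torch.float32):
    """Fill a host buffer straight from `stream`, then move it to the GPU if one exists.

    Hypothetical helper for illustration; assumes raw, native-endian tensor bytes.
    """
    use_cuda = torch.cuda.is_available()
    # Only request page-locked memory when a CUDA device is present;
    # asking for it on a CPU-only machine would fail.
    host = torch.empty(shape, dtype=dtype, pin_memory=use_cuda)

    # View the same memory as a flat byte array so the stream can write
    # into it in place, with no intermediate bytes object.
    raw = host.numpy().reshape(-1).view(np.uint8)
    got = stream.readinto(raw)
    if got != raw.nbytes:
        raise ValueError(f"expected {raw.nbytes} bytes, got {got}")

    if use_cuda:
        # From pinned memory, non_blocking=True lets the host-to-device copy
        # run asynchronously with respect to the calling thread.
        return host.to("cuda", non_blocking=True)
    return host  # CPU fallback: same data, no device transfer


payload = np.arange(6, dtype=np.float32).tobytes()
tensor = read_tensor(io.BytesIO(payload), (2, 3))
print(tensor.shape, tensor.device)
```

The CPU fallback is there so the sketch runs anywhere; the asynchronous path only applies when the source buffer is pinned and a CUDA device is available.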

I’ve also hardened the protocol with strict validation guards and optional XXH3 checksums. Bluntly, enabling checksums kills the zero-copy speed, because every payload byte has to be read to hash it, but safety is now a configurable trade-off. With native async support for FastAPI and gRPC, Tenso is finally a transport layer that respects the hardware.
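Here is a rough sketch of what validation guards plus an opt-in checksum can look like. Everything in it is an assumption made for illustration: the `TNSX` magic, the header layout, the dtype codes, and the `pack`/`unpack` helpers are hypothetical and are not Tenso's wire format. It also swaps in `zlib.crc32` from the standard library as a stand-in for XXH3, which would need a third-party package.

```python
import struct
import zlib

# Hypothetical frame layout, NOT Tenso's format:
# magic(4) version(1) flags(1) dtype(1) ndim(1) | ndim * uint64 dims | payload | [crc32]
MAGIC = b"TNSX"
HEADER = struct.Struct("<4sBBBB")
FLAG_CHECKSUM = 0x01
ITEMSIZE = {0: 4, 1: 8}  # assumed dtype codes: 0 = float32, 1 = float64
MAX_NDIM = 8


def pack(payload, shape, dtype_code=0, checksum=False):
    flags = FLAG_CHECKSUM if checksum else 0
    head = HEADER.pack(MAGIC, 1, flags, dtype_code, len(shape))
    dims = struct.pack(f"<{len(shape)}Q", *shape)
    tail = struct.pack("<I", zlib.crc32(payload)) if checksum else b""
    return head + dims + payload + tail


def unpack(buf):
    view = memoryview(buf)
    if len(view) < HEADER.size:
        raise ValueError("truncated header")
    magic, version, flags, dtype_code, ndim = HEADER.unpack_from(view, 0)
    # Guards: reject anything unexpected before trusting the declared sizes.
    if magic != MAGIC or version != 1:
        raise ValueError("bad magic or version")
    if flags & ~FLAG_CHECKSUM or dtype_code not in ITEMSIZE or ndim > MAX_NDIM:
        raise ValueError("unsupported flags, dtype, or rank")
    dims_end = HEADER.size + 8 * ndim
    if len(view) < dims_end:
        raise ValueError("truncated shape")
    shape = struct.unpack_from(f"<{ndim}Q", view, HEADER.size)
    nbytes = ITEMSIZE[dtype_code]
    for dim in shape:
        nbytes *= dim
    tail = 4 if flags & FLAG_CHECKSUM else 0
    if len(view) != dims_end + nbytes + tail:
        raise ValueError("length does not match declared shape")
    payload = view[dims_end:dims_end + nbytes]  # slice of the input, no copy
    if tail:
        # The costly part: hashing touches every payload byte.
        (expected,) = struct.unpack_from("<I", view, dims_end + nbytes)
        if zlib.crc32(payload) != expected:
            raise ValueError("checksum mismatch")
    return shape, dtype_code, payload


data = struct.pack("<6f", *range(6))
shape, code, payload = unpack(pack(data, (2, 3), checksum=True))
print(shape, code, len(payload))
```

Without the checksum flag, `unpack` only inspects the header and hands back a `memoryview` slice, so the payload bytes are never read. With the flag set, the whole payload is hashed, which is where the trade-off comes from.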

github.com

GitHub - Khushiyant/tenso: High-performance zero-copy tensor serialization...

High-performance zero-copy tensor serialization for Fastest Transmission

