Upload speeds extremely slow / stalling since April 1st
I just hit identical stalling on a 7.7 GB safetensors upload from a Linux/aarch64 box (huggingface_hub 1.14.0 + hf-xet 1.5.0). After a couple of failed attempts I dug into the source and found something I hadn’t seen called out in any tutorial:
HF_HUB_ENABLE_HF_TRANSFER=1 is a silent no-op for uploads in v1.x. The hf_transfer Rust library has been removed from the upload path entirely and replaced by hf-xet. The new “high performance” flag is HF_XET_HIGH_PERFORMANCE=1 (source: huggingface_hub/constants.py).
I tested all the workarounds suggested earlier in this thread, plus the documented HP flag. Throughput I observed for the same 7.7 GB file:
- No flags (default Xet, standard perf): sustained ~6 Mbps, eventually stalls in CLOSE-WAIT
- HF_HUB_ENABLE_HF_TRANSFER=1 (the old advice): sustained ~6 Mbps — identical to no flags, because the flag is a no-op
- HF_XET_HIGH_PERFORMANCE=1: ~21–42 Mbps initial burst, then stalls at ~1.3 GB cumulative (matches xet-core #800 / huggingface_hub #3726)
- HF_HUB_DISABLE_XET=1 (the HTTP-fallback workaround): ~3 Mbps single-stream LFS — slow but stable
- HF_XET_FIXED_UPLOAD_CONCURRENCY=1 (the bypass-adaptive-controller workaround): ~2.7 Mbps sustained, stable, no CLOSE-WAIT — but functionally equivalent to HF_HUB_DISABLE_XET=1 in throughput
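For anyone who wants to repeat the comparison, here is a minimal sketch of how the five configurations above could be run one at a time with a clean environment. The flag names are the ones reported in this thread; build_env and run_upload are hypothetical helper names, and the hf upload <repo_id> <local_path> invocation is an assumption about your CLI setup, so adjust it to match your install.

```python
import os
import subprocess

# Flag sets from the list above (names as reported in this thread).
CONFIGS = {
    "default": {},
    "hf_transfer": {"HF_HUB_ENABLE_HF_TRANSFER": "1"},
    "xet_high_performance": {"HF_XET_HIGH_PERFORMANCE": "1"},
    "disable_xet": {"HF_HUB_DISABLE_XET": "1"},
    "fixed_concurrency_1": {"HF_XET_FIXED_UPLOAD_CONCURRENCY": "1"},
}
ALL_FLAGS = {name for flags in CONFIGS.values() for name in flags}


def build_env(config, base_env=None):
    """Return a copy of the environment with exactly one flag set active."""
    env = dict(os.environ if base_env is None else base_env)
    for name in ALL_FLAGS:
        env.pop(name, None)  # drop leftovers from earlier experiments
    env.update(CONFIGS[config])
    return env


def run_upload(config, repo_id, local_path):
    """Hypothetical helper: run the CLI upload in a subprocess with that env."""
    cmd = ["hf", "upload", repo_id, local_path]  # assumed CLI invocation
    return subprocess.run(cmd, env=build_env(config)).returncode
```

Running each configuration in its own subprocess matters because the flags are environment variables, so changing them inside an already-running Python session may not take effect.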
So both workarounds in this thread effectively trade “fast but stalls” for “stable but slow” — which still leaves multi-GB uploads at multi-hour ETAs on residential connections. There’s currently no fast + stable upload path for first-time large content uploads where Xet has nothing to dedup against.
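To put numbers on "multi-hour": a quick back-of-the-envelope calculation, assuming decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits/s) and that the observed rate holds for the entire file. eta_hours is just an illustrative helper name.

```python
def eta_hours(size_gb, rate_mbps):
    """Hours to move size_gb gigabytes at a steady rate_mbps megabits/s."""
    bits = size_gb * 1e9 * 8
    return bits / (rate_mbps * 1e6) / 3600


# Rates observed above for the 7.7 GB file.
for label, rate in [("default Xet", 6.0), ("LFS fallback", 3.0), ("fixed concurrency", 2.7)]:
    print(f"{label}: {eta_hours(7.7, rate):.1f} h")
# default Xet: 2.9 h
# LFS fallback: 5.7 h
# fixed concurrency: 6.3 h
```

The default Xet figure is only hypothetical, since that configuration stalled before finishing.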
How to confirm which path is active during your upload — run ss -tnp and look at the destination IPs:
- 34.107.x.x / 34.149.x.x / 160.79.x.x → Xet
- IPs that reverse-resolve to *.cloudfront.net → vanilla LFS (ss -n prints numeric addresses only)
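The check can be scripted. This is a rough sketch, assuming the usual ss -tn column layout (peer address in the fifth column) and using only the prefixes I observed above; they are not an official list and may change. peer_ips and classify are hypothetical helper names, and the CloudFront check relies on a reverse DNS lookup, which can fail or be slow.

```python
import socket
import subprocess

# Prefixes observed in this thread for Xet endpoints (not an official list).
XET_PREFIXES = ("34.107.", "34.149.", "160.79.")


def peer_ips(ss_output):
    """Extract peer IPs from `ss -tn` style output."""
    ips = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "State":
            continue  # skip the header and blank lines
        host, _, _port = fields[4].rpartition(":")
        ips.append(host.strip("[]"))  # strip brackets around IPv6 addresses
    return ips


def classify(ip, resolve=socket.gethostbyaddr):
    """Return 'xet', 'lfs', or 'unknown' for one peer IP."""
    if ip.startswith(XET_PREFIXES):
        return "xet"
    try:
        hostname = resolve(ip)[0]
    except OSError:  # reverse lookup failed
        return "unknown"
    return "lfs" if hostname.endswith(".cloudfront.net") else "unknown"


def active_paths():
    """Classify every TCP peer currently reported by ss."""
    out = subprocess.run(["ss", "-tn"], capture_output=True, text=True).stdout
    return {ip: classify(ip) for ip in peer_ips(out)}
```

Call active_paths() while the upload is running. Unrelated connections on the same machine will show up as "unknown", so look only at the ones owned by your upload process.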
Two doc/UX changes that would have saved me hours:
- Mention HF_XET_HIGH_PERFORMANCE=1 in the basic hf upload quickstart, not just buried in the “Tips and tricks for large uploads” section.
- Make the deprecation warning fire whenever HF_HUB_ENABLE_HF_TRANSFER=1 is set, not only when both flags are set together.
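Until something like the second change lands, a user-side preflight check can catch the stale flag. The sketch below is not huggingface_hub's actual warning logic; check_upload_flags is a hypothetical helper, and the set of values treated as "enabled" is my assumption.

```python
import os

# Assumption: values treated as "enabled"; the library may parse these differently.
TRUE_VALUES = {"1", "true", "yes", "on"}


def check_upload_flags(env=None):
    """Return human-readable warnings about upload-related env flags."""
    env = os.environ if env is None else env
    messages = []
    if env.get("HF_HUB_ENABLE_HF_TRANSFER", "").lower() in TRUE_VALUES:
        messages.append(
            "HF_HUB_ENABLE_HF_TRANSFER is set but has no effect on uploads "
            "in v1.x (per this thread); consider HF_XET_HIGH_PERFORMANCE=1."
        )
    return messages


for message in check_upload_flags():
    print("warning:", message)
```

The difference from the current behavior described above is that this fires even when HF_XET_HIGH_PERFORMANCE is not set.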
I’ll open a separate issue on huggingface_hub for the warning + docs visibility. The actual stalling bug for HP mode looks like it’s already tracked under xet-core #800.