{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreidyvg6sncstjxlaq7r75qtdis25aavarj2t4setkasdmgtwz4cwoq",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mjbsu5tlhwj2"
},
"path": "/t/huggingface-dataset-download-stuck-in-kaggle/175183#post_2",
"publishedAt": "2026-04-12T06:50:29.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"Hugging Face",
"GitHub",
"Kaggle"
],
"textContent": "> huggingface_hub up-to-date\n\nAccording to the documentation, installing a reasonably recent `huggingface_hub` is supposed to install `hf_xet` at the same time, and this usually works fine. However, Xet-related drift still occurs occasionally. Judging from the screenshot, this appears to be a symptom of such an issue.\n\nSetting the environment variable `HF_HUB_DISABLE_XET=1` is the simplest way to work around or isolate the problem.\nFor download speed, the ideal solution would be to resolve the issue with `pip install -U hf_xet`.\n\nIf setting `HF_HUB_DISABLE_XET=1` does not bypass the issue, there may be an unknown bug.\n\n* * *\n\nThe most likely cause is **not your token** and **not your `hf_hub_download(...)` call itself**. It is the **download path underneath it**. Hugging Face now uses **Xet** for Hub file transfers by default, and current docs say all Hub repositories are Xet-enabled, `hf_xet` is the default transfer path, and as of `huggingface_hub` 0.32.0, installing the latest `huggingface_hub` also installs `hf_xet`. That means “it worked a few days ago, then started hanging without code changes” is plausible because the backend path may have changed even though your Python call did not. (Hugging Face)\n\n## What is probably happening\n\n`hf_hub_download()` does not simply save a file directly into the folder you are watching. Hugging Face documents that it **downloads into the HF cache** and returns a path that points into that cache. It also documents a separate **Xet cache** under `HF_XET_CACHE`. So if you are only watching one Kaggle disk indicator or one folder, you may be missing where activity is actually happening. At the same time, there are current reports of **real Xet download stalls**, so this is not just a visualization problem either. 
(Hugging Face)\n\nThe closest match to your environment is a current GitHub issue in `huggingface/xet-core` where **downloads on Kaggle** get stuck and the reporter explicitly suspects Xet rather than the dataset code. There are also other Xet issues where **large files stick at 0%** or **near 99%**. That makes your case look like a real backend or environment interaction problem, not a user error. (GitHub)\n\n## Why the progress bar can look frozen\n\nThere is a very recent `huggingface_hub` issue showing that **Xet downloads barely report progress**, so a large transfer can look dead for long stretches even if bytes are moving. The report says the bar may only jump a few times on a multi-GB file and points to fixes in both `xet-core` and `huggingface_hub`. So a “stuck” bar is not always proof of a stalled transfer. (GitHub)\n\nBut your extra detail matters: you said Kaggle storage did not increase. That makes a **pure progress-bar-only explanation weaker**. In your case, the most likely reading is: either the transfer is really hanging, or the writes are happening in a cache location different from the one you are watching. (Hugging Face)\n\n## Why Kaggle is a good suspect\n\nThere is a Kaggle-side product feedback report about **Hugging Face downloads failing because of Kaggle proxy URL rewriting**. The search result snippet also says that if it persists, it is likely a Kaggle-side problem. That is not definitive proof for your exact failure, but it is strong context that Kaggle networking or proxying has caused Hugging Face download breakage before. (Kaggle)\n\nThere is also a fresh `huggingface_hub` issue from **Colab** where large HF downloads hang while plain `wget` downloads work, which is useful context because it suggests this class of bug can appear in **managed notebook environments** specifically, not just on your machine. (GitHub)\n\n## My ranked diagnosis for your case\n\n### 1. 
Most likely: Xet-backed transfer hanging in Kaggle\n\nThis best fits the timing, the environment, and the public issue reports. Hugging Face’s newer transfer path uses Xet by default, and there is already a Kaggle-specific Xet issue with a “gets stuck” symptom. (Hugging Face)\n\n### 2. Also likely: a broader Xet large-file stall, exposed more easily on Kaggle\n\nThere are public reports of Xet downloads sticking at 0% and 99% on larger files. That matches your symptom even if Kaggle is only part of the trigger. (GitHub)\n\n### 3. Possible contributor: poor progress reporting\n\nThis can make the stall look worse than it is, but by itself it does not fully explain “no visible disk movement.” (GitHub)\n\n### 4. Possible contributor: Kaggle proxy or networking layer\n\nThere is direct evidence that Kaggle proxy behavior has interfered with HF downloads before. (Kaggle)\n\n### 5. Less likely: your authentication or function arguments\n\nIf auth were the main issue, the failure would usually be a clearer 401, 403, missing file, or repository error rather than an indefinite hang. The function itself is the documented standard way to download a single file. (Hugging Face)\n\n## Best fixes to try, in order\n\n### 1. Disable Xet first\n\nThis is the highest-value test.\n\nSet the following environment variables **before importing `huggingface_hub`**. Hugging Face explicitly says environment variables are read **at import time**, not afterward. It also documents `HF_HUB_DISABLE_XET`, `HF_HUB_DOWNLOAD_TIMEOUT`, and `HF_HUB_ETAG_TIMEOUT`, with both timeouts defaulting to **10 seconds**. 
(Hugging Face)\n\n\n import os\n\n # Must be set BEFORE importing huggingface_hub\n os.environ[\"HF_HUB_DISABLE_XET\"] = \"1\"\n os.environ[\"HF_HUB_DOWNLOAD_TIMEOUT\"] = \"120\"\n os.environ[\"HF_HUB_ETAG_TIMEOUT\"] = \"30\"\n\n # Put cache somewhere explicit on Kaggle\n os.environ[\"HF_HOME\"] = \"/kaggle/working/hf_home\"\n\n # Useful for debugging\n os.environ[\"HF_DEBUG\"] = \"1\"\n os.environ[\"HF_HUB_VERBOSITY\"] = \"debug\"\n\n\nThen:\n\n\n from huggingface_hub import hf_hub_download\n\n path = hf_hub_download(\n     repo_id=self.repo_id,\n     filename=file_path,\n     repo_type=\"dataset\",\n     cache_dir=\"/kaggle/working/hf_cache\",\n     force_download=True,\n )\n\n print(path)\n\n\n**Why this is the best first test:** if the problem disappears with `HF_HUB_DISABLE_XET=1`, your root cause is very likely the Xet path, not the repo, not the token, and not your Kaggle notebook code. That diagnosis is grounded in the Xet-related stuck-download issues and the fact that HF now defaults to Xet transfers. (Hugging Face)\n\n### 2. Make the download visible in a real folder, not just the cache\n\nHugging Face docs say `hf_hub_download()` normally returns a pointer into the cache, and they also document a `local_dir` mode for downloading to a specific folder while maintaining metadata under `.cache/huggingface`. On Kaggle, the documented persisted output area is `/kaggle/working`. (Hugging Face)\n\nSo for debugging, try writing somewhere explicit:\n\n\n from huggingface_hub import hf_hub_download\n\n path = hf_hub_download(\n     repo_id=self.repo_id,\n     filename=file_path,\n     repo_type=\"dataset\",\n     local_dir=\"/kaggle/working/hf_files\",\n     cache_dir=\"/kaggle/working/hf_cache\",\n     force_download=True,\n )\n\n print(path)\n\n\nThis does two things:\n\n * it makes actual file writes easier to observe\n * it removes confusion around where the cache lives (Hugging Face)\n\n\n\n### 3. 
Use a dry run to separate “metadata works” from “payload transfer hangs”\n\nHugging Face documents `dry_run=True` for `hf_hub_download()` and `snapshot_download()`. It returns file info without performing the full transfer. (Hugging Face)\n\n\n from huggingface_hub import hf_hub_download\n\n info = hf_hub_download(\n     repo_id=self.repo_id,\n     filename=file_path,\n     repo_type=\"dataset\",\n     cache_dir=\"/kaggle/working/hf_cache\",\n     dry_run=True,\n )\n\n print(info)\n\n\nInterpret it like this:\n\n * **If dry run works but the real download hangs**, auth and repo resolution are probably fine, and the problem is in the actual transfer path.\n * **If dry run fails too**, the problem may be earlier in the flow. (Hugging Face)\n\n\n\n### 4. If you must keep Xet enabled, reduce its aggressiveness\n\nHugging Face documents `HF_XET_NUM_CONCURRENT_RANGE_GETS`, which controls how many byte ranges per file are fetched concurrently, with a default of 16. On a managed notebook or proxy-heavy environment, reducing concurrency can help. (Hugging Face)\n\n\n import os\n os.environ[\"HF_XET_NUM_CONCURRENT_RANGE_GETS\"] = \"4\"\n os.environ[\"HF_HUB_DOWNLOAD_TIMEOUT\"] = \"120\"\n os.environ[\"HF_HUB_ETAG_TIMEOUT\"] = \"30\"\n os.environ[\"HF_HOME\"] = \"/kaggle/working/hf_home\"\n\n\nThen retry in a fresh session. I would still test **disable Xet first**, because it is the cleaner diagnostic step. (Hugging Face)\n\n### 5. Do not rely on old `hf_transfer` advice\n\nCurrent Hugging Face docs say `HF_HUB_ENABLE_HF_TRANSFER` is deprecated because the Hub is now fully powered by Xet. So older advice built around `hf_transfer` is no longer the main path. (Hugging Face)\n\n## One important caveat\n\nIf the file you are downloading is **larger than about 50 GB**, disabling Xet may not be a viable fallback right now. 
There is an open `huggingface_hub` issue showing that with `HF_HUB_DISABLE_XET=1`, HTTP-based download of a ~59 GB file fails before starting and reports that the file is too large for the regular method. (GitHub)\n\nSo:\n\n * for files **under** that threshold, disabling Xet is the best first test\n * for files **over** that threshold, you may need Xet to work, which makes Kaggle/Xet compatibility more important (GitHub)\n\n\n\n## What I would do in your exact situation\n\nI would do this in a **fresh Kaggle session**:\n\n 1. Set `HF_HUB_DISABLE_XET=1` before importing anything from `huggingface_hub`. (Hugging Face)\n 2. Set `HF_HUB_DOWNLOAD_TIMEOUT=120` and `HF_HUB_ETAG_TIMEOUT=30`. Defaults are 10 seconds, which is not generous for fragile notebook networking. (Hugging Face)\n 3. Set `HF_HOME`, `cache_dir`, and optionally `local_dir` to `/kaggle/working/...` so you can see where bytes are supposed to go. Kaggle documents `/kaggle/working` as the saved output area. (Kaggle)\n 4. Run `dry_run=True`. If that succeeds, your repo, filename, and auth are probably fine. (Hugging Face)\n 5. Retry the real download.\n 6. If it only works with Xet disabled, you have your answer: **Kaggle + Xet path**.\n 7. If it still hangs even with Xet disabled, then the next suspects are Kaggle networking, proxy rewriting, or a broader transient Hugging Face routing issue. (Kaggle)\n\n\n\n## Bottom line\n\nYour case is most consistent with a **recent backend-path regression**, not a mistake in your code. 
The strongest explanation is:\n\n * `huggingface_hub` now routes downloads through **Xet** by default\n * there are current reports of **Kaggle-specific Xet failures**\n * there are separate reports of **0% / 99% stalls** on large Xet downloads\n * there is a real **progress-reporting bug** that can make healthy transfers look frozen\n * Kaggle has also had **proxy-related HF download issues** before (Hugging Face)\n\n\n\nSo the cleanest first move is:\n\n**disable Xet, raise timeouts, and write into `/kaggle/working` in a fresh session.**",
"title": "Huggingface Dataset Download Stuck in Kaggle"
}