What are all the files that are being downloaded?
A common reason for extra storage consumption is caching.
However, major model repositories such as openai/gpt-oss-20b often store separate sets of files for multiple platforms in the same repository, and downloading the entire repository fetches all of them.
hf download openai/gpt-oss-20b downloads a full snapshot of the repo (i.e., every file in the model repo), not “just one set of weights”. For openai/gpt-oss-20b, the repo contains multiple full-weight artifacts (hence ~41.3GB total).
Files in openai/gpt-oss-20b
Repo root
- .gitattributes
- LICENSE
- README.md
- USAGE_POLICY
- chat_template.jinja
- config.json
- generation_config.json
- model-00000-of-00002.safetensors
- model-00001-of-00002.safetensors
- model-00002-of-00002.safetensors
- model.safetensors.index.json
- special_tokens_map.json
- tokenizer.json
- tokenizer_config.json
(Hugging Face)
metal/
- metal/model.bin (13.8GB) (Hugging Face)
original/
- original/config.json
- original/dtypes.json
- original/model.safetensors (13.8GB) (Hugging Face)
Why this becomes ~40–50GB
This repo includes three large “model-weight” payloads:
- Sharded safetensors in the root (model-0000*-of-*.safetensors) totaling ~13.8GB (Hugging Face)
- A single-file safetensors copy under original/model.safetensors (~13.8GB) (Hugging Face)
- A precompiled Metal binary under metal/model.bin (~13.8GB) intended for Apple Metal runtimes (Hugging Face)
Together that is about 3 × 13.8GB = 41.4GB before the small metadata/tokenizer files, which agrees, up to rounding of the individual sizes, with the repo size shown on the “Files” tab (41.3GB). (Hugging Face)
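To make the arithmetic explicit, here is a small sketch that adds up the three payloads. The sizes are the rounded figures quoted above, not values fetched from the Hub, and the dictionary keys are just descriptive labels.

```python
# Rounded sizes in GB, copied from the file listing above (not queried live).
payloads_gb = {
    "root sharded safetensors": 13.8,
    "original/model.safetensors": 13.8,
    "metal/model.bin": 13.8,
}

# Round to one decimal place to hide floating-point noise in the sum.
total_gb = round(sum(payloads_gb.values()), 1)
print(f"{len(payloads_gb)} weight payloads, about {total_gb} GB in total")
```

Because each figure is itself rounded, the sum can differ by about 0.1GB from the total the site reports.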
If you want to avoid downloading everything
Use --include/--exclude patterns. (Hugging Face)
Examples:
Download only the “original” weights (minimal set recommended in OpenAI’s gpt-oss repo docs):
hf download openai/gpt-oss-20b --include "original/*" --local-dir gpt-oss-20b/
(GitHub)
Download everything except the Metal and original copies (keep only the root sharded safetensors + configs/tokenizer):
hf download openai/gpt-oss-20b --exclude "metal/*" --exclude "original/*"
(Hugging Face)
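To preview what each pattern set would select before downloading, the filtering can be approximated offline. This is only a sketch: it uses Python's standard fnmatch module on the file names listed earlier, and the helper name select is hypothetical. The real CLI's matching may differ in edge cases, so treat the result as an estimate.

```python
from fnmatch import fnmatch

# File names copied from the repo listing above (not fetched from the Hub).
REPO_FILES = [
    ".gitattributes", "LICENSE", "README.md", "USAGE_POLICY",
    "chat_template.jinja", "config.json", "generation_config.json",
    "model-00000-of-00002.safetensors", "model-00001-of-00002.safetensors",
    "model-00002-of-00002.safetensors", "model.safetensors.index.json",
    "special_tokens_map.json", "tokenizer.json", "tokenizer_config.json",
    "metal/model.bin",
    "original/config.json", "original/dtypes.json", "original/model.safetensors",
]


def select(files, include=None, exclude=None):
    """Hypothetical helper: keep files matching any include pattern
    (if given) and matching no exclude pattern."""
    kept = []
    for name in files:
        if include and not any(fnmatch(name, p) for p in include):
            continue
        if exclude and any(fnmatch(name, p) for p in exclude):
            continue
        kept.append(name)
    return kept


# Mirrors: --include "original/*"
only_original = select(REPO_FILES, include=["original/*"])

# Mirrors: --exclude "metal/*" --exclude "original/*"
root_only = select(REPO_FILES, exclude=["metal/*", "original/*"])

print(only_original)
print(len(root_only), "files kept from the repo root")
```

Under these assumptions the first pattern keeps only the three files under original/, and the second keeps the root-level files while skipping both extra weight copies.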