{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiaecnf5l5ppm5xvolu2fggw2zci6l4m5edagbf4cedofzixokyzhe",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mnxpunlfkmd2"
},
"path": "/t/how-to-ensure-safe-usage/176678#post_5",
"publishedAt": "2026-06-10T21:07:51.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "Hi, not an expert but would like to share my suggestion for reference.\n\nSince your goal is to prevent accidental execution rather than stopping determined malicious users, you can handle this locally without needing complex sandboxes.\n\n 1. **Network-level filter via`HF_ENDPOINT`**: The `huggingface_hub` library respects the `HF_ENDPOINT` environment variable. You can set this globally for all SSH users (e.g., in `/etc/profile.d/hf.sh`) to point to a lightweight proxy (like a simple Nginx config) running locally on the server. Have the proxy forward all requests to `https://huggingface.co`, but return a 403 for file extensions associated with pickled code (`.bin`, `.pt`, `.pth`, `.pkl`). This automatically restricts users to downloading `.safetensors` and config files.\n\n 2. **Python-level guardrail via`sitecustomize.py`**: If you want to catch the execution itself, you can add a `sitecustomize.py` file to your server’s global Python environment. Because this script executes automatically on Python startup, you can use it to monkey-patch `torch.load` to raise a custom warning or error. This effectively acts as a tripwire, reminding users to pass `use_safetensors=True` when loading models via `transformers`.\n\n\n\n\nRelying on `safetensors` is exactly the right instinct, you just need to enforce it at the proxy or interpreter level.\n\nVERIFY BEFORE POSTING:\n\n * Verify that all download traffic from `huggingface_hub` strictly routes through `HF_ENDPOINT` in the version your users are running (it generally does, but it is worth testing in your specific environment).\n\n * If you go the `sitecustomize.py` route, ensure monkey-patching `torch.load` won’t break users’ legitimate local training workflows (e.g., saving and loading their own optimizer states or mid-training checkpoints, which often default to pickle).\n\n\n",
"title": "How to ensure safe usage?"
}