How to ensure safe usage?
Hi, not an expert but would like to share my suggestion for reference.
Since your goal is to prevent accidental execution rather than stopping determined malicious users, you can handle this locally without needing complex sandboxes.
Network-level filter via HF_ENDPOINT: The huggingface_hub library respects the HF_ENDPOINT environment variable. You can set this globally for all SSH users (e.g., in /etc/profile.d/hf.sh) to point to a lightweight proxy (like a simple Nginx config) running locally on the server. Have the proxy forward all requests to https://huggingface.co, but return a 403 for file extensions associated with pickled code (.bin, .pt, .pth, .pkl). This automatically restricts users to downloading .safetensors and config files.
Python-level guardrail via sitecustomize.py: If you want to catch the execution itself, you can add a sitecustomize.py file to your server’s global Python environment. Because this script executes automatically on Python startup, you can use it to monkey-patch torch.load to raise a custom warning or error (torch is not loaded yet at that point, so the script has to import torch itself or hook its import). This effectively acts as a tripwire, reminding users to pass use_safetensors=True when loading models via transformers.
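In case it helps, here is a rough sketch of the sitecustomize.py tripwire, not a tested production setup. The function name install_torch_load_guard and the TORCH_LOAD_GUARD environment variable are made up for illustration. It warns by default and only raises when the variable is set to "error". It also imports torch eagerly, which slows down every Python start on the machine, so treat that part as an assumption you may want to replace.

```python
# sitecustomize.py (sketch): must live on the global interpreter's sys.path.
import functools
import os
import warnings

# Hypothetical switch, not a real torch or HF setting:
# TORCH_LOAD_GUARD=error makes the guard raise instead of warn.
GUARD_MODE_ENV = "TORCH_LOAD_GUARD"

MESSAGE = (
    "torch.load unpickles arbitrary Python objects. Prefer safetensors, "
    "e.g. pass use_safetensors=True when loading models via transformers."
)


def install_torch_load_guard(torch_module):
    """Wrap torch_module.load so each call warns, or raises in 'error' mode."""
    original_load = torch_module.load
    if getattr(original_load, "_guarded", False):
        return  # already patched, do not wrap twice

    @functools.wraps(original_load)
    def guarded_load(*args, **kwargs):
        if os.environ.get(GUARD_MODE_ENV, "warn").lower() == "error":
            raise RuntimeError(MESSAGE)
        warnings.warn(MESSAGE, RuntimeWarning, stacklevel=2)
        return original_load(*args, **kwargs)

    guarded_load._guarded = True
    torch_module.load = guarded_load


try:
    import torch  # eager import: simple, but adds startup cost everywhere
except Exception:
    torch = None  # torch missing or broken: never block interpreter startup

if torch is not None:
    install_torch_load_guard(torch)
```

Warning by default keeps local checkpoint loading working, while the "error" mode gives a hard stop for accounts that should never load pickles.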
Relying on safetensors is exactly the right instinct; you just need to enforce it at the proxy or interpreter level.
VERIFY BEFORE POSTING:
Verify that all download traffic from huggingface_hub strictly routes through HF_ENDPOINT in the version your users are running (it generally does, but it is worth testing in your specific environment).
If you go the sitecustomize.py route, ensure monkey-patching torch.load won’t break users’ legitimate local training workflows (e.g., saving and loading their own optimizer states or mid-training checkpoints, which often default to pickle).
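One possible way to test the HF_ENDPOINT routing is a throwaway local stub that records every request it receives and applies the same extension rule as the proxy. This is a standard-library Python stand-in, not the Nginx config itself, and it does not forward anything to huggingface.co. The names is_blocked, RecordingHandler, start_stub and probe are hypothetical helpers for this sketch.

```python
import http.server
import threading
import urllib.error
import urllib.request
from urllib.parse import urlsplit

# Extensions the post suggests blocking (pickle-based formats).
BLOCKED_SUFFIXES = (".bin", ".pt", ".pth", ".pkl")


def is_blocked(request_path):
    """Return True if the URL path (query string ignored) ends in a blocked extension."""
    return urlsplit(request_path).path.lower().endswith(BLOCKED_SUFFIXES)


class RecordingHandler(http.server.BaseHTTPRequestHandler):
    seen = []  # (method, path, status) tuples, shared by all requests

    def _respond(self):
        # 403 for blocked extensions; 404 otherwise, because the stub forwards nothing.
        status = 403 if is_blocked(self.path) else 404
        RecordingHandler.seen.append((self.command, self.path, status))
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_GET = do_HEAD = _respond

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_stub(port=0):
    """Start the stub on 127.0.0.1 and return (server, endpoint_url)."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), RecordingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}"


def probe(url):
    """Return the HTTP status of a GET, bypassing any system proxy settings."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

To use it, call start_stub(), export HF_ENDPOINT with the returned URL in a fresh shell (the variable needs to be set before huggingface_hub is imported), and run a normal download there. The download will fail, which is expected. If RecordingHandler.seen stays empty, the traffic did not go through HF_ENDPOINT in that setup.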