Dedicated CPU Inference Endpoint returns empty HTTP 500 after ~80s: is there a configurable request timeout?
Hugging Face Forums [Unofficial]
April 16, 2026
Thanks for the detailed investigation! Based on your findings, here are a few things to check that might resolve the 80s timeout:
Adjust Request Timeout Settings:
- In the huggingface_hub library, ensure the timeout parameter is set higher than 80s when initializing the InferenceClient (the older InferenceApi class does not take a timeout argument).
- Example: InferenceClient(model=..., timeout=120)
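To tell a client-side timeout apart from a cutoff somewhere on the server side, it can help to time a raw request with a client timeout well above 80s. Here is a minimal sketch using only the Python standard library. The helper name timed_post, the URL, and the token are placeholders of mine, not part of any Hugging Face API.

```python
import json
import time
import urllib.error
import urllib.request


def timed_post(url, payload, token=None, timeout_s=120.0):
    """POST JSON to `url` and return (status, body, elapsed_seconds).

    status is None when no HTTP response arrived at all
    (client-side timeout or connection failure).
    """
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    start = time.monotonic()
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            status, body = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # The server (or a proxy in front of it) answered with an error
        # status, e.g. the empty 500 described in this thread.
        status, body = err.code, err.read()
    except OSError as err:
        # Covers URLError, socket timeouts and dropped connections.
        status, body = None, str(err).encode("utf-8")
    return status, body, time.monotonic() - start


# Hypothetical usage; replace the placeholders with your own values:
# status, body, elapsed = timed_post(
#     "https://<your-endpoint-url>", {"inputs": "..."}, token="hf_...", timeout_s=120
# )
# print(status, len(body), round(elapsed, 1))
```

If this still returns status 500 at roughly 80s with timeout_s=120, the limit is not the client's, and raising the client timeout alone will not help.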
Verify Container Resource Limits:
- Although memory usage is low, confirm that the CPU cores allocated are sufficient for your workload. CPU throttling can stall a request long enough to hit a timeout without leaving any error in the logs.
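One way to check for throttling, assuming you can run code inside the container (for example from a custom handler) and that it is a Linux container exposing cgroup CPU statistics, is to read cpu.stat and compare nr_throttled with nr_periods. This is a rough sketch; the function names are mine, and the file may be missing or located elsewhere in your environment.

```python
from pathlib import Path

# cgroup v2 path first, then the common cgroup v1 path.
CPU_STAT_PATHS = ("/sys/fs/cgroup/cpu.stat", "/sys/fs/cgroup/cpu/cpu.stat")


def parse_cpu_stat(text):
    """Parse 'key value' lines from a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Fraction of scheduler periods in which the container was throttled.

    Returns None when no CPU quota periods were recorded
    (no limit set, or the counters are unavailable).
    """
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return None
    return stats.get("nr_throttled", 0) / periods


def read_cpu_stat():
    """Return parsed cpu.stat from the first path that exists, else None."""
    for path in CPU_STAT_PATHS:
        candidate = Path(path)
        if candidate.is_file():
            return parse_cpu_stat(candidate.read_text())
    return None
```

Sampling read_cpu_stat() before and after a slow request and comparing the two readings shows whether throttling increased during that request. A ratio that climbs noticeably suggests the CPU allocation is the bottleneck.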
Check Server-side Logs:
- If possible, enable DEBUG-level logs on the endpoint side to see whether a silent exception is being caught that isn’t visible in the standard 500 error message.
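If you control the inference code (for instance through a custom handler), a small logging wrapper can surface exceptions and timings that never reach the HTTP response. The sketch below uses only the standard logging module. The logger name and the log_calls decorator are illustrative, not something the endpoint runtime provides.

```python
import functools
import logging
import time

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("endpoint.debug")  # hypothetical logger name
logger.setLevel(logging.DEBUG)


def log_calls(func):
    """Log start, duration, and any exception (with traceback) of `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        logger.debug("%s started", func.__name__)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the full traceback at ERROR level,
            # then the error is re-raised so behaviour is unchanged.
            logger.exception(
                "%s failed after %.1fs", func.__name__, time.monotonic() - start
            )
            raise
        logger.debug(
            "%s finished in %.1fs", func.__name__, time.monotonic() - start
        )
        return result

    return wrapper
```

Decorating the function that runs the model with @log_calls shows whether it raises, or is still running, when the 500 is returned. If the log shows a start line but neither a finish nor a failure, the request was most likely cut off from outside the process.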
Hope this helps fix the empty 500 response issue!