Dedicated CPU Inference Endpoint returns empty HTTP 500 after ~80s: is there a configurable request timeout?

Hugging Face Forums [Unofficial] April 16, 2026

Thanks for the detailed investigation! Based on your findings, here are a few things to check that might resolve the 80s timeout:

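Adjust Request Timeout Settings:
- In the huggingface_hub library, set the timeout parameter higher than 80s when initializing the InferenceClient. The older InferenceApi class is deprecated and, as far as I can tell, never accepted a timeout argument.
- Example: InferenceClient(model=..., timeout=120), where model can be the URL of your dedicated endpoint.
- Keep in mind that a client-side timeout normally surfaces as a timeout exception, not as an HTTP 500. If the empty 500 still arrives at ~80s with the higher setting, the cutoff is most likely on the server or proxy side.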
Verify Container Resource Limits:
- Although memory usage is low, confirm whether the allocated CPU cores are sufficient for your workload. CPU throttling can slow a request enough that it runs into a timeout or a failed health check.
Check Server-side Logs:
- If possible, enable DEBUG-level logging on the endpoint side to see whether an exception is being caught silently and never reaches the standard 500 error message.
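
To narrow down which side ends the request, it can help to time a single call with a client timeout well above 80s and look at what comes back. Below is a minimal sketch using only the Python standard library. The names probe and classify, the 120s value, and the labels returned by classify are my own assumptions for illustration, not part of any Hugging Face API, and the classification is a rough heuristic rather than a diagnosis.

```python
import json
import time
import urllib.error
import urllib.request

# Assumed value: deliberately higher than the ~80 s cutoff being investigated.
CLIENT_TIMEOUT_S = 120


def classify(status, body, elapsed_s, client_timeout_s=CLIENT_TIMEOUT_S):
    """Heuristic guess at which side ended the request (hypothetical helper)."""
    if status is None:
        # No HTTP response was received at all.
        if elapsed_s >= client_timeout_s:
            return "client-side timeout"
        return "connection error"
    if status >= 500 and not body and elapsed_s < client_timeout_s:
        # An empty 5xx that arrives before the client gave up.
        return "server-side or proxy cutoff"
    return "ok" if status < 400 else "application error"


def probe(url, token, payload):
    """POST once and return (status, body, elapsed seconds)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    start = time.monotonic()
    try:
        # Note: this timeout applies to each blocking socket operation,
        # not to the total duration of the request.
        with urllib.request.urlopen(request, timeout=CLIENT_TIMEOUT_S) as response:
            status, body = response.status, response.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; keep status and body.
        status, body = err.code, err.read()
    except OSError:
        # Covers URLError and socket timeouts: no HTTP response received.
        status, body = None, b""
    return status, body, time.monotonic() - start
```

Usage would look like status, body, elapsed = probe(endpoint_url, hf_token, {"inputs": "..."}) followed by print(classify(status, body, elapsed)), with endpoint_url and hf_token being your own values. If this reports a server-side or proxy cutoff at roughly 80s, raising the client timeout alone will not help.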

Hope this helps fix the empty 500 response issue!