{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiazciik5sftsmxdehm64p5zq5ihnwy2vz5zaeq2qtiky4oabc36de",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mh64za74mft2"
  },
  "path": "/t/503-cant-rebuild-my-space-its-always-paused-or-503/174302#post_2",
  "publishedAt": "2026-03-16T06:41:43.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "Hugging Face",
    "Streamlit Document",
    "AWS Documentation",
    "GitHub",
    "Hugging Face Status"
  ],
  "textContent": "If asking Claude doesn’t resolve the issue, there’s a possibility that HF is intentionally blocking or mistakenly blocking the space (in which case you’ll have to contact HF support website@huggingface.co), but with Streamlit, there are quite a few cases where the problem stems from configuration errors in the `Dockerfile` or `README.md`.\n\n* * *\n\nMost likely, your Space is launching a Streamlit process, but Hugging Face never marks the Space as a healthy HTTP target. In practice that usually means one of these: the Space metadata points HF at the wrong port, Streamlit is bound or configured in a proxy-hostile way, startup never reaches a healthy state, or the Space record itself is wedged. HF’s Docker docs say Docker Spaces expose one external app port via `app_port`, with `7860` as the default example, while the legacy built-in Streamlit SDK is a separate path that only allows port `8501`. HF’s official Streamlit Docker template also shows a very specific healthy shape: `sdk: docker`, `app_port: 8501`, `EXPOSE 8501`, a health check on `/_stcore/health`, and `--server.address=0.0.0.0`. (Hugging Face)\n\n## What your four facts imply\n\nYour facts narrow the problem a lot:\n\n  * the container logs show Streamlit starting\n  * the browser gets `503` on every request\n  * the Space often looks paused or un-restartable\n  * you have a load-balancer trace root id\n\n\n\nThat pattern is more consistent with **routing or health failure** than with ordinary app-code exceptions. Streamlit’s deployment guide says that when Streamlit appears to be running remotely but the app does not load, the most likely cause is that the Streamlit port is not actually exposed or reachable from the outside. HF’s config reference also says a Space is flagged unhealthy if startup exceeds the allowed timeout. (Streamlit Document)\n\nYour `Root ID` looks like an AWS Application Load Balancer trace id, not an app-specific error code. 
AWS documents that `X-Amzn-Trace-Id` is added or updated by the load balancer and can be used to trace a request through the edge and target path. That makes it useful for Hugging Face support, but not directly diagnostic on its own. (AWS Documentation)\n\n## The most likely causes, ranked\n\n### 1. Space mode or port metadata mismatch\n\nThis is my top suspect.\n\nIf your repo is a **Docker Space**, HF expects the external port to match `README.md` `app_port`. HF documents that the Docker Space default is `7860`, but that value is only correct if your app, your Dockerfile, and your runtime all actually use the same port. If your repo is still a **legacy `sdk: streamlit` Space**, HF says only `8501` is allowed. A repo that mixes those two worlds can easily produce exactly your symptom: “Streamlit started” in logs, but 503 at the browser because the proxy is checking the wrong place. (Hugging Face)\n\nThis is why `7860` in the logs is not enough by itself. `7860` is valid for Docker Spaces, but invalid for legacy built-in Streamlit Spaces. The key question is not “what port did Streamlit print,” but “does HF route to that exact port for this exact Space type.” (Hugging Face)\n\n### 2. Bind-address or health-surface problem\n\nSecond suspect.\n\nStreamlit’s config reference says `server.address` controls where the server listens. If it is set to a specific address, the app is only accessible from that address. The official HF Streamlit Docker template binds to `0.0.0.0` and defines a health check at `http://localhost:8501/_stcore/health`, which shows the shape HF expects for a routable Streamlit container. If your app is listening on `127.0.0.1` or otherwise not on the container’s public interface, logs can look healthy while the proxy still fails every request. (Streamlit Document)\n\n### 3. 
Streamlit reverse-proxy incompatibility\n\nThird suspect.\n\nStreamlit has open upstream issues around reverse proxies, path prefixes, and WebSocket traffic on `/_stcore/stream`. Recent examples include failures behind Istio and other proxy layers, plus problems caused by URL rewrites and subpaths. Those issues matter because Hugging Face Spaces sit behind a proxy. (GitHub)\n\nThat said, your symptom is **503 on every request**, not “HTML loads but the app hangs on skeletons.” Streamlit’s own deployment guide separates those failure modes: “never loads” usually points to exposure or reachability problems, while “keeps loading forever” more often points to CORS or WebSocket problems. That makes pure proxy/WebSocket breakage possible, but not my first guess for your case. (Streamlit Document)\n\n### 4. Slow or incomplete startup health\n\nFourth suspect.\n\nHF documents `startup_duration_timeout`, default 30 minutes, and says the Space is flagged unhealthy if startup exceeds that time. If your app downloads data, builds an index, authenticates to a private upstream, or waits on a missing secret before it can actually serve traffic, the process may still appear in logs while HF never gets a healthy app surface. (Hugging Face)\n\n### 5. Space-level HF state problem\n\nStill plausible.\n\nThere are recent public reports of Docker Spaces that stop building properly or never trigger real builds despite valid files, and there are forum cases where the same code works after recreation or duplication under a new slug. That does not prove your Space is in that category, but it is a real failure family. (GitHub)\n\n### 6. Intentional HF restriction\n\nPossible in principle. Not the default reading.\n\nHF’s Terms allow suspension or termination, and the Content Policy describes moderation decisions that can disable content or suspend an account, with an appeal path. 
But HF’s documented rate limiting is a `429 Too Many Requests`, not a `503`, and a plain `503` by itself is weak evidence of deliberate blocking. (Hugging Face)\n\n## What I think is happening in your case\n\nMy best inference is this:\n\n**HF is reaching the Space lifecycle, but not a healthy app endpoint.** The Streamlit process exists. The HF edge still cannot validate the Space as ready. So the browser gets a generic 503 from the proxy layer instead of your app. That inference is supported by HF’s Docker routing model, Streamlit’s remote-deployment troubleshooting, and the official Streamlit Docker template’s use of a concrete health endpoint. (Hugging Face)\n\nIf I had to rank probabilities without seeing the repo, I would put them in this order: port/SDK mismatch first, bind/health visibility second, HF Space-state issue third, Streamlit proxy edge case fourth, intentional restriction last. That ordering is an inference from the docs and public cases above. (Hugging Face)\n\n## The exact checks I would do, in order\n\n### 1. Check the README YAML first\n\nFor a Docker Space, the top of `README.md` should be internally consistent with the actual app port:\n\n\n    ---\n    sdk: docker\n    app_port: 7860\n    startup_duration_timeout: 1h\n    ---\n\n\nThat is valid only if your Dockerfile and Streamlit command also use `7860`. HF documents `sdk: docker` and `app_port`, and documents `startup_duration_timeout` as the health timeout ceiling. (Hugging Face)\n\nIf the YAML says `sdk: streamlit`, then `7860` is immediately suspicious, because HF says only `8501` is allowed for built-in Streamlit Spaces. (Hugging Face)\n\n### 2. Make the runtime use the same port everywhere\n\nFor a Docker Space on `7860`, all of these should agree:\n\n  * `README.md`: `app_port: 7860`\n  * Dockerfile: `EXPOSE 7860`\n  * startup command: `streamlit run ... 
--server.port=7860 --server.address=0.0.0.0`\n  * optional health check: `curl --fail http://localhost:7860/_stcore/health`\n\n\n\nHF’s official Streamlit template demonstrates that same pattern on `8501`, which is the same principle with a different port. (Hugging Face)\n\n### 3. Strip Streamlit config to the minimum safe baseline\n\nStreamlit’s config docs are clear here:\n\n  * `server.address` controls the listen address\n  * `server.port` controls the actual server port\n  * `browser.serverPort` is **not** how you change the app port\n  * `browser.serverAddress` is the browser-facing address used for URL/CORS/XSRF purposes\n  * `baseUrlPath` is only for serving under a path prefix (Streamlit Document)\n\n\n\nFor Hugging Face, the clean baseline is:\n\n\n    [server]\n    address = \"0.0.0.0\"\n    port = 7860\n    headless = true\n\n    [browser]\n    gatherUsageStats = false\n\n\nWhat I would remove first, unless you intentionally need them:\n\n  * `browser.serverAddress`\n  * `browser.serverPort`\n  * `server.baseUrlPath`\n\n\n\nThose are common “looks fine in logs, broken in browser” settings in proxied deployments. (Streamlit Document)\n\n### 4. Add a health check if you do not already have one\n\nThe official HF Streamlit Docker template uses:\n\n\n    HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health\n\n\nThat is not proof HF requires the exact same instruction in every Space, but it is a very strong known-good reference. If you are on `7860`, mirror the same pattern on `7860`. (Hugging Face)\n\n### 5. Replace the app with a trivial hello-world once\n\nUse the smallest possible app:\n\n\n    import streamlit as st\n    st.title(\"health test\")\n    st.write(\"ok\")\n\n\nIf that still 503s after the port/config cleanup, suspicion shifts away from your application code and toward either the Space record or HF-side routing/state. 
Streamlit’s own troubleshooting guide recommends using a simple HTTP test to separate app/config problems from platform/network problems. (Streamlit Document)\n\n### 6. Duplicate the Space to a new private slug\n\nThis is a good diagnostic, not superstition. HF officially supports duplicating Spaces, and public reports show cases where the same code works under a new slug while the original remains broken. If the duplicate works, the original Space is likely wedged or flagged. (GitHub)\n\n## How to classify the failure from the browser\n\nOpen DevTools and reload once.\n\n  * If **`/` itself** returns `503`, that points more to HF routing/health not seeing a good backend target.\n  * If `/` loads but requests to `/_stcore/stream` fail, that points more to Streamlit proxy/WebSocket behavior. Public Streamlit issues around `/_stcore/stream` and path rewrites match that pattern. (GitHub)\n\n\n\nSince you said **every request** returns `503`, my read is that the problem is earlier than the WebSocket layer. That is an inference, but it is the simpler fit. (Streamlit Document)\n\n## What I would not assume\n\nI would **not** assume this is rate limiting. HF documents rate limiting as `429`, with `RateLimit` headers and five-minute windows. That does not match your symptom. (Hugging Face)\n\nI would also **not** assume a global HF outage right now. The public status page currently shows the main services, including Spaces and Spaces Proxy, as online. That does not rule out a Space-specific bug, but it weakens the “platform-wide incident” theory. (Hugging Face Status)\n\nI would also **not** assume intentional blocking from the `503` alone. HF can suspend or disable content, but the `503` by itself is not decisive evidence of that. 
(Hugging Face)\n\n## The strongest “good” baseline to compare against\n\nIf your goal is a Docker Space running Streamlit cleanly, compare your repo against this shape:\n\n\n    ---\n    sdk: docker\n    app_port: 7860\n    ---\n\n\n\n    EXPOSE 7860\n    HEALTHCHECK CMD curl --fail http://localhost:7860/_stcore/health\n    ENTRYPOINT [\"streamlit\", \"run\", \"app.py\", \"--server.port=7860\", \"--server.address=0.0.0.0\"]\n\n\n\n    [server]\n    address = \"0.0.0.0\"\n    port = 7860\n    headless = true\n\n\nThat is a direct adaptation of HF’s documented Docker model plus the official Streamlit Docker template’s known-good health and bind pattern. (Hugging Face)\n\n## If you have to escalate to Hugging Face\n\nSend one concise support note with:\n\n  * Space: `PhillyMac/deku-corpus-builder`\n  * exact UTC time of one failed request\n  * `Root ID: 1-69b734ae-6e89e03645bc568f68d02530`\n  * whether the repo is `sdk: docker` or `sdk: streamlit`\n  * whether a minimal hello-world app still 503s\n  * whether a duplicated Space works\n  * whether `/` fails directly or only Streamlit subrequests fail\n\n\n\nThe root id is useful because AWS traces can be correlated on the load-balancer side. HF’s policy pages also give you the right contacts if this turns out to be a moderation or account-state issue rather than a technical one. (AWS Documentation)\n\n## Bottom line\n\nMy best diagnosis is:\n\n**You are probably dealing with a deployment-shape mismatch, not a mysterious paused-state bug.** The most likely mismatch is between HF’s routing expectations and what your Streamlit container is actually exposing. The first place to look is the quartet of `sdk`, `app_port`, `EXPOSE`, and Streamlit `server.port/address`. If those are already perfect, then the next most likely answer is a stuck Space record, and the duplicate-space test becomes very informative. (Hugging Face)",
  "title": "503 - Can't rebuild my space - it's always paused or 503"
}