Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibcqi2l5g6q2sopjp46ct475kvy65ab3fjh2ajtjjbbwzohepgiqm",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3moyhantaubm2"
  },
  "path": "/t/llama-3-1-70b-api-access/177106#post_2",
  "publishedAt": "2026-06-23T22:22:51.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "https://huggingface.co/models?inference_provider=all",
    "meta-llama/Llama-3.1-70B",
    "meta-llama/Llama-3.1-70B-Instruct",
    "models with Inference Providers filter",
    "Hub API for Inference Providers",
    "Featherless AI on Hugging Face Inference Providers"
  ],
  "textContent": "There’s been a lot of confusion around how Inference Providers are supposed to be used:\n\n* * *\n\nI don’t think “You have been granted access to this model” necessarily contradicts “Model not supported by provider featherless-ai”.\n\nThe short version is:\n\nCheck | What it means\n---|---\n“You have been granted access” on the model page | You have access to the gated model repo / weights / model page resources.\nThe browser widget works | Some provider/path available from the widget could run something for that page. It does not necessarily prove your third-party app is using the same provider, model id, task, token scope, or endpoint.\n`Model not supported by provider featherless-ai` | The selected provider, here `featherless-ai`, may not currently expose the exact model id and task that your API call is asking for.\n\nSo I would first check the **exact model id + provider + task** combination before debugging the token or curl syntax too much.\n\nThe quickest first check is the model search page with the Inference Providers filter:\n\nhttps://huggingface.co/models?inference_provider=all\n\nThen search for the exact model id and, if needed, narrow the provider filter to Featherless. If the exact model/provider combination is not listed there, changing the curl call probably will not make that provider serve the model.\n\nAlso, one subtle point: `meta-llama/Llama-3.1-70B` and `meta-llama/Llama-3.1-70B-Instruct` are not interchangeable.\n\n  * meta-llama/Llama-3.1-70B is the base/pretrained model.\n  * meta-llama/Llama-3.1-70B-Instruct is the instruction-tuned/chat-oriented model.\n\n\n\nIf your third-party app is making chat-completion-style calls, I would first verify whether the **Instruct** variant is available through the provider you are trying to use, rather than assuming that access to the base repo means the provider can serve it through chat completions.\n\nA practical order of checks would be:\n\n  1. 
Confirm the exact model id:\n\n     * `meta-llama/Llama-3.1-70B`\n     * or `meta-llama/Llama-3.1-70B-Instruct`\n     * or some provider-side alias such as `meta-llama/Meta-Llama-3.1-70B-Instruct`\n  2. Check whether that exact model is currently exposed through Inference Providers:\n\n     * UI: models with Inference Providers filter\n     * API: Hub API for Inference Providers\n  3. If you are explicitly forcing Featherless, try not forcing it:\n\n     * use `provider=\"auto\"` in `huggingface_hub`, or\n     * remove the `:featherless-ai` suffix if you are using the OpenAI-compatible router model name.\n  4. If it works with `auto` but fails with `featherless-ai`, that suggests a provider-specific availability/mapping issue, not a general Llama access issue.\n\n  5. Check the local client version if you are using Python:\n\n         python -c \"import huggingface_hub; print(huggingface_hub.__version__)\"\n\n\nFeatherless’ HF integration post says to use `huggingface_hub` v0.33.0 or newer:\n\nFeatherless AI on Hugging Face Inference Providers\n\n  6. 
If you still get the error, the useful info to post back would be:\n\n     * exact model id\n     * exact endpoint URL\n     * whether you are using `provider=\"featherless-ai\"`, `:featherless-ai`, or `auto`\n     * full error message\n     * `huggingface_hub` version, if applicable\n     * whether the app is using chat completion, text generation, or an older Inference API endpoint\n\nTo summarize my guess: this is probably not “you do not have access to Llama” in the simple gated-repo sense. It is more likely one of these:\n\n  1. the exact base model id is not exposed by the selected provider,\n  2. the third-party app is forcing `featherless-ai`,\n  3. the app is calling the wrong task,\n  4. the local SDK is old,\n  5. or the provider mapping/catalog and the runtime availability are temporarily out of sync.\n\n\n\nThe first thing I would rule out is the non-fixable case: **is the exact model id currently available through the provider you are forcing?**",
  "title": "Llama 3.1 70B API access?"
}