
Best Practices for Handling User Identity in Custom Model Serving (MCP)

Hugging Face Forums [Unofficial] March 25, 2026

Is there no clean solution at the moment…?


This is a real gap, and other teams are running into the same thing.

The short version is:

MCP now has a much clearer standard for authenticating the client to the MCP server. It does not yet have a settled, built-in standard for propagating end-user identity through tool execution and downstream APIs. The safe pattern today is to authenticate at the edge, derive a verified server-side principal, and then propagate either verified identity context or a new downstream token, rather than passing the original raw user token everywhere. (Model Context Protocol)

Why this feels messy

There are really three separate problems hiding inside one question.

First, there is transport authorization: can the host client call the MCP server at all? MCP now answers that with OAuth-style authorization for HTTP transports, including Protected Resource Metadata discovery and resource-bound tokens. (Model Context Protocol)

Second, there is execution identity: when a tool runs, which human user and tenant is it acting for? That is the part you care about for tools like get_balance and get_order_status. MCP discussions show this is still an active area, especially in multi-user setups where one agent serves many end-users and each end-user may have different external tokens or permissions. (GitHub)

Third, there is downstream delegation: if your model server or tool runtime needs to call another API, what token or credential should it use there? MCP’s current security guidance is explicit that blind token passthrough is the wrong answer. (Model Context Protocol)

That is why your experiments feel unsatisfying. Each workaround tends to solve only one of those three layers.


What MCP standardizes today

For the client → MCP server hop, the current MCP authorization spec is already fairly strong.

The client is expected to request a token for the specific MCP server resource, using the OAuth resource parameter, and the MCP server must validate that the token was specifically issued for that server. The spec also points to the security guidance explaining why audience validation matters and why token passthrough is forbidden. (Model Context Protocol)

That means this part is now standard enough:

  • the host client authenticates,
  • it obtains a token for the MCP server,
  • it presents that token to the MCP server,
  • the MCP server validates it as the intended resource. (Model Context Protocol)
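As a rough illustration of the last bullet, here is a minimal sketch of the claim checks a resource server might run. It assumes the token's signature and issuer were already verified by a JWT library, and that the token is a JWT with standard `aud` and `exp` claims. The names `validate_claims`, `TokenRejected`, and the example URL are hypothetical, not part of any MCP SDK.

```python
import time
from typing import Optional


class TokenRejected(Exception):
    """Raised when a token's claims fail the resource server's checks."""


def validate_claims(claims: dict, expected_resource: str,
                    now: Optional[float] = None) -> dict:
    """Check audience and expiry on claims whose signature is already verified."""
    now = time.time() if now is None else now

    # `aud` may be a single string or a list of strings (RFC 7519).
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_resource not in audiences:
        raise TokenRejected("token was not issued for this MCP server")

    # Reject tokens that are expired or carry no usable expiry at all.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise TokenRejected("token is expired or has no expiry")

    return claims
```

The point of the audience check is that a token minted for some other service is rejected even if its signature is valid.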

So if your question is “is there a standardized way to pass a token from the host client to the model server,” the answer is yes, for the MCP server itself. (Model Context Protocol)

But that does not mean the same token should keep flowing to every internal tool or downstream API.


What MCP does not fully standardize yet

What is still unsettled is the next layer: how a multi-user MCP deployment should represent the end-user and delegated actor inside tool execution and downstream service calls.

The clearest public evidence is in the ongoing GitHub discussions. Discussion #234 is explicitly about multi-user authorization where one agent serves many end-users and each user may have tokens for multiple external services. Discussion #483 says the lack of a standard for per-user credentials and downstream authorization leads to inconsistent client behavior and security gaps. Discussion #804 proposes a gateway-based authorization model, which is another sign the community sees this as an architecture problem not yet fully solved by the core protocol alone. (GitHub)

There is also a newer open issue in the auth extension work, ext-auth #13, which says that even when a downstream system sees sub and client_id, it may still be unclear whether the request is coming directly from the user or from an autonomous or semi-autonomous agent acting on the user’s behalf. That is an important clue: the remaining gap is not just “which token do I send,” but also “how do I represent user versus agent clearly for authorization and audit.” (GitHub)

So the honest answer to your third question is:

yes, there are active discussions and extensions, but no single, final, built-in MCP-native answer yet for full end-user propagation across tool execution and downstream APIs. (GitHub)


The safest practical pattern today

If I were designing your system now, I would use this pattern:

1. Authenticate the host client to the MCP server

The host client gets a token specifically for your MCP server and sends that to the MCP server. The MCP server validates the token as being intended for itself. This is the clean, standardized part of the flow. (Model Context Protocol)

2. Convert that token into a trusted server-side principal

After validation, stop thinking in terms of “pass this JWT everywhere.” Instead, derive a server-owned principal object such as:

{
  "sub": "user_123",
  "tenant_id": "acme",
  "client_id": "webapp-prod",
  "scopes": ["orders:read", "balance:read"],
  "trace_id": "req_789"
}

This is not spelled out as a specific MCP object in the spec, but it is the natural architecture implied by current MCP guidance: the server validates the external token, then uses that validated identity internally rather than treating the token itself as the internal API contract. That direction also matches the “treat the MCP server as an OAuth resource server” discussion in issue #205. (GitHub)
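In Python, that principal could be an immutable dataclass built once from the validated claims. This is only a sketch: `Principal` and `principal_from_claims` are hypothetical names, and `tenant_id` is assumed to be a custom claim your identity provider adds, not something MCP defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """Server-owned identity; frozen so handlers cannot mutate it."""
    sub: str
    tenant_id: str
    client_id: str
    scopes: frozenset
    trace_id: str


def principal_from_claims(claims: dict, trace_id: str) -> Principal:
    """Build the principal from claims that were already validated."""
    return Principal(
        sub=claims["sub"],
        tenant_id=claims["tenant_id"],  # assumed custom claim
        client_id=claims["client_id"],
        # OAuth `scope` is a space-delimited string (RFC 6749).
        scopes=frozenset(claims.get("scope", "").split()),
        trace_id=trace_id,
    )
```

Downstream code then depends on `Principal`, not on the token format, so you can change token issuers later without touching tool handlers.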

3. Inject identity into tools from trusted context, not from the model

For personalized tools, the safest interface is not:

get_balance(user_id)
get_order_status(user_id, order_id)

It is closer to:

get_balance()
get_order_status(order_id)

with the runtime injecting the verified principal from the request context.

Why? Because once user_id becomes model-authored tool input, identity is no longer coming from the auth layer. The model should choose the operation and the business parameters. It should not choose the caller identity. That is exactly the kind of multi-user ambiguity the current MCP discussions are trying to address. (GitHub)
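One way to wire this up in Python is a context variable that the auth layer sets per request. The sketch below is an assumption about how you might structure your own runtime, not an MCP SDK API; `current_principal`, `handle_tool_call`, and the `BALANCES` data are all hypothetical.

```python
from contextvars import ContextVar
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    sub: str
    tenant_id: str


# Set once per request by the auth layer; never populated from tool arguments.
current_principal = ContextVar("current_principal")

# Hypothetical in-memory data, keyed by (tenant_id, sub).
BALANCES = {("acme", "user_123"): 42.50}


def get_balance() -> float:
    """Tool handler: no user_id parameter, identity comes from request context."""
    principal = current_principal.get()  # raises LookupError if nothing was set
    return BALANCES[(principal.tenant_id, principal.sub)]


def handle_tool_call(principal: Principal, tool, **model_args):
    """Run a tool with the verified principal bound for the duration of the call."""
    token = current_principal.set(principal)
    try:
        # model_args holds only business parameters chosen by the model.
        return tool(**model_args)
    finally:
        current_principal.reset(token)
```

Because `get_balance` has no identity parameter at all, there is nothing for the model to get wrong or for a prompt injection to overwrite.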

4. Authorize inside the tool using subject, tenant, and scope

A secure tool should check:

  • whether the caller has the right coarse scope,
  • whether the object belongs to the same tenant,
  • whether the subject is allowed to see that particular record.

That matters because scopes alone are coarse. Fine-grained authorization usually still needs subject- and resource-level checks, which is also why proposals like #483 exist. (GitHub)
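Those three checks might look like this for get_order_status. It is a minimal sketch under assumed data shapes: the `Order` fields, the `orders:read` scope name, and the `Forbidden` exception are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    sub: str
    tenant_id: str
    scopes: frozenset


@dataclass(frozen=True)
class Order:
    order_id: str
    tenant_id: str
    owner_sub: str
    status: str


class Forbidden(Exception):
    """Raised when the principal may not access the requested record."""


def authorize_order_read(principal: Principal, order: Order) -> None:
    # 1. Coarse check: does the caller hold the required scope?
    if "orders:read" not in principal.scopes:
        raise Forbidden("missing scope orders:read")
    # 2. Tenant isolation: the record must live in the caller's tenant.
    if order.tenant_id != principal.tenant_id:
        raise Forbidden("order belongs to another tenant")
    # 3. Record-level check: the caller must own this specific order.
    if order.owner_sub != principal.sub:
        raise Forbidden("order belongs to another user")
```

In a real system the third check is usually a policy lookup and not a plain ownership comparison, but the order of the checks stays the same.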


Why raw token passthrough is the wrong default

This is the most important security point.

MCP’s security guidance defines token passthrough as the anti-pattern where an MCP server accepts a token from the client and passes it on to a downstream API without proper validation and without using the right audience or resource boundaries. The guidance calls out risks like security control circumvention and audit/accountability problems. (Model Context Protocol)

The spec reinforces that by saying MCP clients must use the resource parameter and MCP servers must validate that tokens were issued specifically for them. It explicitly points to the security guide for why token passthrough is forbidden. (Model Context Protocol)

So the correct distinction is:

  • Client → MCP server with a token meant for the MCP server: good. (Model Context Protocol)
  • MCP server → downstream API using that same token unchanged: bad default. (Model Context Protocol)

What to do for internal APIs versus external APIs

This split matters a lot.

Internal APIs you own

For tools like get_balance or get_order_status, where the downstream system is your own service, the simplest clean pattern is:

  • authenticate the user at the MCP boundary,
  • derive the principal,
  • propagate the principal internally,
  • authorize based on sub, tenant_id, roles, and scopes.

In larger systems, a gateway can mint a short-lived signed internal assertion that backend services trust. The MCP gateway authorization discussion is moving in this direction. (GitHub)
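To make "short-lived signed internal assertion" concrete, here is a stdlib-only sketch that uses an HMAC with a shared key. That is a simplification I am assuming for brevity: a production setup would more likely use a signed JWT with asymmetric keys and a maintained library. `mint_assertion` and `verify_assertion` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time


def mint_assertion(principal: dict, key: bytes, ttl_seconds: int = 60, now=None) -> str:
    """Gateway side: sign the principal plus a short expiry."""
    now = int(time.time() if now is None else now)
    payload = dict(principal, iat=now, exp=now + ttl_seconds)
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    sig = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig


def verify_assertion(assertion: str, key: bytes, now=None) -> dict:
    """Backend side: check the signature first, then the expiry."""
    now = time.time() if now is None else now
    body, _, sig = assertion.rpartition(".")
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        raise ValueError("bad signature")
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["exp"] <= now:
        raise ValueError("assertion expired")
    return payload
```

The backend trusts the assertion because of the signature, not because of where the request came from, and the short TTL limits the damage if one leaks.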

External APIs or third-party services

For third-party access, MCP’s current guidance is even stricter. The Elicitation spec says third-party credentials must not transit through the MCP client, the MCP server must not use the client’s credentials for that third-party service, and the user must authorize the MCP server directly. It also says the MCP server is responsible for storing and managing those third-party tokens. (Model Context Protocol)

That means if your tool needs, for example, Google, GitHub, or Slack access on behalf of the user, the right pattern is:

  • the user authorizes the MCP server directly,
  • the MCP server stores the resulting tokens,
  • the host client never acts as a generic token relay. (Model Context Protocol)

Where token exchange fits

Once you have a middle-tier service that needs to call another API, the standard OAuth answer is usually token exchange or an on-behalf-of flow.

RFC 8693 defines OAuth 2.0 Token Exchange and explains that the client can request a token for a specific downstream resource. It also defines the act claim to represent the current actor in a delegation chain, and allows nested act claims when multiple delegated hops are involved. (IETF Datatracker)

Microsoft’s On-Behalf-Of flow is a practical example of the same idea: the client sends token A to API A, API A requests token B for API B, and API B receives token B, not token A. (Microsoft Learn)

That model maps very well to MCP-like deployments:

  • token A is for your MCP server,
  • token B is for the downstream API,
  • the two tokens are not interchangeable. (Model Context Protocol)
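For reference, the form body of an RFC 8693 token exchange request can be sketched as below. The parameter names and URNs come from the RFC; the function name and example resource URL are hypothetical, and whether your authorization server supports token exchange (and wants `resource` or `audience`) is something to confirm in its documentation.

```python
from urllib.parse import urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_body(subject_token: str, downstream_resource: str,
                              scope: str) -> str:
    """Form-encode a request that trades token A for a token B bound to another API."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,            # token A, already validated
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "resource": downstream_resource,           # where token B will be used
        "scope": scope,
    })
```

The MCP server would POST this body to the authorization server's token endpoint, authenticating as itself, and use the returned token only for the named downstream resource.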

So if you eventually need true delegated downstream calls, token exchange is the clean standards-based direction.


Evaluating the workarounds you listed

Injecting identity data as tool input

This is the weakest pattern for identity itself.

It is fine for normal business parameters. It is not ideal for security-critical identity, because it lets the least trustworthy layer in the chain shape who the tool acts as. For personalized tools, identity should come from the validated request context, not from model-generated arguments. That is the underlying concern behind the multi-user authorization discussions. (GitHub)

Custom pass-through headers

This can be acceptable inside a trusted internal boundary, but only if those headers are set by your gateway or MCP server after validation, not by the host client as an authority signal.

In other words, custom headers are okay as an internal transport detail. They are not a clean standard for end-user identity propagation across trust boundaries. The gateway-based authorization discussion points toward signed internal assertions instead of loose pass-through metadata. (GitHub)
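If you do use headers internally, the key move is that the gateway overwrites them after validation. A minimal sketch, assuming hypothetical header names `X-Principal-Sub` and `X-Principal-Tenant` (not a standard):

```python
# Headers the gateway owns; anything the caller sent under these names is dropped.
# `authorization` is included so the original client token is not forwarded.
STRIPPED_HEADERS = ("x-principal-sub", "x-principal-tenant", "authorization")


def build_upstream_headers(inbound_headers: dict, principal: dict) -> dict:
    """Return headers for the internal hop, with identity set only by the gateway."""
    headers = {
        name: value
        for name, value in inbound_headers.items()
        if name.lower() not in STRIPPED_HEADERS
    }
    # Identity comes from the validated principal, never from inbound values.
    headers["X-Principal-Sub"] = principal["sub"]
    headers["X-Principal-Tenant"] = principal["tenant_id"]
    return headers
```

This only holds up if backends are unreachable except through the gateway. Once that assumption breaks, a signed assertion is the safer choice.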

Maintaining a session map on the server side

Useful as a correlation mechanism. Not ideal as your primary identity proof.

A session map can help you remember “this conversation is associated with this principal,” but the auth decision should still come from validated tokens or trusted assertions. MCP’s auth tutorial also explicitly warns that Mcp-Session-Id is untrusted input and should not be tied to authorization. (Model Context Protocol)
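A sketch of that division of labor: the session map only records which subject first used a session, and every request still has to arrive with a freshly verified subject. `SESSION_OWNERS` and `bind_session` are hypothetical, and a real deployment would use a shared store with expiry instead of a module-level dict.

```python
class SessionMismatch(Exception):
    """Raised when a session id is presented by a different subject."""


# Hypothetical in-memory map: session id -> subject that first used it.
SESSION_OWNERS = {}


def bind_session(session_id: str, verified_sub: str) -> None:
    """Correlate a session with a subject that was verified from the token.

    The session id never grants access by itself; it is only cross-checked
    against the subject taken from the validated token on this request.
    """
    owner = SESSION_OWNERS.setdefault(session_id, verified_sub)
    if owner != verified_sub:
        raise SessionMismatch("session id belongs to a different subject")
```

Note the direction of trust: the token vouches for the session, never the other way around.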


What I would recommend concretely

For your exact case, I would implement this:

Today

Use the current MCP authorization model for the host client → model server hop. Require the host client to present a token with your MCP server as the intended resource, and validate it strictly. (Model Context Protocol)

Inside your server

Derive a trusted principal object and attach it to request context. Do not let tools accept user_id as a model-authored argument for personalized endpoints. Use the principal inside the handler instead. (Model Context Protocol)

For internal service calls

Prefer a gateway or middleware layer that emits a short-lived internal signed assertion or equivalent trusted context rather than reusing the original client token. This is where the gateway-based authorization direction is most useful. (GitHub)

For third-party service calls

Do not pass the host client’s external tokens through the MCP layer. Have the user authorize the MCP server directly, and let the MCP server store and manage those provider tokens. (Model Context Protocol)

For advanced delegation

If you need multi-hop API access, look at RFC 8693 token exchange and OBO-style flows rather than token passthrough. (IETF Datatracker)


Are there plans for built-in identity support?

There are signs of movement, but the ecosystem is still evolving.

The official auth-related extensions now include Enterprise-Managed Authorization and OAuth Client Credentials. Enterprise-managed auth adds an enterprise IdP-centered flow using identity assertions and ID-JAG exchange. OAuth Client Credentials adds machine-to-machine authentication for MCP when there is no user present. (Model Context Protocol)

There are also active proposals and discussions around:

  • multi-user authorization and fine-grained resource control, (GitHub)
  • gateway-based authorization, (GitHub)
  • cryptographic client identity verification via clientId and clientAuth JWTs in SEP #1289, which helps identify the calling software, (GitHub)
  • and the still-open question of distinguishing human users from agents acting for them, as raised in ext-auth #13. (GitHub)

So the answer is: yes, there is active work, but no single finished built-in identity-propagation standard yet for every layer of this problem. (GitHub)


Bottom line

The cleanest current answer is:

Use standard OAuth-based auth for the host client to authenticate to the MCP server. At the MCP server boundary, validate that token and turn it into a trusted server-side principal. Use that principal for personalized tools. Do not let the model choose identity, and do not blindly pass the original token through to downstream APIs. For downstream calls, use either trusted internal assertions or a separate downstream token obtained via a delegation flow such as token exchange or OBO. (Model Context Protocol)

That is the closest thing to a clean, production-safe best practice right now.
