{
"$type": "site.standard.document",
"content": "---\ntitle: \"The road to COMP4020: token management by proxy\"\ndescription: \"Designing a proxy to sit between students and the Claude API: per-student\n quotas, full-traffic logging, and a safety net for leaked keys.\"\ntags: [comp4020]\n---\n\n:::tip\n\nThis post is part of a series I'm writing as I develop\n[COMP4020: Agentic Coding Studio](/blog/2025/12/19/comp4020-rapid-prototyping-for-the-web/).\nSee [all posts in the series](/blog/tag/comp4020/). This one is a direct\nfollow-up to\n[managing the strategic token reserve](/blog/2026/03/27/managing-the-strategic-token-reserve/)---now\nthat\n[Anthropic have come to the party](/blog/2026/04/02/anthropic-comes-to-the-party/),\nwe can design the quota-enforcement tooling against the actual controls their\nAPI provides.\n\n:::\n\nSo: we have a $500k pool of Claude credits sitting behind a workspace-level API\nkey, and we have ~200 students who need to share that pool without eating each\nother's lunch. The Anthropic\n[Admin API](https://platform.claude.com/docs/en/build-with-claude/administration-api)\nwill get us part of the way---you can create and revoke keys programmatically,\nset monthly workspace spend caps in the Console, pull usage reports---but the\ncontrols are too coarse for what we actually want to do: per-student weekly\ntoken allocations with predictable resets and optional carryover, an audit log\nwith\nenough detail to give the course policy around token use some actual teeth, and\na safety net that catches leaked API keys in plaintext before they escape onto\nthe open internet.\n\nThe only way to get all of that is to put a proxy between the students' Claude\nCode sessions and the Anthropic API, and enforce the class-specific policy\nthere. Happily, the School of Computing's infrastructure team has signed on to build\nit and host it on their own infrastructure---this isn't a solo project, and the\nscope gets a lot more realistic with their help. The thing has four jobs.\n\n**Authentication and transparent passthrough.** One real Anthropic API key (tied\nto our COMP4020 workspace) on the egress side. On the ingress side, each student\nhas their own virtual API key---issued by us, revocable by us, completely\nseparate from Anthropic's auth system. Students point their Claude Code config\nat the proxy with their virtual key; from their side, it looks like the\nAnthropic API. Claude Code talks to `/v1/messages` and gets native\nAnthropic-shaped responses back; streaming, tool use, and prompt caching all\nwork unmodified.\n\n**Per-student quotas with time-based reset.** The proxy counts tokens against\neach student's allocation, stops serving requests when they hit their limit, and\nresets on whatever cadence we settle on---probably weekly, probably with some\ncarryover, but that's still\n[an open question](/blog/2026/03/27/managing-the-strategic-token-reserve/).\n\n**Full-traffic logging to a local database.** Every request, every response,\ntied to the student identity that the virtual key resolves to. This is the\nfoundation for the audit trail and (with consent) the research corpus, which\nI'll come back to.\n\n**Leaked-credential detection.** If a student pastes their virtual key into a\nprompt---or any secret matching a known pattern like `sk-ant-api...`---the proxy\ndetects it, auto-suspends the offending key, and alerts us. Accidents happen,\nespecially when students are new to agentic workflows. Better to catch them\nbefore the key ends up in a public GitLab repo.\n\nOf those four jobs, the logging is the piece that takes the most thought,\nbecause it cuts in a few directions at once. The simple answer is that it's the\nenforcement mechanism. Course policies---use only for coursework, no on-selling,\nno harassment, no circumventing academic integrity---are only as real as our\nability to check that they're being honoured. Students will be told this\nexplicitly, at the start of the course: traffic through the class proxy is\nlogged. If they want to use Claude Code outside class for personal projects,\nnothing stops them; that's their business, on their own Anthropic key, not ours.\n\nIt also dovetails with the\n[assessment design](/blog/2026/04/15/comp4020-assessment/) I wrote about last\nweek. Students are already handing in their Claude Code JSONL session logs as\npart of each assignment---those logs live on the student's machine and capture\nthe full local harness state (their `CLAUDE.md`, subagent dispatches, slash\ncommand expansions, and so on). The proxy-side logs are a server-side\ncounterpart. They don't replace the JSONL logs, but they do make certain claims\ncheckable that the client-side logs alone don't. A student can't quietly delete\na JSONL and tell me a different story about what happened; the proxy saw the\ntraffic.\n\nWith consent and anonymisation, aggregated proxy logs also become a research\ncorpus. What does the token-usage curve look like across a 200-student cohort\nworking on the same weekly provocation? When do students hit context limits?\nWhat does session activity look like in the hours right before the\n[aha moment](/blog/2026/04/16/comp4020-pledges-not-questions/)?\n\nThat leaves one practical question: how much of this do we actually have to\nbuild from scratch? [LiteLLM](https://docs.litellm.ai/) is the obvious candidate---the\n[Claude Code docs themselves point at it](https://code.claude.com/docs/en/llm-gateway)\nas a supported LLM gateway. It's MIT-licensed, self-hostable, and Python.\nVirtual keys, spend tracking, and Postgres-backed logging are all first-class.\nMore importantly, there's an Anthropic-native passthrough at\n`/anthropic/v1/messages` that lets Claude Code talk the native protocol rather\nthan being coerced through an OpenAI-compatible translation layer. That last bit\nmatters more than it sounds---a proxy that makes Claude Code work _almost_ like\nthe real thing is worse than no proxy at all.\n\nFor roughly 80% of the brief, it's a drop-in: virtual key CRUD via\n`/key/generate`, `/key/block`, `/key/delete`; the Anthropic-native passthrough;\ndollar-denominated budgets with `budget_duration` set to \"7d\" or \"1mo\" that\nreset automatically; and a `LiteLLM_SpendLogs` Postgres table capturing\nper-request metadata.\n\nThe remaining 20% is where it gets interesting, because it's the class-specific\npolicy stuff that probably _should_ be ours:\n\n- **Carryover.** LiteLLM resets budgets cleanly at the end of each period with\n no rollover. That's a small amount of bookkeeping in a custom hook.\n- **Full prompt and response bodies in the logs.** The default\n `LiteLLM_SpendLogs` table records metadata and token counts, not bodies.\n Getting full transcripts means wiring up a `CustomLogger` callback that writes\n to our own Postgres---which we probably want anyway, so not really a cost.\n- **Secret detection with auto-block.** This is the real gotcha. LiteLLM's\n `hide-secrets` guardrail is Enterprise-only, only fires on their unified\n `/v1/messages` path (not the `/anthropic/*` passthrough we need for Claude\n Code), and even there it's non-streaming-only. Since Claude Code streams,\n effectively _none_ of LiteLLM's built-in secret scanning applies. A custom\n pre-call hook that inspects the prompt, calls `/key/block` on match, and fires\n an alert---that's a couple of hundred lines of Python. Absolutely doable, but\n unavoidable.\n\nBottom line: LiteLLM for the plumbing, custom hooks for the policy. The\nalternative---hand-rolling the whole thing in, say, Elixir or Go---would mean\nreimplementing the virtual-key lifecycle, the Anthropic passthrough, the spend\naccounting, and the admin endpoints. That's not work anyone wants to sign up\nfor when a reasonable baseline already exists. The plan is to start with\nLiteLLM and write the class-specific bits on top, falling back to hand-rolling\nonly if we hit a wall we can't climb with a custom callback.\n\nA few things I'm still thinking through:\n\n- **Auth UX.** The default story is virtual keys---issue one per student, paste\n it into Claude Code config, done. But it'd be nicer if students could\n authenticate through ANU's SSO with their standard uni creds, so that key\n issuance and revocation piggyback on an identity system we already trust.\n LiteLLM supports SSO for its admin UI, but the per-user API auth is still\n token-based, and Claude Code expects an API key anyway. So some kind of token\n still has to land on the student's laptop. The question is what the cleanest\n hand-off looks like for both students and convenors: log in via SSO, get a\n virtual key back, paste it in once and forget? Something more automated?\n- **How do students see their own quota?** Max-plan users of Claude Code get a\n native \"x% of quota remaining, resets at HH:MM\" display; I'd love to piggyback\n on that if there's any way to inject the right headers off our proxy to make\n that UI light up. Failing that, an endpoint students can hit from the CLI, or\n a dashboard in a browser, will do. Either way, I want them watching their own\n token-burn rate as they work---if I'm going to teach them about agentic\n coding, the feedback loop has to be tight.\n- **What's the recovery path when a student runs out of tokens at 11pm the night\n before a deadline?** A manual override via our admin tooling is fine for the\n rare case, but I don't want to normalise \"I ran out, bail me out\"---the whole\n point of a quota is to teach that tokens are a finite resource.\n- **When does the proxy itself become teaching material?** It's a small,\n well-scoped piece of infra that students could plausibly build something\n analogous to in a few weeks---per-user quotas, DB-backed logs, a webhook-style\n callback. I'm tempted to use it as a demo artefact in lectures.\n\nNo doubt more questions will surface once the team sits down to actually build\nthe thing.\n",
"createdAt": "2026-05-13T23:14:36.035Z",
"description": "Designing a proxy to sit between students and the Claude API: per-student quotas, full-traffic logging, and a safety net for leaked keys.",
"path": "/blog/2026/04/22/comp4020-token-management-by-proxy",
"publishedAt": "2026-04-22T00:00:00.000Z",
"site": "at://did:plc:tevykrhi4kibtsipzci76d76/site.standard.publication/self",
"tags": [
"comp4020"
],
"textContent": "Designing a proxy to sit between students and the Claude API: per-student quotas, full-traffic logging, and a safety net for leaked keys.",
"title": "The road to COMP4020: token management by proxy"
}