Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreih7apwlzfor4e3e6rv5eaqayywywrvztzhxdr6i377ctittqlz2kq",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mpne2bubdou2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreifm2wjnfyby6xpiyqxh6bmionvjxmhqxuqiygvxb2ujnybiucyxhi"
    },
    "mimeType": "image/webp",
    "size": 166320
  },
  "path": "/tokenmixai/i-did-the-math-on-claude-sonnet-5-the-60-opus-discount-is-real-but-temporary-31pf",
  "publishedAt": "2026-07-02T05:55:07.000Z",
  "site": "https://dev.to",
  "tags": [
    "ai",
    "anthropic",
    "claude",
    "programming",
    "Anthropic's launch post",
    "TokenMix",
    "original article"
  ],
  "textContent": "Anthropic shipped Claude Sonnet 5, and the takes I saw were predictable:\n\n\"It replaces Opus.\"\n\n\"It is just another Sonnet refresh.\"\n\n\"The benchmark chart means you can route everything to it now.\"\n\nTwo of those are wrong. One is directionally right, but only if you care about cost per task instead of model prestige.\n\nI spent time going through Anthropic's launch post, the Claude Platform docs, GitHub's Copilot rollout note, and the pricing math. The conclusion I landed on is simple: **Sonnet 5 should be the default Claude model for most coding agents, but it should not be your highest-stakes escalation model.**\n\n##  TL;DR\n\n  * **No, Sonnet 5 does not universally replace Opus 4.8.** Anthropic says it can match Opus on some higher-effort tasks, not all tasks.\n  * **Yes, the discount is real.** Intro pricing is $2 input / $10 output per million tokens through August 31. Opus 4.8 is $5/$25.\n  * **The real number is 60%.** During the intro period, Sonnet 5 costs 40% of Opus 4.8, meaning a 60% discount on both input and output.\n  * **After August 31, the math changes but still works.** Sonnet 5 moves to $3/$15, still 40% cheaper than Opus 4.8.\n  * **My routing rule:** use Sonnet 5 for the first pass, Opus 4.8 for escalation, and Fable 5 only when the task justifies frontier-tier cost.\n\n\n\n##  What actually shipped\n\nAnthropic launched Claude Sonnet 5 on June 30, 2026.\n\nThe important part is not just the model. It is the availability.\n\nSonnet 5 is available across Claude Free, Pro, Max, Team, Enterprise, Claude Code, Claude Cowork, and the Claude Platform API, according to Anthropic's launch post. 
GitHub also made Sonnet 5 generally available in Copilot on June 30, which means this model landed directly inside developer workflows, not just API dashboards.\n\nThat matters because the frontier tier is noisy right now:\n\nModel / product | Current reality\n---|---\nClaude Fable 5 | Back online, but expensive and policy-sensitive\nClaude Mythos 5 | Narrower access\nGPT-5.6 | Gated preview, not broadly available\nGemini 3.5 Pro | Reported July target, not public API yet\nClaude Sonnet 5 | Broadly available now\n\nThis is why I care about Sonnet 5 more than the louder frontier-model drama.\n\nIt is the model developers can actually use this week.\n\n##  The pricing table that changed my mind\n\nThe pricing is the story.\n\nModel | Input / 1M | Output / 1M | What it means\n---|---|---|---\nClaude Sonnet 5 intro | $2.00 | $10.00 | Through August 31, 2026\nClaude Sonnet 5 standard | $3.00 | $15.00 | After August 31\nClaude Sonnet 4.6 | $3.00 | $15.00 | Same as post-intro Sonnet 5\nClaude Opus 4.8 | $5.00 | $25.00 | Higher-end stable route\nClaude Fable 5 | $10.00 | $50.00 | Frontier-priced route\n\nDuring the intro window, Sonnet 5 is not a small discount.\n\nIt is 60% cheaper than Opus 4.8.\n\nAfter August 31, it is still 40% cheaper.\n\nThat is enough to change your default route even if you keep Opus for final review.\n\n##  The $300/month example\n\nTake a modest agent workload:\n\n  * 50M input tokens per month\n  * 10M output tokens per month\n\n\n\nThe bill:\n\n\n\n    Sonnet 5 intro = 50 * $2 + 10 * $10 = $200\n    Sonnet 5 standard = 50 * $3 + 10 * $15 = $300\n    Opus 4.8 = 50 * $5 + 10 * $25 = $500\n\n\nThat means:\n\nRoute | Monthly cost | Savings vs Opus\n---|---|---\nSonnet 5 intro | $200 | $300\nSonnet 5 standard | $300 | $200\nOpus 4.8 | $500 | $0\n\nIf your team is running agents against repos every day, this is not theoretical.\n\nIt is the difference between routing every routine fix to Opus because \"it is safer\" and using Opus only when the 
first pass needs escalation.\n\n##  The output-token trap\n\nMost agent costs hide in output.\n\nA coding agent does not just answer one question. It plans, edits, explains, retries, opens diffs, writes tests, and summarizes.\n\nSuppose each run emits 12K output tokens and you run 5,000 agent tasks per month.\n\nThat is:\n\n\n\n    12,000 output tokens * 5,000 runs = 60,000,000 output tokens\n\n\nOutput-only cost:\n\n\n\n    Sonnet 5 intro = 60 * $10 = $600\n    Opus 4.8 = 60 * $25 = $1,500\n\n\nThat is a $900/month difference before counting input tokens.\n\nI would rather spend that $900 on extra evals, better logging, or escalation for the tasks that actually need Opus.\n\n##  The benchmark caveat people will skip\n\nAnthropic says Sonnet 5 improves over Sonnet 4.6 and can match Opus 4.8 at higher effort on some agentic tasks.\n\nThat sentence has two important words: **some tasks**.\n\nAnthropic also edited one launch chart after a methodology issue around BrowseComp. I do not read that as a scandal. 
I read it as a warning: do not build your routing policy from one vendor chart.\n\nMy benchmark policy for Sonnet 5 would be:\n\nTest set | Size | Pass condition\n---|---|---\nBug fixes | 50 tasks | Same or better accepted patch rate\nRepo Q&A | 50 tasks | Same or better factual accuracy\nCode review | 50 tasks | Same or better defect catch rate\nRefactors | 25 tasks | No higher regression rate\nLong-context tasks | 25 tasks | No worse truncation or drift\n\nI do not need Sonnet 5 to beat Opus on every task.\n\nI need it to be good enough for the first pass and cheap enough to run more often.\n\nThat is a very different requirement.\n\n##  The \"should I migrate?\" decision tree\n\nHere is the router I would start with.\n\n\n\n    def pick_claude_model(task, fable_access_approved=False):\n        # Routine first-pass work goes to the cheaper default.\n        if task in [\n            \"repo_search\",\n            \"unit_test_fix\",\n            \"routine_refactor\",\n            \"doc_summary\",\n            \"first_pass_pr_review\",\n        ]:\n            return \"claude-sonnet-5\"\n\n        # High-stakes work escalates to Opus.\n        if task in [\n            \"security_review\",\n            \"legal_reasoning\",\n            \"architecture_decision\",\n            \"final_pr_review\",\n        ]:\n            return \"claude-opus-4.8\"\n\n        # Frontier route only when access has been approved.\n        if task == \"frontier_research\" and fable_access_approved:\n            return \"claude-fable-5\"\n\n        # Anything unrecognized falls back to the cheap capable model.\n        return \"claude-sonnet-5\"\n\n\nThat default is opinionated on purpose.\n\nI do not want a router that starts expensive and occasionally tries cheaper models.\n\nI want a router that starts with the cheap capable model, then escalates only when the task earns it.\n\n##  Where I would not use Sonnet 5\n\nSonnet 5 is not the answer to everything.\n\nWorkload | I would use instead | Why\n---|---|---\nCheap summarization | Haiku or smaller route | Sonnet is overkill\nMassive batch extraction | Batch + cheaper model | Price still compounds\nFinal high-stakes review | Opus 4.8 | Better escalation baseline\nApproved frontier cyber work | 
Fable/Mythos route | Different capability tier\nOpen-weight local coding | GLM or Kimi route | Cost/control may win\nUnverified benchmark chasing | Wait | Vendor charts are not enough\n\nThis is the trap with every new model release.\n\nPeople ask, \"Is it better?\"\n\nThe production question is, \"Where is it good enough to become cheaper by default?\"\n\nFor Sonnet 5, that answer is most routine agent work.\n\n##  What I'd do if I were running a dev team this week\n\nIf I owned the model routing layer, I would do five things.\n\n  1. Move routine Claude agent traffic from Sonnet 4.6 to Sonnet 5.\n  2. Move first-pass Opus traffic to Sonnet 5 where evals pass.\n  3. Keep Opus 4.8 as the escalation route for final review and high-stakes reasoning.\n  4. Track accepted patch rate, retry rate, output tokens, and human review minutes.\n  5. Re-run the cost model before August 31, because the intro price expires.\n\n\n\nThat last one matters.\n\nThe intro price makes migration look extremely obvious. The standard price still looks good, but the savings shrink.\n\nDate | Input / 1M | Output / 1M | Routing implication\n---|---|---|---\nNow through Aug. 31 | $2 | $10 | Aggressively test migration\nAfter Aug. 31 | $3 | $15 | Still default, but re-check margins\n\nDo not let a temporary discount become an unmeasured permanent assumption.\n\n##  The bigger picture\n\nSonnet 5 is part of a pattern I think more teams should notice.\n\nThe most important model in production is often not the strongest model. It is the model with the best mix of availability, cost, latency, and enough intelligence for the common path.\n\nThat is why Sonnet 5 matters.\n\nFable 5 is more dramatic. GPT-5.6 is more mysterious. 
Gemini 3.5 Pro will probably get the launch-week attention when it lands.\n\nBut Sonnet 5 is the boring model that can lower a lot of real bills.\n\nAnd boring models that lower bills tend to win production traffic.\n\n##  Disclosure\n\nIf you want to swap between Claude, OpenAI, Gemini, DeepSeek, Qwen, GLM and other models through one OpenAI-compatible endpoint, that is roughly what TokenMix does. Disclosure: I work on the research side. The full cited breakdown is in the original article.\n\n##  Bottom line\n\nClaude Sonnet 5 should be your default Claude agent route, not your prestige model and not your only model.\n\nUse it for first-pass coding, refactors, first-pass PR review, repo Q&A, and routine tool use. Keep Opus 4.8 for escalation. Keep Fable 5 for the narrow slice that justifies frontier-tier cost.\n\nThe model release is good. The routing discipline is what saves the money.\n\nWould you route routine coding agents to Sonnet 5 by default, or keep paying for Opus until independent evals catch up?",
  "title": "I Did the Math on Claude Sonnet 5. The 60% Opus Discount Is Real, But Temporary."
}
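
Worked check of the pricing math in textContent

The record's textContent carries several cost figures (the $200 / $300 / $500 monthly bills, the $600 vs $1,500 output-only comparison, and the 60% and 40% discounts). A minimal sketch that recomputes them is below. The prices are copied from the article's own pricing table; the route labels and the function names `monthly_cost` and `discount_percent` are hypothetical, chosen only for this illustration, and are not API model identifiers.

```python
# Prices in USD per 1M tokens as (input, output), taken from the article's table.
# The dictionary keys are illustrative labels, not real API model names.
PRICES = {
    "sonnet-5-intro": (2.00, 10.00),
    "sonnet-5-standard": (3.00, 15.00),
    "opus-4.8": (5.00, 25.00),
}


def monthly_cost(route, input_millions, output_millions):
    """Return the monthly bill in USD for token volumes given in millions."""
    input_price, output_price = PRICES[route]
    return input_millions * input_price + output_millions * output_price


def discount_percent(route, baseline="opus-4.8"):
    """Return the whole-number percent saved on input price versus the baseline.

    In this table the output prices scale by the same ratio, so the
    input-price discount also applies to output.
    """
    return round(100 * (1 - PRICES[route][0] / PRICES[baseline][0]))


if __name__ == "__main__":
    # The article's workload: 50M input tokens and 10M output tokens per month.
    for route in PRICES:
        print(route, monthly_cost(route, 50, 10))

    # The output-token example: 12K output tokens per run, 5,000 runs per month.
    output_millions = 12_000 * 5_000 / 1_000_000
    print("output-only, intro:", monthly_cost("sonnet-5-intro", 0, output_millions))
    print("output-only, opus:", monthly_cost("opus-4.8", 0, output_millions))
```

Re-running this with the standard price after August 31 is one way to act on the article's advice to re-check margins before the intro period ends.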