{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidu2tlccdb7xefpuqrzyw5pkcrjvsmigqa3c6litbwuo65lsscynq",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkdfm24emju2"
  },
  "path": "/t/deepseek-v4-is-live-in-preview-should-your-team-switch/175560#post_1",
  "publishedAt": "2026-04-25T14:24:58.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "DeepSeek officially launched `deepseek-v4-flash` and `deepseek-v4-pro` in preview on April 24, 2026. The legacy routes (`deepseek-chat`, `deepseek-reasoner`) are deprecated July 24, 2026.\n\nI’ve been thinking through how to actually structure a routing decision around this, and wanted to share a few observations:\n\n**The key trade-off is not “DeepSeek vs Claude/GPT” — it’s “which tier for which workload.”**\n\nFlash at $0.14/$0.28 per 1M tokens is a serious candidate for coding agents, repo analysis, and long-context summarization. Pro at $1.74/$3.48 sits between Flash and the premium tier.\n\nThe question I’d be asking: what percentage of your current workloads could Flash handle without quality regression? For many teams doing code gen and repo reading, the answer might be 60-80%. That’s a meaningful cost change.\n\nThe caveat: V4 is preview. Reuters used that word explicitly. Run your own eval set before committing to routing changes, and keep rollback paths to your existing premium routes.\n\n-–\n\nHas anyone already run Flash or Pro against production workloads? Curious what failure modes you’ve seen, if any. The tool-call reliability question is the one I’m most uncertain about.\n\nFull routing analysis with cost examples and migration checklist linked below.",
  "title": "DeepSeek V4 is live in preview — should your team switch?"
}