{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiewapam5gsa6obsoqxzbdkxm27uoirqryxpkhxs32bfroav4g7akm",
    "uri": "at://did:plc:dz7fbvkxedbwlm4sroohfpee/app.bsky.feed.post/3mkbiz74y5pm2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreihwt7zgao7c2qvoyqtegc6cndyinuu7swiuqnhr4fxzcaodjaycxu"
    },
    "mimeType": "image/jpeg",
    "size": 21516
  },
  "description": "DeepSeek unveils the V4 series with a million-token context, new Sparse Attention, and open weights, aiming for open-source SOTA performance.",
  "path": "/deepseek-released-3-new-open-source-v4-models/",
  "publishedAt": "2026-04-24T21:49:38.000Z",
  "site": "https://www.testingcatalog.com",
  "tags": [
    "Hugging Face",
    "@deepseek_ai"
  ],
  "textContent": "DeepSeek has released the preview version of its long-anticipated V4 series, pushing its open-source lineup into million-token territory with two Mixture-of-Experts variants. The Hangzhou-based lab announced the drop on April 24, confirming months of speculation after earlier target windows in February and March slipped. V4-Pro ships with 1.6 trillion total parameters and 49 billion active per token, while V4-Flash runs on 284 billion total and 13 billion active, both defaulting to a 1M context window as standard rather than an optional tier.\n\nThe structural headline is a new attention scheme pairing token-level compression with DeepSeek Sparse Attention, which the team credits for cutting long-context compute and memory costs sharply enough to make million-token inputs the baseline across DeepSeek services. Both variants expose Thinking and Non-Thinking modes through the API, with a reasoning_effort parameter letting developers dial how hard the model deliberates per task.\n\n> 🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.\n>\n> 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.\n> 🔹 DeepSeek-V4-Flash: 284B total / 13B active params.… pic.twitter.com/n1AgwMIymu\n>\n> — DeepSeek (@deepseek_ai) April 24, 2026\n\nBenchmarks released alongside the models put V4-Pro neck-and-neck with Claude Opus 4.6, GPT-5.4 xHigh, and Gemini 3.1 Pro across knowledge, reasoning, and agentic tasks. It posts a Codeforces rating of 3206, claims open-source state-of-the-art on agentic coding, and trails only Gemini on world knowledge among frontier models. 
Flash sits close behind Pro on reasoning and matches it on simpler agent workflows at a fraction of the inference cost.\n\nDeepSeek has tuned V4 to slot into established agent stacks including Claude Code, OpenClaw, and OpenCode. The API accepts both OpenAI Chat Completions and Anthropic-format calls, so teams only need to swap the model name. The older deepseek-chat and deepseek-reasoner routes now alias to V4-Flash and retire on July 24.\n\nWeights are live on Hugging Face under an open license, continuing the playbook that turned V3 and R1 into reference points for the open-source camp. With domestic pressure from Alibaba, Xiaomi's MiMo, and Moonshot's Kimi tightening, V4 positions DeepSeek as the lab defining the ceiling for open-weight frontier models rather than chasing closed-source incumbents, and the Huawei Ascend optimization path underscores the parallel push toward a China-native compute stack.",
  "title": "DeepSeek released 3 new open-source V4 models",
  "updatedAt": "2026-04-24T21:49:38.843Z"
}