Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidlrngqdwr4lz3wqpzhrbdpwewkkwy5bjbjr34plz5iq4xc2crlq4",
    "uri": "at://did:plc:vj6bnc74m2zgkyu7wxw2d5vv/app.bsky.feed.post/3mnewaxb5yte2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreihoarjye3q6jx7libwkqap34jgpm34qkukvhzqrzb6i77mbad36he"
    },
    "mimeType": "image/png",
    "size": 303446
  },
  "description": "While rivals race to build ever more specialized AI hardware, Paris-based Kog claims it can achieve dedicated-silicon speeds on standard GPUs. If it's right, the future of real-time AI may depend less on new chips than on better software.",
  "path": "/the-ai-industry-spent-billions-chasing-faster-chips-inference-startup-kog-says-they-were-solving-the-wrong-problem/",
  "publishedAt": "2026-06-03T10:40:17.000Z",
  "site": "https://www.frenchtechjournal.com",
  "tags": [
    "Kog"
  ],
  "textContent": "There's a story the AI industry has been telling itself lately about what must happen to compete in the agentic era. It goes something like this: if you want truly real-time inference that is fast enough to make an AI agent feel less like a vending machine and more like a colleague, then you need to buy new hardware.\n\nThis means specialized silicon in the form of the exotic chips that Cerebras, Groq (which was kinda-sorta bought by Nvidia), and SambaNova have spent years and billions building. The thesis got a very loud endorsement last month, when Cerebras went public in a blockbuster IPO that valued it at $56 billion and confirmed that fast inference is now its own infrastructure category.\n\nA small team in Paris would beg to differ.\n\nKog, an 11-person AI infrastructure startup founded in 2023, has just opened a public tech preview of its inference engine, making a claim that runs counter to prevailing wisdom. On a single node of eight AMD MI300X GPUs (the kind already humming away in enterprise datacenters), Kog says it generates more than 3,000 output tokens per second for a single user request, putting it in the same speed bracket as the dedicated-silicon crowd, but on standard kit.\n\nThe pitch is potentially enticing as the AI frenzy gives way to anxiety over the reality of soaring operating costs: you may not need to migrate to a new hardware ecosystem to get dedicated-silicon speeds. You might just need someone to use the GPUs you already own a lot more cleverly.\n\n## \"It's not only the hardware.\"\n\n\"A growing part of the AI industry assumes that truly real-time AI [would] require entirely new hardware architectures,\" said Nicolas Constant, Kog's Sales & Talent Lead, in an interview ahead of the launch. He noted that the recent Cerebras IPO \"reinforced it even further.\"",
  "title": "The AI Industry Spent Billions Chasing Faster Chips. Inference Startup Kog Says They Were Solving the Wrong Problem.",
  "updatedAt": "2026-06-03T12:42:30.105Z"
}