{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreidlrngqdwr4lz3wqpzhrbdpwewkkwy5bjbjr34plz5iq4xc2crlq4",
"uri": "at://did:plc:vj6bnc74m2zgkyu7wxw2d5vv/app.bsky.feed.post/3mnewaxb5yte2"
},
"coverImage": {
"$type": "blob",
"ref": {
"$link": "bafkreihoarjye3q6jx7libwkqap34jgpm34qkukvhzqrzb6i77mbad36he"
},
"mimeType": "image/png",
"size": 303446
},
"description": "While rivals race to build ever more specialized AI hardware, Paris-based Kog claims it can achieve dedicated-silicon speeds on standard GPUs. If it's right, the future of real-time AI may depend less on new chips than on better software.",
"path": "/the-ai-industry-spent-billions-chasing-faster-chips-inference-startup-kog-says-they-were-solving-the-wrong-problem/",
"publishedAt": "2026-06-03T10:40:17.000Z",
"site": "https://www.frenchtechjournal.com",
"tags": [
"Kog"
],
"textContent": "There's a story the AI industry has been telling itself lately about what must happen to compete in the agentic era. It goes something like this: if you want truly real-time inference that is fast enough to make an AI agent feel less like a vending machine and more like a colleague, then you need to buy new hardware.\n\nThis means specialized silicon in the form of the exotic chips that Cerebras, Groq (which was kinda-sorta bought by Nvidia), and SambaNova have spent years and billions building. The thesis got a very loud endorsement last month, when Cerebras went public in a blockbuster IPO that valued it at $56 billion and confirmed that fast inference is now its own infrastructure category.\n\nA small team in Paris would beg to differ.\n\nKog, an 11-person AI infrastructure startup founded in 2023, has just opened a public tech preview of its inference engine, making a claim that runs counter to prevailing wisdom. On a single node of eight AMD MI300X GPUs (the kind already humming away in enterprise datacenters), Kog says it generates more than 3,000 output tokens per second for a single user request, putting it in the same speed bracket as the dedicated-silicon crowd, but on standard kit.\n\nThe pitch is potentially enticing as the AI frenzy gives way to anxiety over the reality of soaring operating costs: you may not need to migrate to a new hardware ecosystem to get dedicated-silicon speeds. You might just need someone to use the GPUs you already own a lot more cleverly.\n\n## \"It's not only the hardware.\"\n\n\"A growing part of the AI industry assumes that truly real-time AI [would] require entirely new hardware architectures,\" said Nicolas Constant, Kog's Sales & Talent Lead, in an interview ahead of the launch. He noted that the recent Cerebras IPO \"reinforced it even further.\"",
"title": "The AI Industry Spent Billions Chasing Faster Chips. Inference Startup Kog Says They Were Solving the Wrong Problem.",
"updatedAt": "2026-06-03T12:42:30.105Z"
}