Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreih6qwdm5sn7njngfq5otock76cxxxtte7a23vlcuk4opsh35du3su",
    "uri": "at://did:plc:ehofn2arm3q4dcthv4lhhhc4/app.bsky.feed.post/3mgecquatdwd2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreieo7xt2nnj6acwrff6qdiehkjbkp4hgjj6pafy72633frdzfgjz54"
    },
    "mimeType": "image/jpeg",
    "size": 716150
  },
  "description": "What do you do when a new innovation increases costs??\n\nS3T PodCast March 6, 20260:00/916.97632653061231×\n\n\nDire Straits\n\nAt time of writing/recording, the Iran conflict is about to enter its 7th day, as capitals and economic hubs worldwide are watching for indications of how long this conflict will last. The longer it goes on, the more we risk a global inflationary shock. The US average price for a gallon of gas is up 27 cents this week.\n\nThis comes at a particularly bad time for the already vo",
  "path": "/raise-your-teams-cost-iq-ram-ai-supply-chain-inflation-new-tradeoffs-for-local-vs-cloud/",
  "publishedAt": "2026-03-06T02:54:54.000Z",
  "site": "https://www.s3t.org",
  "tags": [
    "a gallon of gas is up 27 cents this week",
    "Marinetraffic.com",
    "up 90% so far in 2026",
    "already occurred in 2025",
    "observes Matteo Rinaldi",
    "Counterpoint Research",
    "This Deloitte paper"
  ],
  "textContent": "_What do you do when a new innovation**increases** costs??_\n\nS3T PodCast March 6, 2026\n\n0:00\n\n/916.9763265306123\n\n1×\n\n### Dire Straits\n\nAt time of writing/recording, the Iran conflict is about to enter its 7th day, as capitals and economic hubs worldwide are watching for indications of how long this conflict will last. The longer it goes on, the more we risk a global inflationary shock. The US average price for a gallon of gas is up 27 cents this week.\n\nTrack Oil Tankers and other Shipping at Marinetraffic.com\n\n**This comes at a particularly bad time for the already volatile economics of compute.** By disrupting the energy supply chain, the conflict is directly inflating the cost of energy—the primary input for AI compute. This will add to the feedback loop of rising operational costs and market volatility across _all digitized industries_.\n\nThat's why this week we need to dig into the crucial intersection where technology, economics and skillsets come together. We want to be grounded in:\n\n  * The economic realities now impacting tech choices we make and\n  * The specific skills and actions teams can use to reposition for best possible outcomes.\n\n\n\n### What seems unique about this time period\n\nFor the tech history that I have witnessed dating back to the early 90's - all the waves of innovation always tended to _reduce costs and raise productivity_ in undeniable ways.\n\nWhat seems unique about where we are today, is that**the current wave of technology innovation is actually _increasing_ costs, **and not creating consistently meaningful and observable productivity gains.\n\nThis is not to say there will be no productivity gains or benefits. Or that costs structures will never streamline. **But it does require us to get smarter and more selective about how we apply these new technologies.** Smarter and more selective won't come from bringing the right speaker to your next board meeting or investment committee: it will come from empowering your team to experiment and learn how to match your company's compute workloads to _the most cost effective form of compute_.\n\nThat's what we'll dig into today. Let's start by familiarizing ourselves with the current economics of compute.\n\nClick to Zoom\n\n### The inflationary impact of hyperscalers\n\nRAM prices are up 90% so far in 2026 as chip manufacturers (Samsung, SK Hynix, and Micron) redirect massive amounts of their chip production away from consumer products to **High Bandwidth Memory (HBM)** to satisfy demand for AI accelerators like Nvidia’s H100 and B200. This comes on top of sharp price increases that already occurred in 2025.\n\n“Just to give you an idea, a single AI server can use as much advanced memory as a dozen or even hundreds of traditional laptops” observes Matteo Rinaldi who sees this as a more structural long lasting issue than the temporary supply chain shocks during Covid.\n\nImplications:\n\n  * Digital products inflation: rising prices for consumer products - phones, computers, consoles - as OpenAI, Microsoft, Google and others grab as much processing power and chip supply as they can.\n  * Planning dilemmas for teams. If you bought hardware prior to the RAM cost spike, find a way to use it.\n\n\n\nCounterpoint Research\n\n###\n\n### Connecting the economic dots\n\nThe massive investment poured into AI big tech - that pushed stocks to new highs amid otherwise flat to negative conditions - has to get paid back ..._somehow_. As shown in the AI supply chain diagram above, this is going to get complicated.\n\nTech titans are betting they can create ROI by bringing enterprises and consumers into their controlled cloud environments and charging for usage.\n\nBut businesses are starting to learn what this means for their bottom line. So far the cost savings from enterprise AI have not kept up with the rising costs of using enterprise AI - _especially agentic AI_.\n\nThis Deloitte paper notes that proactive organizations are starting to rethink compute - specifically how they **match workloads to processing power:**\n\n_\"On-prem compute may be more cost effective than cloud for consistent, high-volume workloads...when cloud costs begin to exceed 60% to 70% of the total cost of acquiring equivalent on-premises systems, making capital investment more attractive than operational expenses for predictable AI workloads.\"_\n\nAs companies become more cost conscious, and the realization of limited ROI takes hold, there will likely be a **pricing strategy competition** (in the AI value chain) as AI players scramble to shore up their margins and - hopefully - pay back their investors. This will not be fun. It will likely bear some resemblance to bankruptcy court proceedings.\n\nCompanies and industries over-exposed to this desperation cycle will struggle to control their margins and competitive pricing. Be careful of locking into multi-year contracts with consumption based pricing.\n\n### Hybrid cloud takes on new relevance\n\nTo manage your exposure to the economics of compute, do a careful analysis of costs based on the size and type of your compute workloads. The \"everything to cloud\" mantra may not make sense for some of your compute workloads.\n\nYour goal should be to gain a clear understanding of:\n\n  * What workloads are a good match for smaller LLMs that you run on your own hardware?\n  * What workloads are complex enough to warrant exploratory subscriptions to high cost frontier models?\n  * If you have high volume high complexity workloads, are they generating sufficient revenue to pay for inflated compute costs while also delivering acceptable margins?\n\n\n\nKnow the point where it makes more sense to capitalize hardware vs carry ongoing cloud costs.\n\nBottom line: If you purchased hardware between the pandemic price spike and the AI price spike (roughly 2022-2025 Q1), you have a structural cost advantage others don't.\n\n### Raise the Cost IQ of your team\n\nForward looking orgs are encouraging their technical teams to get familiar with what LLMs can be effectively run from owned hardware at different levels of scale:\n\n  * What can we run off a laptop?\n  * What can we run on low end servers?\n  * What can we run on high end servers?\n  * What absolutely requires and justifies the cost of OpenAI, Anthropic and the like?\n\n\n\nThis means encouraging teams to experiment with smaller LLMs to see what kind of performance they can squeeze out of non-cloud configurations.\n\nFor routine coding tasks—syntax completions, formatting suggestions, or code snippets—an 8-billion-parameter model can handle common patterns and syntax efficiently. Programming and scripting languages have well documented, finite rules that smaller models can manage adequately.\n\nIf developers in your org need complex reasoning—like understanding project-wide architecture, cross-file logic, or nuanced refactoring—then larger models can offer deeper insights. But if the tasks are mostly “fill in this function” or “suggest this loop,” a local model could be a great cost-saving baseline. Start local—and if you hit limitations, scale up selectively.\n\n### Takeaways\n\nThe pervasive thought pattern of 2024-2025 - _jump in with both feet, costs don't matter_ - hasn't aged well. It's time for teams to build their internal expertise in how to match compute workloads against the most cost effective compute - vs blindly subscribing to whatever the cloud vendors are offering.\n\nDecisions about how to purchase compute - when local is enough and when to scale up - need to be informed by hands on experimentation. This will allow you to approach AI investments armed with clear comparisons, not guesswork. It will also boost your team's resilience and flexibility, so you can adapt as workloads - and their economics - continue to evolve.\n\nHave a great weekend, and a wonderful week ahead.\n\nRalph\n\n* * *\n\n _Opinions expressed are those of the individuals and do not reflect the official positions of companies or organizations those individuals may be affiliated with. Not financial, investment or legal advice, and no offers for securities or investment opportunities are intended. Mentions should not be construed as endorsements. Authors or guests may hold assets discussed or may have interests in companies mentioned._",
  "title": "Raise your team's cost IQ: RAM, AI supply chain inflation, new tradeoffs for local vs cloud",
  "updatedAt": "2026-03-06T03:35:18.370Z"
}