{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreifyfn2eqm5u4n32kgb4knmadqchdtq3v2qka4wnc3l4ac5uqz2zsi",
    "uri": "at://did:plc:qzjwstutqk2cy7df7jbzd2hx/app.bsky.feed.post/3mkcviqwfby52"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreif2y7mzhpbtrsclcjrchlzo4ayqqxgtdwulob45dem36yod7hcukm"
    },
    "mimeType": "image/jpeg",
    "size": 294304
  },
  "path": "/article/4163282/cirrascale-to-offer-on-prem-google-gemini-models.html",
  "publishedAt": "2026-04-24T16:32:58.000Z",
  "site": "https://www.networkworld.com",
  "tags": [
    "Artificial Intelligence, Data Center",
    "Gemini models either on-prem",
    "Cirrascale",
    "Dave Driggers, CEO of Cirrascale",
    "tokenomics"
  ],
  "textContent": "Cirrascale Cloud Services has announced it will make Google’s Gemini artificial intelligence models available for on-premises use through Google Distributed Cloud, a move aimed at organizations that want advanced AI capabilities while keeping data inside their own firewall.\n\nThe company said enterprises and public-sector agencies will be able to run Gemini models either on-prem or in Cirrascale data centers, including in connected or fully air-gapped deployments, to address data sovereignty and regulatory requirements.\n\nCirrascale said the offering expands its inference platform to support Gemini on Google Distributed Cloud, positioning the service for industries such as government, defense, finance, healthcare and higher education.\n\nCirrascale runs on-prem Gemini on a Dell-made appliance that uses Intel and Nvidia CPUs and GPUs rather than Google’s vaunted Tensor Processing Unit (TPU). The company takes the appliance from Dell, installs Gemini and Google Distributed Cloud (GDC) on it, and delivers that as a service to clients.\n\nDave Driggers, CEO of Cirrascale, said customers won’t get the same performance they would get with a TPU, but the performance is more than adequate. “They’re really the only other training platform separate of Nvidia, where you’ve got a full stack, you’ve got the processors, the networking, the software stack is all integrated top to bottom,” he said.\n\nCirrascale said the deployment model is designed for customers with strict data residency rules or low-latency needs by keeping computing resources close to where data is stored and processed.\n\nGoogle Distributed Cloud can be deployed in customer-controlled environments, including installations that are disconnected from the Internet, which is a key requirement for some government and critical-infrastructure users.\n\nOne of the big challenges is that these models are incredibly valuable and they need to be delivered in a trusted, secure environment, said Driggers. 
“That’s what’s really the most important thing to Google, is this model. So they need to be delivered in a confidential compute manner,” he said.\n\nThe model is not stored on a hard drive; it is stored in memory. If there is any intrusion into the machine, the machine essentially shuts itself off and the model is gone, so it cannot be stolen, according to Cirrascale.\n\nCirrascale said it will provide the hardware configurations, performance tuning and support needed to run Gemini inference at scale as part of its Cirrascale Inference Platform.\n\nThe company said the service is aimed at customers that want a production environment without rebuilding existing infrastructure and includes what it described as optimized systems for Gemini inference and ongoing operational support.\n\n“It’s Google’s model. Our secret sauce is being a trusted partner to be able to deliver that model to the clients,” said Driggers. “It’s part of our inference as a service offering. So for our customers, we have a software layer on top of the model that allows them to tailor how they use it, so they can set user queues up and set user limitations.”\n\nThis allows subscribers to engage in tokenomics: a knowledge worker, for example, can be given a different token rate than a high-end programmer who needs to get a job done quickly.\n\nThe service can also distribute Gemini if a customer is spread across multiple regions, and the company handles load balancing for the end user, according to the vendor.\n\nThe service is entering preview now, with general availability planned for late June or early July.",
  "title": "Cirrascale to offer on-prem Google Gemini models"
}