{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreifj2hvipwqyqbiefydt6utqk7m44grrhjrvwyroj2hy5jspcvv2xa",
    "uri": "at://did:plc:dz7fbvkxedbwlm4sroohfpee/app.bsky.feed.post/3mn44fhoviaw2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreigyivd5dcwbcavfpld2lx5lbcnvj6omokw6z3txxjlf6zvlbyy3iq"
    },
    "mimeType": "image/jpeg",
    "size": 70939
  },
  "description": "Microsoft is preparing several AI models, including MAI-Image-2.5, MAI-Transcribe-1.5, and multilingual MAI-Voice-2, for unveiling at Build 2026.",
  "path": "/microsoft-readies-new-mai-voice-and-image-models-for-build-2026/",
  "publishedAt": "2026-05-30T22:36:10.000Z",
  "site": "https://www.testingcatalog.com",
  "tags": [
    "MSBuild",
    "pic.twitter.com/g3WQIcIQ24",
    "June 2, 2026",
    "MAI-Image-2.5",
    "@MicrosoftAI",
    "@GoogleDeepMind",
    "@OpenAI",
    "https://t.co/stHydZYbNN",
    "pic.twitter.com/4eVXxfbI6M",
    "May 26, 2026",
    "coding model",
    "super app",
    "@arena"
  ],
  "textContent": "UPDATE: Microsoft has announced 7 new AI models during Microsoft Build 2026 - MAI Image 2.5, MAI Image 2.5 Flash, MAI Voice 2, MAI Voice 2 Flash, MAI Transcribe 1.5, MAI Code 1 Flash, and MAI Thinking 1.\n\n> Seven new models launching at Build: let’s go!\n> Reasoning. Code. Image. Transcribe. Voice.\n>\n> Built from scratch on a clean data lineage, designed for efficiency, working seamlessly as a family of models\n>\n> Thread 🧵 #MSBuild pic.twitter.com/g3WQIcIQ24\n>\n> — Microsoft AI (@MicrosoftAI) June 2, 2026\n\n### The Story\n\nMicrosoft heads into its Build conference on June 2 in San Francisco with more in its model pipeline than the MAI-Image-2.5 that it has already shown on Arena, where the text-to-image system landed third behind OpenAI’s gpt-image-2 and Google’s Nano Banana 2. That release is lined up for the MAI Playground and Foundry, but three additional models are taking shape within the company’s stack, none of which are publicly available yet.\n\n> Exciting news, MAI-Image-2.5 (Preview) from @MicrosoftAI debuts at #3 in the Text-to-Image Arena with a score of 1,254 — a +72 point improvement over MAI-Image-2.\n>\n> A top 5 arena previously held only by @GoogleDeepMind and @OpenAI has a new lab in the mix.\n>\n> Congrats to the… https://t.co/stHydZYbNN pic.twitter.com/4eVXxfbI6M\n>\n> — Arena.ai (@arena) May 26, 2026\n\nThe first, **MAI-Transcribe-1.5** , is a modest step up from the speech-to-text model launched in April, which already claimed the lowest word error rate across 25 languages. The image side draws more attention: **MAI-Image-2.5** looks set to ship in two variants, a high-quality version and a faster one labeled **MAI-Image-2.5e** , mirroring the split seen with MAI-Image-2. 
It would also accept image uploads, which would open the model to editing as well as generation and put it on par with rivals from Google and OpenAI.\n\nThe most striking find is **MAI-Voice-2**, a multilingual successor to the company’s text-to-speech model. While MAI-Voice-1 began in English, the new version adds German, Australian and US English, Spanish, French, Hindi, Indonesian, Italian, Japanese, Korean, Dutch, Portuguese, Turkish, Vietnamese, and Chinese, with a wider emotional range that covers tones such as angry, confused, and embarrassed. Early samples suggest it can whisper, too.\n\nAudio samples: Harper whisper egret; Ethan shouting egret; Field isla joyful.\n\nAll three would feed Copilot, Teams, and Azure Speech, and fit the developer crowd that Build is made for. The timing matches a broader push, as Mustafa Suleyman’s team weans the company off OpenAI following April’s renegotiation. Reports point to a homegrown coding model for GitHub Copilot at the show, too, while a Copilot “super app” that integrates chat, coding, and agents into a single hub is expected later in the summer.",
  "title": "Microsoft released new MAI voice and image models for Build 2026",
  "updatedAt": "2026-06-02T21:29:46.977Z"
}