Realtime regression in non-English production voice agents: gpt-realtime-mini vs gpt-realtime-mini-2025-10-06
Unfortunately, switching to gpt-realtime-2 is out of the question for us at scale.
The issue is unit economics. gpt-realtime-2 is roughly 3x more expensive than gpt-realtime-mini on tokens, which would force us to price our AI voice services above what the local market can realistically absorb.
In Romania, we are competing with local providers offering very cheap per-minute AI voice pricing. More importantly, our product needs to remain meaningfully cheaper than the hourly cost of human labor at local wage levels. If our AI call center pricing cannot undercut local minimum-wage labor economics, it loses a large part of its commercial appeal.
This is especially important because our local market is currently under strong economic pressure. Many clients are not buying AI voice agents as a luxury upgrade; they are evaluating them as a way to reduce costs, handle more calls with fewer resources, or improve ROI through cheaper operational methods.
If moving from gpt-realtime-mini to gpt-realtime-2 pushes our per-minute pricing above local competitors' offers, and close to or above the equivalent cost of local labor, it would kill our sales momentum.
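To make the constraint concrete, here is a minimal sketch of the check we are effectively running. Every number in it is a hypothetical placeholder in arbitrary currency units per call minute, not a measured cost or a real market price, and the function names are made up for illustration. It also assumes the roughly 3x token-price difference carries over to per-minute model cost, which depends on how many tokens a call actually uses.

```python
# All numbers below are hypothetical placeholders in arbitrary currency
# units per call minute; they are not measured costs or real market prices.

def price_per_minute(model_cost: int, other_costs: int) -> int:
    """Per-minute price: model usage plus everything else (telephony, margin)."""
    return model_cost + other_costs


def is_viable(price: int, competitor_price: int, labor_cost: int) -> bool:
    """Viable only if we match or beat competitors and stay below labor cost."""
    return price <= competitor_price and price < labor_cost


MINI_MODEL_COST = 2      # hypothetical model cost per minute on the mini tier
COST_MULTIPLIER = 3      # the "roughly 3x" figure, assumed to carry over per minute
OTHER_COSTS = 3          # hypothetical telephony, infrastructure, and margin
COMPETITOR_PRICE = 8     # hypothetical local competitor price per minute
LABOR_COST = 10          # hypothetical human-agent cost per minute

mini_price = price_per_minute(MINI_MODEL_COST, OTHER_COSTS)
full_price = price_per_minute(MINI_MODEL_COST * COST_MULTIPLIER, OTHER_COSTS)

print("mini tier viable:", is_viable(mini_price, COMPETITOR_PRICE, LABOR_COST))
print("gpt-realtime-2 viable:", is_viable(full_price, COMPETITOR_PRICE, LABOR_COST))
```

With these placeholder inputs the mini tier clears both thresholds and the 3x tier does not. The real outcome depends entirely on actual token usage per minute and on actual local prices, so the sketch only shows the shape of the comparison.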
That is why the mini tier is critical for us. The problem is not that we need “the most intelligent model available.” We need a mini-class Realtime model that preserves the cost profile of gpt-realtime-mini while maintaining the Romanian/non-English faithfulness and production reliability we saw in gpt-realtime-mini-2025-10-06.
For our use case, gpt-realtime-mini-2025-10-06 struck a viable balance: low enough cost, acceptable Romanian voice quality, and strong faithfulness to supplied business data. A 3x cost increase is not a workable migration path in our market.