Are There Any Good Benchmarks Comparing OpenAI API Models?

OpenAI Developer Community, June 17, 2026

I’m looking for benchmark results that compare OpenAI models specifically on mathematical reasoning. Most of the discussions I find are focused on coding or general reasoning, but I’m interested in seeing how the current models perform on benchmarks such as AIME, FrontierMath, or other math-focused evaluations. Does anyone have links to benchmark comparisons or personal experience using OpenAI models for math-heavy workloads?
