Low logical reasoning performance of GPT-5.2 at medium and high reasoning effort levels

OpenAI Developer Community March 6, 2026

I tested GPT-5.4. It looks like whatever problem caused the steep fall in scores as the benchmark difficulty increased is now fixed. But it’s not all sunshine and rainbows: GPT-5.4 xhigh performs worse than GPT-5.1 high, GPT-5.2 xhigh, and even GPT-5.1 medium. Oh well.

Nr  model_name               lineage  lineage-8  lineage-64  lineage-128  lineage-192
1   openai/gpt-5.1 (high)    0.969    1.000      0.975       0.975        0.925
2   openai/gpt-5.2 (xhigh)   0.962    1.000      1.000       0.925        0.925
3   openai/gpt-5.1 (medium)  0.888    1.000      0.950       0.875        0.725
4   openai/gpt-5.4 (xhigh)   0.881    1.000      1.000       0.750        0.775
5   openai/gpt-5.4 (high)    0.875    1.000      0.900       0.900        0.700
6   openai/gpt-5.2 (high)    0.494    1.000      0.700       0.175        0.100
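
A quick sanity check on how to read the table: the overall lineage column looks like the unweighted mean of the four difficulty columns. That is an assumption drawn from the numbers, not something the benchmark states here. The sketch below recomputes the mean for each row using the values copied from the table; the names `scores` and `mean_of_subscores` are made up for illustration.

```python
# Values copied from the table above, in the order:
# (lineage, lineage-8, lineage-64, lineage-128, lineage-192)
scores = {
    "openai/gpt-5.1 (high)":   (0.969, 1.000, 0.975, 0.975, 0.925),
    "openai/gpt-5.2 (xhigh)":  (0.962, 1.000, 1.000, 0.925, 0.925),
    "openai/gpt-5.1 (medium)": (0.888, 1.000, 0.950, 0.875, 0.725),
    "openai/gpt-5.4 (xhigh)":  (0.881, 1.000, 1.000, 0.750, 0.775),
    "openai/gpt-5.4 (high)":   (0.875, 1.000, 0.900, 0.900, 0.700),
    "openai/gpt-5.2 (high)":   (0.494, 1.000, 0.700, 0.175, 0.100),
}


def mean_of_subscores(row):
    """Unweighted mean of the four per-difficulty scores (assumed aggregation)."""
    _overall, *subs = row
    return sum(subs) / len(subs)


for name, row in scores.items():
    mean = mean_of_subscores(row)
    # Compare the reported overall score with the recomputed mean.
    print(f"{name:24s} reported={row[0]:.3f} recomputed={mean:.4f}")
```

If the assumption holds, each recomputed mean should agree with the reported overall score to within rounding at the third decimal place.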
