[Responses API] GPT 5 ignores the detail parameter on image inputs
ruoda:
Has anyone tested whether this new patch algorithm for token consumption actually consumes the expected tokens?
Apparently nobody has. And nobody at OpenAI seems to fix billing issues in a timely manner after concrete reports.
Here are the results of sending a 3000x2000 image to both API endpoints today:
| model | vision | vision_mult | chat input | calculated | responses input | calculated |
|---|---|---|---|---|---|---|
| gpt-5.5 | patch | 1.2 | 2699 | 2243 | 2888 | 2400 |
| gpt-5.4 | patch | 1.2 | 2699 | 2243 | 2888 | 2400 |
| gpt-5.2 | patch | 1.2 | 3051 | 2536 | 3310 | 2752 |
| gpt-5.1 | tile | - | 917 | 910 | 917 | 910 |
- Chat input: the usage reported for the vision request run on Chat Completions.
- Responses input: the usage reported for the same request run on Responses.
- Calculated: the number of image tokens that seem to be billed, per endpoint, after removing the message overhead and reversing the cost multiplier.
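A minimal sketch of how the "calculated" columns can be reproduced from a usage report. The overhead values are assumptions, not figures stated above: 7 and 8 tokens are placeholders that happen to land on the table's numbers, and the real overhead should be measured by sending the same message without the image. The function name is hypothetical.

```python
def implied_image_tokens(reported_input, overhead, multiplier=1.0):
    """Back out the image tokens that appear to be billed.

    reported_input: input tokens from the API usage report
    overhead: non-image tokens in the request (assumed, measure it yourself)
    multiplier: vision cost multiplier (1.2 in the table, none for tile models)
    """
    return round((reported_input - overhead) / multiplier)


# gpt-5.1, tile-based, no multiplier, assumed overhead of 7 tokens
print(implied_image_tokens(917, overhead=7))

# gpt-5.5 on Responses, assumed overhead of 8 tokens
print(implied_image_tokens(2888, overhead=8, multiplier=1.2))

# gpt-5.2 on Responses, assumed overhead of 8 tokens
print(implied_image_tokens(3310, overhead=8, multiplier=1.2))
```

Because the result is rounded, an overhead that is off by one token can shift the implied count by one, so treat the last digit as approximate.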
The cost multiplier appears nowhere in the vision documentation for GPT-5+ full models.
We can see that GPT-5.2 specifically is over-billing heavily today. Its usage used to reconcile against OpenAI’s own calculator; GPT-5.2 was the last model for which they gave the courtesy of a price calculator, and this is what the billed tokens for that image should be:
Other models seem to have the token cap of 2500 dictating a resize that performs differently from the step-by-step algorithm in the documentation, and they also report different usage between endpoints for the same API call.
I have a calculator that was previously more accurate than OpenAI’s at providing token costs, down to the token and the subpixel of resize, even for indeterminate cases.
hotnova.com
OpenAI Vision Token Calculator
It implements the exact “patches” formulation described for downsizing (and also will show the uncapped pricing of detail:“original”)
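For reference, a rough sketch of a patch-count calculation of this kind. It is not the calculator's code. It assumes 32x32 patches, a shrink factor derived from the patch cap, a second adjustment so one dimension lands on a whole number of patches, and rounding the resized image to whole pixels. The rounding choice and the `cap` value are assumptions, and the cost multiplier is not applied here.

```python
import math

PATCH = 32  # assumed patch edge length in pixels


def patch_tokens(width, height, cap):
    """Estimate image patches before any cost multiplier is applied."""
    patches = math.ceil(width / PATCH) * math.ceil(height / PATCH)
    if patches <= cap:
        return patches  # small enough, no resize needed

    # Scale so the image area fits roughly within `cap` patches.
    shrink = math.sqrt(PATCH * PATCH * cap / (width * height))

    # Shrink a little further so one side covers a whole number of patches.
    w_patches = width * shrink / PATCH
    h_patches = height * shrink / PATCH
    shrink *= min(
        math.floor(w_patches) / w_patches,
        math.floor(h_patches) / h_patches,
    )

    # Assumption: the resized image is rounded to whole pixels.
    new_w = round(width * shrink)
    new_h = round(height * shrink)
    return math.ceil(new_w / PATCH) * math.ceil(new_h / PATCH)


# 3000x2000 with a cap of 2500 resizes to 1920x1280, i.e. 60 x 40 patches
print(patch_tokens(3000, 2000, cap=2500))
```

Under these assumptions the 3000x2000 test image comes out at 2400 patches with a cap of 2500. Inputs that sit exactly on a patch boundary after the first scaling step are sensitive to floating-point error, which is one reason different implementations can disagree by a row or column of patches.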
What we expect - in tokens, in inflation.
My calculator understands that you can send “detail”:“low” to a patches-based model, and get no effect.