
GPT-5-mini image input token calculation discrepancy with official FAQ formula

OpenAI Developer Community June 17, 2026

Independently confirming this: I measured gpt-5-mini image input tokens directly from the API and get ~1.20 tokens/patch, which matches the pricing calculator, not the docs’ 1.62.

Method: send the same text prompt with and without one image, then subtract the text-only request’s usage.input_tokens from the image request’s. The difference isolates the image’s contribution. Patch count for a w×h pixel image is ceil(w/32) × ceil(h/32).
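For anyone who wants to repeat this, here is a minimal sketch of the two pieces of arithmetic in Python. The function names and the `send` callable are hypothetical: `send` stands in for whatever code issues one request and returns its usage.input_tokens, so no particular client library is assumed.

```python
from typing import Callable, Optional

PATCH_SIZE = 32  # pixels per patch side, from ceil(w/32) x ceil(h/32)


def patch_count(width: int, height: int) -> int:
    """Number of 32x32 patches needed to cover a width x height image."""
    cols = (width + PATCH_SIZE - 1) // PATCH_SIZE   # integer ceil(w/32)
    rows = (height + PATCH_SIZE - 1) // PATCH_SIZE  # integer ceil(h/32)
    return cols * rows


def image_input_tokens(send: Callable[[str, Optional[str]], int],
                       prompt: str, image: str) -> int:
    """Isolate the image's share of input tokens by subtraction.

    `send(prompt, image)` is a hypothetical callable: it issues one
    request (image=None means text only) and returns usage.input_tokens.
    """
    with_image = send(prompt, image)
    text_only = send(prompt, None)
    return with_image - text_only
```

Keeping the prompt identical in both requests matters, since any difference in the text would leak into the subtraction.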

image	patches	image input tokens	tokens ÷ patch
256×256	64	77	1.20
512×512	256	308	1.20
768×1024	768	922	1.20
1280×720	920	1104	1.20
1024×1024	1024	1229	1.20
2048×768	1536 (at cap)	1844	1.20

So for everything up to and including the 1536-patch cap, the billed/reported tokens are ceil(w/32) × ceil(h/32) × 1.20, rounded up to a whole token in every row of the table. The documented 1.62 overstates actual usage by ~35% (1.62 ÷ 1.20 = 1.35).
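A quick way to check that formula against the measurements is sketched below. The rounding-up step is my inference from the table, not something the docs state, and the function name is made up for illustration. It only covers images at or under the 1536-patch cap.

```python
def estimated_image_tokens(width: int, height: int) -> int:
    """ceil(w/32) * ceil(h/32) * 1.20, rounded up to a whole token.

    Assumption: rounding up is inferred from the measured rows below.
    Only meant for images at or under the 1536-patch cap.
    """
    patches = -(-width // 32) * -(-height // 32)  # ceil via negated floor division
    # 1.20 == 6/5, so integer ceiling division avoids float rounding issues.
    return -(-patches * 6 // 5)


# (width, height, measured image input tokens) copied from the table
MEASURED = [
    (256, 256, 77),
    (512, 512, 308),
    (768, 1024, 922),
    (1280, 720, 1104),
    (1024, 1024, 1229),
    (2048, 768, 1844),
]

for w, h, measured in MEASURED:
    assert estimated_image_tokens(w, h) == measured, (w, h)
```

All six rows pass, which is what you would expect if the multiplier is 1.20 and not 1.62.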

One thing I haven’t pinned down: the > 1536-patch regime, where the image is resized to fit the cap before the multiplier is applied. The at-cap point (2048×768 = exactly 1536 patches) is clean at 1.20, but I haven’t characterized larger images precisely. The 1800×1200 figure above (~2334) doesn’t land neatly on either 1536 × 1.20 (≈1843) or 1536 × 1.62 (≈2488), so that resize step may behave differently. Curious if anyone has clean numbers for images well over 1536 patches.
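The arithmetic behind that mismatch, as a rough sketch. It assumes the resized image counts as exactly 1536 patches, which is what the ≈1843 and ≈2488 figures imply; the real resize step could land below the cap, and I have not verified how it behaves.

```python
CAP = 1536        # patch cap
OBSERVED = 2334   # approximate 1800x1200 figure quoted in this thread

# Patch count before any resizing: ceil(1800/32) * ceil(1200/32) = 57 * 38
raw_patches = -(-1800 // 32) * -(-1200 // 32)

# What each multiplier would give if the image were billed at exactly the cap
at_cap_120 = CAP * 120 / 100
at_cap_162 = CAP * 162 / 100

# Multiplier implied by the observed figure under the same assumption
implied_multiplier = OBSERVED / CAP

print(raw_patches, at_cap_120, at_cap_162, round(implied_multiplier, 2))
```

The implied multiplier sits between the two candidates, so a single multiplier applied at exactly the cap does not explain that data point.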
