Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreihbmp2u4dgxh4tuafh57hbiepvynxunwms7z7fpqooeopo7p5vjpy",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mlf2tuxc5mw2"
  },
  "path": "/t/gpt-5-4-mini-66k-prompt-tokens-for-a-1920x1080-png/1380539#post_1",
  "publishedAt": "2026-05-09T00:20:43.000Z",
  "site": "https://community.openai.com",
  "textContent": "I’m sending a properly formatted `image_url` request to `gpt-5.4-mini` via the Chat Completions API. The image is a 1920x1080 PNG (~131KB), sent as a base64 data URL with `detail: high`.\n\nPer the Images and vision docs, `gpt-5.4-mini` uses patch-based image tokenization with a 1,536-patch budget and a 1.62x multiplier. A 1920x1080 image should cost approximately **2,400 prompt tokens**.\n\nInstead, I’m seeing **~66,000 prompt tokens**. This is in the range you’d expect if the base64 string were tokenized as text (~131KB PNG → ~175KB base64 → ~66K text tokens, i.e. ~2.7 chars/token, which is plausible for base64 because it tokenizes less densely than the ~4 chars/token typical of English prose).\n\nAlso, the `prompt_tokens_details` in the API response contains no `image_tokens` field:\n\n\n    \"prompt_tokens_details\": {\n        \"audio_tokens\": 0,\n        \"cached_tokens\": 2304\n    }\n\n\nShouldn’t this show `image_tokens` if I’m sending an image?\n\n**Request payload** (base64 truncated):\n\n\n    {\n      \"model\": \"gpt-5.4-mini\",\n      \"max_completion_tokens\": 256,\n      \"messages\": [\n        {\n          \"role\": \"user\",\n          \"content\": [\n            {\n              \"type\": \"text\",\n              \"text\": \"What do you see? Reply in one sentence.\"\n            },\n            {\n              \"type\": \"image_url\",\n              \"image_url\": {\n                \"url\": \"data:image/png;base64,iVBORw...\",\n                \"detail\": \"high\"\n              }\n            }\n          ]\n        }\n      ]\n    }\n\n\nThe model _does_ understand the image. It returns a correct description of the screenshot contents. 
So the vision capability works, but tokenization/billing appears to fall back to treating the base64 as plain text.\n\nIs this a known issue with `gpt-5.4-mini` on the Chat Completions API?\n\nHere’s a minimal script that reproduces this, hitting the API directly (no SDK):\n\n\n    #!/usr/bin/env bash\n    set -euo pipefail\n\n    TMPFILE=$(mktemp)\n    trap 'rm -f \"$TMPFILE\"' EXIT\n\n    BASE64_IMAGE=$(base64 -w 0 \"<path_to_1080p_image>\")\n\n    cat > \"$TMPFILE\" <<EOF\n    {\n      \"model\": \"gpt-5.4-mini\",\n      \"max_completion_tokens\": 256,\n      \"messages\": [\n        {\n          \"role\": \"user\",\n          \"content\": [\n            {\"type\": \"text\", \"text\": \"What do you see? Reply in one sentence.\"},\n            {\n              \"type\": \"image_url\",\n              \"image_url\": {\n                \"url\": \"data:image/png;base64,${BASE64_IMAGE}\",\n                \"detail\": \"high\"\n              }\n            }\n          ]\n        }\n      ]\n    }\n    EOF\n\n    curl -s -w \"\\nHTTP_STATUS: %{http_code}\\n\" \\\n      https://api.openai.com/v1/chat/completions \\\n      -H \"Content-Type: application/json\" \\\n      -H \"Authorization: Bearer ${OPENAI_KEY}\" \\\n      -d @\"$TMPFILE\"\n\n\nAnd the response:\n\n\n    thiagolobo@nephtis-desktop:~/$ ./openai.sh\n    {\n      \"id\": \"chatcmpl-DdPgVQgzShKGIbQoqun4PyrV6zywL\",\n      \"object\": \"chat.completion\",\n      \"created\": 1778285895,\n      \"model\": \"gpt-5.4-mini-2026-03-17\",\n      \"choices\": [\n        {\n          \"index\": 0,\n          \"message\": {\n            \"role\": \"assistant\",\n            \"content\": \"A browser is open to the Van Zandt CAD online property tax search page with search fields and helpful hints visible.\",\n            \"refusal\": null,\n            \"annotations\": []\n          },\n          \"finish_reason\": \"stop\"\n        }\n      ],\n      \"usage\": {\n        \"prompt_tokens\": 65599,\n        
\"completion_tokens\": 26,\n        \"total_tokens\": 65625,\n        \"prompt_tokens_details\": {\n          \"cached_tokens\": 1792,\n          \"audio_tokens\": 0\n        },\n        \"completion_tokens_details\": {\n          \"reasoning_tokens\": 0,\n          \"audio_tokens\": 0,\n          \"accepted_prediction_tokens\": 0,\n          \"rejected_prediction_tokens\": 0\n        }\n      },\n      \"service_tier\": \"default\",\n      \"system_fingerprint\": null\n    }\n\n    HTTP_STATUS: 200\n",
  "title": "Gpt-5.4-mini: 66K prompt tokens for a 1920x1080 PNG"
}
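
The size estimate in the post can be checked with a few lines of integer arithmetic. The sketch below is not part of the original post: `base64_len` and `chars_per_token_x100` are hypothetical helper names, and it assumes "~131KB" means 131 KiB and that nearly all of the reported `prompt_tokens` come from the image payload (the one-sentence text prompt adds only a handful of tokens).

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not from the original post) for sanity-checking the
# "base64 counted as text" theory using integer arithmetic only.

# Length in characters of the padded base64 encoding of N bytes: 4 * ceil(N / 3).
base64_len() {
  local bytes=$1
  echo $(( (bytes + 2) / 3 * 4 ))
}

# Characters per token, scaled by 100 to stay in integer math
# (for example, 250 would mean 2.50 chars/token). Truncates, does not round.
chars_per_token_x100() {
  local chars=$1 tokens=$2
  echo $(( chars * 100 / tokens ))
}

png_bytes=$(( 131 * 1024 ))   # assumption: "~131KB" read as 131 KiB
prompt_tokens=65599           # usage.prompt_tokens from the response above

b64_chars=$(base64_len "$png_bytes")
echo "base64 chars: $b64_chars"
echo "chars per token (x100): $(chars_per_token_x100 "$b64_chars" "$prompt_tokens")"
```

If the ratio comes out well below 4 chars/token, that is still compatible with the base64 string being counted as text, since base64 is close to random-looking and tends to split into shorter tokens than English prose. It does not by itself prove that this is what the API is doing.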