{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreie2344mxqb23b22aamvax7qzwaqqkhdkivvl32udmroqmyfl36334",
"uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mgbfvyc2eps2"
},
"path": "/t/responses-api-returns-zero-usage-when-combining-previous-response-id-context-management-tools/1375726#post_1",
"publishedAt": "2026-03-04T23:00:57.000Z",
"site": "https://community.openai.com",
"textContent": "Hi there\n\nWe’re having difficulty tracking token usage for about half of our API requests. The Responses API returns `{\"input_tokens\": 0, \"output_tokens\": 0, \"total_tokens\": 0}` in the `usage` field when a request includes all three of:\n\n 1. `previous_response_id` (continuing a stored conversation that contains tool calls)\n 2. `context_management` (e.g. `[{\"type\": \"compaction\", \"compact_threshold\": 200000}]`)\n 3. `tools` (any tool definitions)\n\n\n\nThe response itself is correct — the model reasons, makes tool calls, and produces output — but the reported usage is zero. Removing any one of the three parameters causes usage to report correctly.\n\n## Reproduction Steps\n\nTested with `gpt-5.2` via the OpenAI Ruby gem. The bug is deterministic.\n\n\n require \"openai\"\n client = OpenAI::Client.new(access_token: ENV[\"OPENAI_API_KEY\"])\n\n tool = {\n type: \"function\",\n name: \"get_weather\",\n description: \"Get weather for a location\",\n parameters: {\n type: \"object\",\n properties: { location: { type: \"string\" } },\n required: [\"location\"],\n additionalProperties: false\n },\n strict: true\n }\n\n # Step 1: Create a stored conversation with a tool call\n r1 = client.responses.create(parameters: {\n model: \"gpt-5.2\",\n input: \"What is the weather in Auckland?\",\n store: true,\n tools: [tool],\n tool_choice: \"auto\"\n })\n # => usage: {\"input_tokens\"=>49, \"output_tokens\"=>34, \"total_tokens\"=>83}\n\n tool_call = r1[\"output\"].find { |o| o[\"type\"] == \"function_call\" }\n\n # Step 2: Return the tool result\n r2 = client.responses.create(parameters: {\n model: \"gpt-5.2\",\n input: [{ type: \"function_call_output\", call_id: tool_call[\"call_id\"], output: \"Sunny 22C\" }],\n store: true,\n previous_response_id: r1[\"id\"],\n tools: [tool]\n })\n # => usage: {\"input_tokens\"=>99, \"output_tokens\"=>23, \"total_tokens\"=>122}\n\n # Step 3: Continue with previous_response_id + context_management + tools\n r3 = 
client.responses.create(parameters: {\n model: \"gpt-5.2\",\n input: \"Thanks! What about Wellington?\",\n store: true,\n previous_response_id: r2[\"id\"],\n tools: [tool],\n context_management: [{ type: \"compaction\", compact_threshold: 200_000 }]\n })\n # => usage: {\"input_tokens\"=>0, \"output_tokens\"=>0, \"total_tokens\"=>0}\n # ^^^^^^^^ BUG: response contains real output but usage is zero\n\n\n## Isolation Matrix\n\nStarting from Step 2 above, Step 3 was repeated with different parameter combinations:\n\n`previous_response_id` | `context_management` | `tools` | `total_tokens`\n---|---|---|---\nyes | yes | yes | **0**\nyes | yes | no | 7,617\nyes | no | yes | 130\nno | yes | yes | non-zero\n\nThe bug appears only when all three parameters are combined on a conversation that contains tool call history.\n\n## Expected Behaviour\n\nThe `usage` field should report actual token counts regardless of whether `context_management` is present. The model is clearly processing tokens (it produces output), so the usage should reflect that.\n\n## Environment\n\n * Model: `gpt-5.2`\n * API: Responses API (`/v1/responses`)\n * Client: `openai` Ruby gem\n * Date observed: 2026-03-03\n * Reproducible: 100% deterministic\n\n",
"title": "Responses API returns zero usage when combining `previous_response_id` + `context_management` + `tools`"
}