Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreie2344mxqb23b22aamvax7qzwaqqkhdkivvl32udmroqmyfl36334",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mgbfvyc2eps2"
  },
  "path": "/t/responses-api-returns-zero-usage-when-combining-previous-response-id-context-management-tools/1375726#post_1",
  "publishedAt": "2026-03-04T23:00:57.000Z",
  "site": "https://community.openai.com",
  "textContent": "Hi there\n\nWe’re having difficulty tracking token usage for about half of our API requests. The Responses API returns `{\"input_tokens\": 0, \"output_tokens\": 0, \"total_tokens\": 0}` in the `usage` field when a request includes all three of:\n\n  1. `previous_response_id` (continuing a stored conversation that contains tool calls)\n  2. `context_management` (e.g. `[{\"type\": \"compaction\", \"compact_threshold\": 200000}]`)\n  3. `tools` (any tool definitions)\n\n\n\nThe response itself is correct — the model reasons, makes tool calls, and produces output — but the reported usage is zero. Removing any one of the three parameters causes usage to report correctly.\n\n## Reproduction Steps\n\nTested with `gpt-5.2` via the OpenAI Ruby gem. The bug is deterministic.\n\n\n    require \"openai\"\n    client = OpenAI::Client.new(access_token: ENV[\"OPENAI_API_KEY\"])\n\n    tool = {\n      type: \"function\",\n      name: \"get_weather\",\n      description: \"Get weather for a location\",\n      parameters: {\n        type: \"object\",\n        properties: { location: { type: \"string\" } },\n        required: [\"location\"],\n        additionalProperties: false\n      },\n      strict: true\n    }\n\n    # Step 1: Create a stored conversation with a tool call\n    r1 = client.responses.create(parameters: {\n      model: \"gpt-5.2\",\n      input: \"What is the weather in Auckland?\",\n      store: true,\n      tools: [tool],\n      tool_choice: \"auto\"\n    })\n    # => usage: {\"input_tokens\"=>49, \"output_tokens\"=>34, \"total_tokens\"=>83}\n\n    tool_call = r1[\"output\"].find { |o| o[\"type\"] == \"function_call\" }\n\n    # Step 2: Return the tool result\n    r2 = client.responses.create(parameters: {\n      model: \"gpt-5.2\",\n      input: [{ type: \"function_call_output\", call_id: tool_call[\"call_id\"], output: \"Sunny 22C\" }],\n      store: true,\n      previous_response_id: r1[\"id\"],\n      tools: [tool]\n    })\n    # => usage: 
{\"input_tokens\"=>99, \"output_tokens\"=>23, \"total_tokens\"=>122}\n\n    # Step 3: Continue with previous_response_id + context_management + tools\n    r3 = client.responses.create(parameters: {\n      model: \"gpt-5.2\",\n      input: \"Thanks! What about Wellington?\",\n      store: true,\n      previous_response_id: r2[\"id\"],\n      tools: [tool],\n      context_management: [{ type: \"compaction\", compact_threshold: 200_000 }]\n    })\n    # => usage: {\"input_tokens\"=>0, \"output_tokens\"=>0, \"total_tokens\"=>0}\n    #    ^^^^^^^^ BUG: response contains real output but usage is zero\n\n\n## Isolation Matrix\n\nStarting from Step 2 above, Step 3 was repeated with different parameter combinations:\n\n`previous_response_id` | `context_management` | `tools` | `total_tokens`\n---|---|---|---\nyes | yes | yes | **0**\nyes | yes | no | 7,617\nyes | no | yes | 130\nno | yes | yes | non-zero\n\nThe bug requires all three parameters together on a conversation containing tool call history.\n\n## Expected Behaviour\n\nThe `usage` field should report actual token counts regardless of whether `context_management` is present. The model is clearly processing tokens (it produces output), so the usage should reflect that.\n\n## Environment\n\n  * Model: `gpt-5.2`\n  * API: Responses API (`/v1/responses`)\n  * Client: `openai` Ruby gem\n  * Date observed: 2026-03-03\n  * Reproducible: 100% deterministic\n\n",
  "title": "Responses API returns zero usage when combining `previous_response_id` + `context_management` + `tools`"
}