Responses API returns zero usage when combining `previous_response_id` + `context_management` + `tools`

OpenAI Developer Community March 4, 2026

Hi there

We’re having difficulty tracking token usage for about half of our API requests. The Responses API returns {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0} in the usage field when a request includes all three of:

  1. previous_response_id (continuing a stored conversation that contains tool calls)
  2. context_management (e.g. [{"type": "compaction", "compact_threshold": 200000}])
  3. tools (any tool definitions)

The response itself is correct — the model reasons, makes tool calls, and produces output — but the reported usage is zero. Removing any one of the three parameters causes usage to report correctly.

Reproduction Steps

Tested with gpt-5.2 via the OpenAI Ruby gem. The bug is deterministic.

require "openai"
client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])

tool = {
  type: "function",
  name: "get_weather",
  description: "Get weather for a location",
  parameters: {
    type: "object",
    properties: { location: { type: "string" } },
    required: ["location"],
    additionalProperties: false
  },
  strict: true
}

# Step 1: Create a stored conversation with a tool call
r1 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: "What is the weather in Auckland?",
  store: true,
  tools: [tool],
  tool_choice: "auto"
})
# => usage: {"input_tokens"=>49, "output_tokens"=>34, "total_tokens"=>83}

tool_call = r1["output"].find { |o| o["type"] == "function_call" }

# Step 2: Return the tool result
r2 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: [{ type: "function_call_output", call_id: tool_call["call_id"], output: "Sunny 22C" }],
  store: true,
  previous_response_id: r1["id"],
  tools: [tool]
})
# => usage: {"input_tokens"=>99, "output_tokens"=>23, "total_tokens"=>122}

# Step 3: Continue with previous_response_id + context_management + tools
r3 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: "Thanks! What about Wellington?",
  store: true,
  previous_response_id: r2["id"],
  tools: [tool],
  context_management: [{ type: "compaction", compact_threshold: 200_000 }]
})
# => usage: {"input_tokens"=>0, "output_tokens"=>0, "total_tokens"=>0}
#    ^^^^^^^^ BUG: response contains real output but usage is zero

Isolation Matrix

Starting from Step 2 above, Step 3 was repeated with different parameter combinations:

previous_response_id   context_management   tools   total_tokens
yes                    yes                  yes     0
yes                    yes                  no      7,617
yes                    no                   yes     130
no                     yes                  yes     non-zero

The bug requires all three parameters together, on a conversation that contains tool call history.
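
For anyone who wants to re-run the matrix, here is a minimal sketch of how the Step 3 parameter sets could be generated. `isolation_cases` and `OPTIONAL_PARAMS` are hypothetical names of my own, not part of any gem, and the sketch builds all eight on/off combinations rather than only the four rows above. It only constructs the parameter hashes; sending them is left to whatever client you use.

```ruby
# Hypothetical helper (not part of any gem): builds the Step 3 request
# parameters for every on/off combination of the three options under test.
OPTIONAL_PARAMS = %i[previous_response_id context_management tools].freeze

def isolation_cases(base:, previous_response_id:, context_management:, tools:)
  values = {
    previous_response_id: previous_response_id,
    context_management: context_management,
    tools: tools
  }

  # 2**3 = 8 flag tuples, starting with [true, true, true].
  [true, false].repeated_permutation(OPTIONAL_PARAMS.size).map do |flags|
    enabled = OPTIONAL_PARAMS.zip(flags).select { |_, on| on }.map(&:first)
    # Include only the options switched on for this case.
    base.merge(values.slice(*enabled))
  end
end

cases = isolation_cases(
  base: { model: "gpt-5.2", input: "Thanks! What about Wellington?", store: true },
  previous_response_id: "resp_placeholder", # placeholder; use r2["id"] in a real run
  context_management: [{ type: "compaction", compact_threshold: 200_000 }],
  tools: [{ type: "function", name: "get_weather" }] # abbreviated tool definition
)
```

Each element of `cases` can then be passed as `parameters:` to `client.responses.create`, recording `usage` for each one.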

Expected Behaviour

The usage field should report actual token counts regardless of whether context_management is present. The model is clearly processing tokens (it produces output), so the usage should reflect that.
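
Until this is fixed, a client-side guard can at least flag the affected responses so they are not recorded as costing nothing. This is a rough sketch that assumes the response is a Hash with string keys, as in the reproduction above; `suspicious_zero_usage?` is a hypothetical name of my own.

```ruby
# Hypothetical guard: true when a response carries output items but its
# usage block is missing or reports zero total tokens.
def suspicious_zero_usage?(response)
  output = response["output"] || []
  usage = response["usage"] || {}
  !output.empty? && usage.fetch("total_tokens", 0).to_i.zero?
end
```

For example, `warn "zero usage for #{r3["id"]}" if suspicious_zero_usage?(r3)` would surface the Step 3 response above without affecting Steps 1 and 2.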

Environment

  • Model: gpt-5.2
  • API: Responses API (/v1/responses)
  • Client: openai Ruby gem
  • Date observed: 2026-03-03
  • Reproducible: 100% deterministic
