
Responses API returns zero usage when combining `previous_response_id` + `context_management` + `tools`

OpenAI Developer Community March 4, 2026

Hi there

We’re having difficulty tracking token usage for about half of our API requests. The Responses API returns `{"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}` in the `usage` field when a request includes all three of:

- `previous_response_id` (continuing a stored conversation that contains tool calls)
- `context_management` (e.g. `[{"type": "compaction", "compact_threshold": 200000}]`)
- `tools` (any tool definitions)

The response itself is correct — the model reasons, makes tool calls, and produces output — but the reported usage is zero. Removing any one of the three parameters causes usage to report correctly.
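Until this is fixed, affected responses can at least be flagged on the client side. The following is a minimal sketch, assuming the response is a Hash with string keys shaped like the payloads shown in this post (`"usage"` and `"output"`). The helper name `zero_usage_with_output?` is hypothetical and not part of any gem.

```ruby
# Hypothetical helper (not part of the openai gem): true when a response
# contains output items but every usage counter is zero.
# A missing usage field is treated the same as all zeros.
USAGE_KEYS = %w[input_tokens output_tokens total_tokens].freeze

def zero_usage_with_output?(response)
  usage  = response["usage"] || {}
  output = response["output"] || []
  all_zero = USAGE_KEYS.all? { |key| usage[key].to_i.zero? }
  all_zero && !output.empty?
end

# Illustrative payloads; the "output" items are placeholders.
affected = {
  "usage"  => { "input_tokens" => 0, "output_tokens" => 0, "total_tokens" => 0 },
  "output" => [{ "type" => "message" }]
}
healthy = {
  "usage"  => { "input_tokens" => 99, "output_tokens" => 23, "total_tokens" => 122 },
  "output" => [{ "type" => "message" }]
}

puts zero_usage_with_output?(affected) # true
puts zero_usage_with_output?(healthy)  # false
```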

Reproduction Steps

Tested with gpt-5.2 via the OpenAI Ruby gem. The bug is deterministic.

```ruby
require "openai"
client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])

tool = {
  type: "function",
  name: "get_weather",
  description: "Get weather for a location",
  parameters: {
    type: "object",
    properties: { location: { type: "string" } },
    required: ["location"],
    additionalProperties: false
  },
  strict: true
}

# Step 1: Create a stored conversation with a tool call
r1 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: "What is the weather in Auckland?",
  store: true,
  tools: [tool],
  tool_choice: "auto"
})
# => usage: {"input_tokens"=>49, "output_tokens"=>34, "total_tokens"=>83}

tool_call = r1["output"].find { |o| o["type"] == "function_call" }

# Step 2: Return the tool result
r2 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: [{ type: "function_call_output", call_id: tool_call["call_id"], output: "Sunny 22C" }],
  store: true,
  previous_response_id: r1["id"],
  tools: [tool]
})
# => usage: {"input_tokens"=>99, "output_tokens"=>23, "total_tokens"=>122}

# Step 3: Continue with previous_response_id + context_management + tools
r3 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: "Thanks! What about Wellington?",
  store: true,
  previous_response_id: r2["id"],
  tools: [tool],
  context_management: [{ type: "compaction", compact_threshold: 200_000 }]
})
# => usage: {"input_tokens"=>0, "output_tokens"=>0, "total_tokens"=>0}
#    ^^^^^^^^ BUG: response contains real output but usage is zero
```

Isolation Matrix

Starting from Step 2 above, Step 3 was repeated with different parameter combinations:

| `previous_response_id` | `context_management` | `tools` | `total_tokens` |
|---|---|---|---|
| yes | yes | yes | 0 |
| yes | yes | no | 7,617 |
| yes | no | yes | 130 |
| no | yes | yes | non-zero |

The bug requires all three parameters together, on a conversation that contains tool call history.
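For anyone who wants to re-run the isolation, here is a rough sketch of how the parameter combinations could be generated rather than typed out by hand. The helper names `each_combination` and `isolation_matrix` are hypothetical. The API call is passed in as a block, so the sketch itself makes no assumptions about the client beyond a response Hash with a `"usage"` key. Note that it tries all eight on/off combinations, not only the four listed above.

```ruby
# Hypothetical helpers for rebuilding the isolation matrix.
# `base` holds the parameters every request shares; `optional` holds the
# parameters being toggled (previous_response_id, context_management, tools).

# Returns [flags, params] pairs, one per subset of the optional parameters.
def each_combination(base, optional)
  keys = optional.keys
  subsets = (0..keys.size).flat_map { |n| keys.combination(n).to_a }
  subsets.map do |subset|
    params = base.merge(optional.slice(*subset))
    flags  = keys.to_h { |key| [key, subset.include?(key)] }
    [flags, params]
  end
end

# Yields each parameter set to the caller's block, which performs the
# request, and records the reported total_tokens next to the flags.
def isolation_matrix(base, optional)
  each_combination(base, optional).map do |flags, params|
    response = yield(params)
    flags.merge(total_tokens: response.dig("usage", "total_tokens"))
  end
end

# Usage with the client from the reproduction (not executed here):
#   rows = isolation_matrix(base, optional) do |params|
#     client.responses.create(parameters: params)
#   end
#   rows.each { |row| puts row.inspect }
```

Passing the request in as a block keeps the combination logic testable with a stub, without spending tokens.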

Expected Behaviour

The usage field should report actual token counts regardless of whether context_management is present. The model is clearly processing tokens (it produces output), so the usage should reflect that.
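In the meantime, any running total built from `usage` is only a lower bound. A small sketch of a tally that makes this visible, assuming responses are Hashes with string keys as in the reproduction; `tally_usage` is a hypothetical name, and the sample values are the totals reported in Steps 1 to 3 with placeholder output items:

```ruby
# Hypothetical tally: sums reported total_tokens and counts responses that
# report zero tokens despite containing output, so the caller knows the
# sum undercounts real usage.
def tally_usage(responses)
  responses.each_with_object({ total_tokens: 0, suspect: 0 }) do |response, tally|
    total      = response.dig("usage", "total_tokens").to_i
    has_output = !Array(response["output"]).empty?
    tally[:total_tokens] += total
    tally[:suspect] += 1 if total.zero? && has_output
  end
end

# Totals as reported for r1, r2 and r3 in the reproduction.
responses = [
  { "usage" => { "total_tokens" => 83 },  "output" => [{ "type" => "function_call" }] },
  { "usage" => { "total_tokens" => 122 }, "output" => [{ "type" => "message" }] },
  { "usage" => { "total_tokens" => 0 },   "output" => [{ "type" => "message" }] }
]

tally = tally_usage(responses)
puts tally[:total_tokens] # 205
puts tally[:suspect]      # 1
```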

Environment

Model: gpt-5.2
API: Responses API (/v1/responses)
Client: openai Ruby gem
Date observed: 2026-03-03
Reproducible: 100% deterministic
