Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicspfm3a74i5hjgijfsil2glgtzmsapa32w23bthei3gmybodfn2y",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mlo474rncyq2"
  },
  "path": "/t/batch-api-with-o3-deep-research-spawning-duplicates/1380738#post_2",
  "publishedAt": "2026-05-12T14:59:16.000Z",
  "site": "https://community.openai.com",
  "textContent": "Substantial new evidence after rotating the API key. The bug reproduces cleanly on the new key, with no 429s involved, and is much more serious than my original report suggested. The previous “429 retry loop” framing was\n\nwrong. Below is what just happened.\n\n**Setup**\n\n  * Old sk-proj-…0xIW2XgA key rotated this afternoon, all spend stopped briefly.\n\n  * On the new key I submitted exactly one batch to verify the fix: batch_6a0334d99ce481908a3c9cc7e9a4399c, slug wyoming, call 1, 1 line, endpoint: /v1/responses, o3-deep-research-2025-06-26.\n\n  * Fresh quota window. No 429s. No client retries.\n\n\n\n\n**What happened**\n\nThat single batch line produced **3 separate, successful, billed Deep Research completions** in under 30 minutes.\n\nAll three have status: completed, error: null, incomplete_details: null. All three have metadata: {} because the batch envelope’s metadata does not propagate into spawned response objects.\n\n**The smoking gun is the dispatch timing**\n\n  * Dispatch #2 was created at 14:23:11, **3 minutes 58 seconds before** Dispatch #1 completed at 14:27:13. So #2 was not dispatched in response to #1 failing or timing out — #1 was still running successfully.\n\n  * Dispatch #3 was created at 14:29:11, **1 minute 58 seconds after** Dispatch #1 had already completed and presumably reported back. 
So #3 was not dispatched because the batch worker thought #1 was missing.\n\n\n\n\nThis is spontaneous duplication of a successful, in-flight or already-completed batch line.\n\n**Billing impact for this one batch**\n\n  * Token cost (50% batch discount applied): $2.75\n\n  * Tool cost (107 web searches × $0.025, no discount): $2.68\n\n  * **Total billed: ~$5.43**\n\n  * Expected cost with a single dispatch: ~$1.81\n\n  * **~3x overspend on a single batch line**\n\n\n\n\nMultiplied across the 27 batches I submitted earlier today (which were also fanning out, then masked by a 429 retry loop on top), this explains today’s full $47.12 spend without any bug in my code.\n\n**Cancel behavior is still broken too**\n\n  * Cancel issued at 14:50:42 UTC, req_6d6d39b89ebf4f388d282ba21791081d, returned HTTP 200 status: cancelling.\n\n  * Four more Arizona batches cancelled in the same minute (req_6e2b9f00c36d4746988bb300533083d2, req_a6de1175c3ed42dca7108a83ddf84c36, req_0cece57e96ac446ab6115df8418941a4, req_4823a5e3065e4261a4d59aa4f3e901be).\n\n  * All 5 still in cancelling with no cancelled_at more than 20 minutes later. Same stuck-cancel pattern as the morning batches.\n\n\n\n\n**Other persistent oddities**\n\n  * The Wyoming batch’s request_counts reads {total: 1, completed: 0, failed: 0} despite 3 fully-billed completions executing under it. Counters are not tracking reality.\n\n  * output_file_id is null. We were billed ~$5.43 for work whose outputs are not delivered through the documented batch retrieval path. To recover the markdown we have to pull each resp_… individually by ID.\n\n  * All three resp_… objects show background: false and metadata: {}, with no field linking them back to the parent batch. The connection only exists in OpenAI’s internal logs.\n\n\n\n\n**Requests**\n\n1. **Engineering escalation** : a single batch line producing multiple billed completions, on a clean key with no 429s, is a serious correctness bug. 
Please escalate to the Batch API team.\n\n2. **Force-cancel** these 5 batches: batch_6a0334d99ce481908a3c9cc7e9a4399c, batch_6a033b6ac91c81909999c3a124aafc4b, batch_6a033b6ab20c8190aa2f6b5311d1a0ad, batch_6a033b6abfbc8190bb2803d7e629713c, batch_6a033b6a12c8819096db4a2036e5e5cd. They are stuck in cancelling.\n\n3. **Credit** today’s o3-deep-research-2025-06-26 spend down to the legitimate single-dispatch cost of the 28 lines submitted. Current spend is $47.12; legitimate cost is at most ~$50 if every line had succeeded once, but in reality many lines produced no usable output through the batch path, so the appropriate refund covers any execution beyond the first per submitted line.\n\n4. **Bug to file** : the spawned /v1/responses objects should inherit the batch envelope’s metadata. Right now a customer hitting this bug has no way to attribute spawned responses to their batch, which makes triage take hours.",
  "title": "Batch API with o3-deep-research spawning duplicates"
}