Batch API gets stuck `in_progress` how to debugg?
After changing a range of code files, prompts, and QA rules in my local pipeline, this Batch API issue started appearing. I’m having trouble narrowing down whether it is caused by my request structure, Batch queue behavior, prompt size, structured outputs, prompt caching fields, or something else.
I used ChatGPT to help summarize the debugging evidence, but the issue is from my own local API workflow.
Title: Batch API validates and enters in_progress, but multi-request batch stays 0/N; one-request batch works
Hi, I’m trying to troubleshoot an OpenAI Batch API issue without sharing private product text.
Setup:
Endpoint:
/v1/chat/completionsModel:
gpt-5.4-miniStructured output:
response_format: json_schema,strict: truereasoning_effort: lowmax_completion_tokens: 1600Request body also includes
prompt_cache_keyandprompt_cache_retention: "24h"
What works:
A direct synchronous request using the exact same request body works in ~7 seconds.
A one-request Batch using the exact same JSONL line completes successfully.
What does not work:
A 15-request Batch validates and moves to
in_progress, but stays at0/15completed for over an hour.No
error_file_id.No failed requests.
No validation error.
If I cancel it, the error file only contains
batch_cancelled, so it does not reveal a request-level issue.
Batch/request size:
15 requests in JSONL
Total JSONL size: ~292 KB
Per-request JSONL line size: ~15.8k–21.8k characters
Average line size: ~19.4k characters
Direct-one measured prompt tokens: 4,328
Direct-one output tokens: around 500–750
Rough total prompt tokens for 15 requests: ~65k
Account/model limits shown in Platform UI:
gpt-5.4-mini: 4,000,000 TPM, 5,000 RPM, 40,000,000 TPDBudget is not near exhausted
The UI has a “Batch queue limits” column, but I do not see a clear copied value for this model
Questions:
Can a multi-request Batch stay at
in_progress 0/Nbecause of Batch queue/enqueued-token behavior, even when a one-request Batch works?Are
prompt_cache_keyandprompt_cache_retentionsupported/safe inside Batch request bodies for/v1/chat/completions?Is there any way to get deeper diagnostics for a Batch that validates but never starts completing requests?
Would the recommended next test be:
remove prompt cache fields,
split into smaller batches,
reduce prompt/request size,
reduce
max_completion_tokens,or switch to
/v1/responses?
Are there known cases where strict structured outputs can cause hidden retries/stalls without an
error_file_id?
I can share a fully redacted JSONL line structure if needed, but not the private product text.
Discussion in the ATmosphere