{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreih5uzzn3g2bciskvkb5zucefciuyoxdvq2is5mw2a767glqgqlrqq",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mnr4yc7chxk2"
  },
  "path": "/t/unexpected-codex-5h-quota-exhaustion-on-pro-5x-gpt-5-3-codex-spark-context-window-failure/1383015#post_1",
  "publishedAt": "2026-06-08T05:57:18.000Z",
  "site": "https://community.openai.com",
  "textContent": "# Environment\n\n  * ChatGPT Pro 5X\n  * Codex Desktop (macOS)\n  * Same account/workspace across Desktop, Web, and CLI\n  * Default service tier\n  * Primary model during the affected work: GPT-5.5 (xhigh reasoning)\n  * Later diagnostic run: GPT-5.3 Codex Spark (xhigh reasoning)\n\n\n\n* * *\n\n## Issue A: Unexpected 5h Quota Exhaustion\n\nToday I unexpectedly exhausted my Codex 5-hour quota after what appeared to be a very small amount of work.\n\nFrom the user perspective:\n\n  * Only 3 active threads were used.\n  * Total active task runtime was under 20 minutes.\n  * Codex analytics showed very low daily thread activity.\n  * No unusually large code generation jobs were performed.\n\n\n\nI have never hit the limit before, even with much more intensive work. I did not expect this workload to come anywhere close to exhausting the quota window.\n\nAfter investigating local session logs, I found that one GPT-5.5 thread accumulated extremely large token counts.\n\nThread:\n\n019ea500-1d8b-7c90-881a-bded967f5aa9\n\n### Run 1\n\n  * GPT-5.5 (xhigh)\n  * Duration: 157 seconds\n  * Reported primary usage: 15%\n  * Total tokens by end of run: 777,268\n\n\n\n### Run 2\n\n  * GPT-5.5 (xhigh)\n  * Duration: 497 seconds\n  * Reported primary usage: 38%\n  * Total tokens by end of run: 5,290,376\n\n\n\nAt this point the thread had accumulated more than 5 million total tokens.\n\nAlthough Run 2 reported only 38% primary usage, the account had already reached the 5-hour quota limit before the next task started.\n\n* * *\n\n## Why This Is Confusing\n\nThe visible workload appeared very small:\n\n  * 3 threads\n  * Less than 20 minutes of active runtime\n  * No large-scale generation tasks\n\n\n\nYet backend token accumulation appears to have reached multi-million-token levels.\n\nWhat is not clear is how the following relate to one another:\n\n  * Total tokens\n  * Cached input tokens\n  * Primary usage %\n  * 5-hour quota consumption\n  * Pro 5X allowance\n\n\n\nIn particular:\n\n  * One run ended at approximately 5.29M 
total tokens while reporting only 38% primary usage.\n  * Before the next investigation run, the account was already at 100%.\n\n\n\nI would appreciate clarification on:\n\n  1. How is “primary %” calculated?\n  2. Does “primary %” scale according to subscription tier?\n  3. Does 38% mean 38% of the Pro 5X allowance?\n  4. How much do cached input tokens contribute to quota consumption?\n  5. Is total token accumulation directly related to the 5-hour quota?\n  6. Are there known discrepancies between visible usage indicators and backend quota accounting?\n\n\n\n* * *\n\n## Issue B: GPT-5.3 Codex Spark Context Window Failure\n\nAfter the quota issue occurred, I switched to GPT-5.3 Codex Spark to investigate the problem.\n\nThe task was extremely simple:\n\n> “Can you figure out why we suddenly hit the 5h usage limit of Codex?”\n\nSpark performed a few searches and inspections, then produced:\n\n> Context automatically compacted\n\nfollowed immediately by:\n\n> Your input exceeds the context window of this model. Please adjust your input and try again.\n\nNo meaningful analysis was completed before the context window was exhausted.\n\nNotably, this happened while attempting to diagnose the quota issue itself.\n\n* * *\n\n## Why This Seems Strange\n\nThe sequence was roughly:\n\n  1. Open a repository/workspace.\n  2. Ask Spark to investigate a quota issue.\n  3. Spark performs a handful of searches.\n  4. Context compaction triggers.\n  5. Context window is exceeded.\n  6. Task aborts without completing.\n\n\n\nThis was not a large coding task.\n\nIt was primarily repository inspection and log analysis.\n\n* * *\n\n## Questions About Spark\n\n  1. Is GPT-5.3 Codex Spark intended for repository-scale investigations?\n  2. Is Spark expected to automatically compact context successfully during repo analysis?\n  3. 
Are there recommended limits for:\n     * AGENTS.md size\n     * memory files\n     * operational notes\n     * workspace documentation\n     * session history\n  4. Are there known issues where Spark repeatedly re-reads large workspace documents and rapidly consumes context?\n  5. Is there a recommended workflow for using Spark as a troubleshooting or repository-investigation agent?\n\n\n\n* * *\n\n## Additional Context\n\nThis workspace contains:\n\n  * agent skills\n  * operational memory files\n  * workspace documentation\n  * automation notes\n  * agent-generated reports\n\n\n\nIt is possible that Spark is encountering a context-management edge case in repositories that contain large amounts of operational memory and documentation.\n\nIf Spark is not intended for this type of investigation, guidance on its expected scope would be very helpful.",
  "title": "Unexpected Codex 5h Quota Exhaustion on Pro 5X + GPT-5.3 Codex Spark Context Window Failure"
}