{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreibbmdiksb6hndh4awsnpgjfdklgytzmmzpfkgypjvjey2e5oes5pm",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mer46mcyttk2"
  },
  "path": "/t/best-practices-for-cost-efficient-high-quality-context-management-in-long-ai-chats/1373996#post_4",
  "publishedAt": "2026-02-13T16:41:36.000Z",
  "site": "https://community.openai.com",
  "tags": [
    "Codex"
  ],
  "textContent": "I would also look at Codex - this has got particularly good of late.\n\nHere’s what I did:\n\n  * I git cloned the repo\n  * I launched codex inside the cloned repo … of codex (oooooh meta  )\n  * Asked it: “summarise how context is managed by this app for large code bases.”\n\n\n\nAnd it gave me this response\n\n\n      - It stores conversation history as ResponseItems and builds each model prompt from normalized history + fixed session context (developer\n        instructions, AGENTS/user instructions, environment context). See codex-rs/core/src/codex.rs:2493, codex-rs/core/src/codex.rs:2446, codex-rs/\n        core/src/context_manager/history.rs:86.\n      - It aggressively bounds noisy content: tool outputs are truncated (token/byte policy), long text is middle-truncated, and function output content\n        items are budgeted. See codex-rs/core/src/context_manager/history.rs:327, codex-rs/core/src/truncate.rs:88, codex-rs/core/src/truncate.rs:100.\n      - It tracks token usage and context window continuously (server usage + local estimates), then emits TokenCount updates to UI. See codex-rs/core/\n        src/codex.rs:2603, codex-rs/protocol/src/protocol.rs:1443.\n      - Auto-compaction triggers when usage crosses model threshold; threshold defaults to ~90% of model window (clamped) unless overridden. See codex-\n        rs/core/src/codex.rs:4435, codex-rs/core/src/codex.rs:4491, codex-rs/protocol/src/openai_models.rs:259, codex-rs/core/src/config/mod.rs:158.\n      - Compaction rewrites history into: initial context + recent user messages (capped) + a handoff summary, then resumes. Inline path: codex-rs/core/\n        src/compact.rs:67, codex-rs/core/src/compact.rs:295. Remote path: codex-rs/core/src/compact_remote.rs:60.\n      - UI context meter reserves baseline overhead (BASELINE_TOKENS=12000) so “% remaining” reflects user-controllable space, not fixed prompt/tool\n        overhead. See codex-rs/protocol/src/protocol.rs:1478, codex-rs/protocol/src/protocol.rs:1512.\n      - Repo exploration tools are intentionally scoped/limited (instead of full-repo ingest): read_file uses offset/limit and line-length caps,\n        grep_files has result/time limits, fuzzy file search caps results. See codex-rs/core/src/tools/handlers/read_file.rs:31, codex-rs/core/src/tools/\n        handlers/read_file.rs:19, codex-rs/core/src/tools/handlers/grep_files.rs:20, codex-rs/app-server/src/fuzzy_file_search.rs:16.\n\n      Inference from code search: I don’t see a built-in embedding/vector “index entire codebase” pipeline in core; context is assembled from history +\n      targeted tool reads.\n\n\nGiven its performance, I’d wager this is state of the art …",
  "title": "Best practices for cost-efficient, high-quality context management in long AI chats"
}