{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreibbmdiksb6hndh4awsnpgjfdklgytzmmzpfkgypjvjey2e5oes5pm",
"uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mer46mcyttk2"
},
"path": "/t/best-practices-for-cost-efficient-high-quality-context-management-in-long-ai-chats/1373996#post_4",
"publishedAt": "2026-02-13T16:41:36.000Z",
"site": "https://community.openai.com",
"tags": [
"Codex"
],
"textContent": "I would also look at Codex - this has got particularly good of late.\n\nHere’s what I did:\n\n * I git cloned the repo\n * I launched codex inside the cloned repo … of codex (oooooh meta )\n * Asked it: “summarise how context is managed by this app for large code bases.”\n\n\n\nAnd it gave me this response\n\n\n - It stores conversation history as ResponseItems and builds each model prompt from normalized history + fixed session context (developer\n instructions, AGENTS/user instructions, environment context). See codex-rs/core/src/codex.rs:2493, codex-rs/core/src/codex.rs:2446, codex-rs/\n core/src/context_manager/history.rs:86.\n - It aggressively bounds noisy content: tool outputs are truncated (token/byte policy), long text is middle-truncated, and function output content\n items are budgeted. See codex-rs/core/src/context_manager/history.rs:327, codex-rs/core/src/truncate.rs:88, codex-rs/core/src/truncate.rs:100.\n - It tracks token usage and context window continuously (server usage + local estimates), then emits TokenCount updates to UI. See codex-rs/core/\n src/codex.rs:2603, codex-rs/protocol/src/protocol.rs:1443.\n - Auto-compaction triggers when usage crosses model threshold; threshold defaults to ~90% of model window (clamped) unless overridden. See codex-\n rs/core/src/codex.rs:4435, codex-rs/core/src/codex.rs:4491, codex-rs/protocol/src/openai_models.rs:259, codex-rs/core/src/config/mod.rs:158.\n - Compaction rewrites history into: initial context + recent user messages (capped) + a handoff summary, then resumes. Inline path: codex-rs/core/\n src/compact.rs:67, codex-rs/core/src/compact.rs:295. Remote path: codex-rs/core/src/compact_remote.rs:60.\n - UI context meter reserves baseline overhead (BASELINE_TOKENS=12000) so “% remaining” reflects user-controllable space, not fixed prompt/tool\n overhead. See codex-rs/protocol/src/protocol.rs:1478, codex-rs/protocol/src/protocol.rs:1512.\n - Repo exploration tools are intentionally scoped/limited (instead of full-repo ingest): read_file uses offset/limit and line-length caps,\n grep_files has result/time limits, fuzzy file search caps results. See codex-rs/core/src/tools/handlers/read_file.rs:31, codex-rs/core/src/tools/\n handlers/read_file.rs:19, codex-rs/core/src/tools/handlers/grep_files.rs:20, codex-rs/app-server/src/fuzzy_file_search.rs:16.\n\n Inference from code search: I don’t see a built-in embedding/vector “index entire codebase” pipeline in core; context is assembled from history +\n targeted tool reads.\n\n\nGiven its performance, I’d wager this is state of the art …",
"title": "Best practices for cost-efficient, high-quality context management in long AI chats"
}